How many bytes in a JavaScript string?
Solution 1:
This function will return the byte size of any UTF-8 string you pass to it.
function byteCount(s) {
return encodeURI(s).split(/%..|./).length - 1;
}
Source
JavaScript engines are free to use UCS-2 or UTF-16 internally. Most engines that I know of use UTF-16, but whatever choice they made, it’s just an implementation detail that won’t affect the language’s characteristics.
The ECMAScript/JavaScript language itself, however, exposes characters according to UCS-2, not UTF-16.
Source
Solution 2:
You can use the Blob to get the string size in bytes.
Examples:
console.info(
new Blob(['😂']).size, // 4
new Blob(['👍']).size, // 4
new Blob(['😂👍']).size, // 8
new Blob(['👍😂']).size, // 8
new Blob(['I\'m a string']).size, // 12
// from Premasagar correction of Lauri's answer for
// strings containing lone characters in the surrogate pair range:
// https://stackoverflow.com/a/39488643/6225838
new Blob([String.fromCharCode(55555)]).size, // 3
new Blob([String.fromCharCode(55555, 57000)]).size // 4 (not 6)
);
Solution 3:
If you're using node.js, there is a simpler solution using buffers :
function getBinarySize(string) {
return Buffer.byteLength(string, 'utf8');
}
There is a npm lib for that : https://www.npmjs.org/package/utf8-binary-cutter (from yours faithfully)
Solution 4:
String
values are not implementation dependent, according the ECMA-262 3rd Edition Specification, each character represents a single 16-bit unit of UTF-16 text:
4.3.16 String Value
A string value is a member of the type String and is a finite ordered sequence of zero or more 16-bit unsigned integer values.
NOTE Although each value usually represents a single 16-bit unit of UTF-16 text, the language does not place any restrictions or requirements on the values except that they be 16-bit unsigned integers.
Solution 5:
Try this combination with using unescape js function:
const byteAmount = unescape(encodeURIComponent(yourString)).length
Full encode proccess example:
const s = "1 a ф № @ ®"; // length is 11
const s2 = encodeURIComponent(s); // length is 41
const s3 = unescape(s2); // length is 15 [1-1,a-1,ф-2,№-3,@-1,®-2]
const s4 = escape(s3); // length is 39
const s5 = decodeURIComponent(s4); // length is 11