Decoding hex-containing escape sequences in JavaScript strings

I have a string in JS in this format:

http\x3a\x2f\x2fwww.url.com

How can I get the decoded string out of this? I tried unescape() and string.decode, but neither decodes it. If I display the encoded string in the browser it looks fine (http://www.url.com), but I want to manipulate the string before displaying it.

Thanks.


Solution 1:

You could write your own replacement method:

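String.prototype.decodeEscapeSequence = function() {
    // Match a literal backslash, "x", and exactly two hex digits.
    return this.replace(/\\x([0-9A-Fa-f]{2})/g, function(match, hex) {
        // Convert the captured hex digits to the corresponding character.
        return String.fromCharCode(parseInt(hex, 16));
    });
};
"http\\x3a\\x2f\\x2fwww.example.com".decodeEscapeSequence(); // "http://www.example.com"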

Solution 2:

There is nothing to decode here. \xNN is an escape sequence in JavaScript that denotes the character with hexadecimal code NN. An escape sequence is simply a way of writing a character in a string literal. Once the literal has been parsed, it is already "decoded", which is why it displays fine in the browser.

When you do:

var str = 'http\x3a\x2f\x2fwww.url.com';

it is internally stored as http://www.url.com. You can manipulate this directly.
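A quick check makes the distinction concrete. This is only a minimal sketch; the variable names literal and raw are illustrative and do not come from the question.

```javascript
// Escape sequences in a string literal are resolved when the source is parsed.
var literal = 'http\x3a\x2f\x2fwww.url.com';
// A doubled backslash keeps the backslash itself, so "\x3a" stays as four characters.
var raw = 'http\\x3a\\x2f\\x2fwww.url.com';

console.log(literal === 'http://www.url.com'); // true
console.log(literal.length); // 18
console.log(raw.length);     // 27
```

If the length of your string matches the second case, the backslashes are really present in the data and the string does need decoding.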

Solution 3:

If you already have:

var encodedString = "http\x3a\x2f\x2fwww.url.com";

Then decoding the string manually is unnecessary. The JavaScript interpreter has already decoded the escape sequences for you, and decoding a second time can break your script for some strings. If, in contrast, you have:

var encodedString = "http\\x3a\\x2f\\x2fwww.url.com";

Those backslashes are themselves escaped, so the hex escape sequences remain undecoded in the string. In that case, keep reading.

The easiest way in that case is to use the eval function, which runs its argument as JavaScript code and returns the result:

var decodedString = eval('"' + encodedString + '"');

This works because \x3a is a valid JavaScript string escape sequence. However, don't do it this way if the string does not come from your own server. eval can execute arbitrary JavaScript code, so passing it a string from anywhere else creates a security vulnerability.
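To illustrate the risk, here is a small sketch with a made-up input (the variable untrusted and its contents are hypothetical). The quotes inside the input close the string literal that eval is supposed to be building, so the rest is run as code:

```javascript
// Hypothetical attacker-controlled input: the embedded quotes end the
// string literal early and leave an expression for eval to execute.
var untrusted = '" + (1 + 1) + "';

// eval receives the source text: "" + (1 + 1) + ""
var result = eval('"' + untrusted + '"');

console.log(result); // 2 (the expression was executed, not treated as text)
```

Here the injected expression is harmless arithmetic, but it could just as well be any other JavaScript.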

A better (but less concise) approach would be to use JavaScript's string replace method to create valid JSON, then use the browser's JSON parser to decode the resulting string:

var decodedString = JSON.parse('"' + encodedString.replace(/([^\\]|^)\\x/g, '$1\\u00') + '"');

// or using jQuery
var decodedString = $.parseJSON('"' + encodedString.replace(/([^\\]|^)\\x/g, '$1\\u00') + '"');
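The replace step is needed because JSON has no \xNN escape, only \uNNNN. The following sketch walks through the intermediate values; the names jsonReady and directParseFailed are illustrative only.

```javascript
var encodedString = 'http\\x3a\\x2f\\x2fwww.url.com';

// JSON does not define \xNN, so parsing the raw text directly throws.
var directParseFailed = false;
try {
    JSON.parse('"' + encodedString + '"');
} catch (e) {
    directParseFailed = true;
}

// Rewrite each \xNN as the equivalent \u00NN, which JSON does understand.
var jsonReady = encodedString.replace(/([^\\]|^)\\x/g, '$1\\u00');
var decodedString = JSON.parse('"' + jsonReady + '"');

console.log(directParseFailed); // true
console.log(jsonReady);         // http\u003a\u002f\u002fwww.url.com
console.log(decodedString);     // http://www.url.com
```

Note that JSON.parse will also throw if the input contains an unescaped double quote or a raw control character, so it may be worth wrapping the call in try/catch when the input is not under your control.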

Solution 4:

You don't need to decode it. You can manipulate it safely as it is:

var str = "http\x3a\x2f\x2fwww.url.com";
alert(str.charAt(4));  // :
alert("\x3a" === ":"); // true
alert(str.slice(0,7)); // http://