how do I access XHR responseBody (for binary data) from Javascript in IE?

I've got a web page that uses XMLHttpRequest to download a binary resource.

In Firefox and Gecko I can use responseText to get the bytes, even if the bytestream includes binary zeroes. I may need to coerce the mimetype with overrideMimeType() to make that happen. In IE, though, responseText doesn't work, because it appears to terminate at the first zero. If you read 100,000 bytes, and byte 7 is a binary zero, you will be able to access only 7 bytes. IE's XMLHttpRequest exposes a responseBody property to access the bytes. I've seen a few posts suggesting that it's impossible to access this property in any meaningful way directly from Javascript. This sounds crazy to me.

xhr.responseBody is accessible from VBScript, so the obvious workaround is to define a method in VBScript in the webpage, and then call that method from Javascript. See jsdap for one example. EDIT: DO NOT USE THIS VBScript!!

var IE_HACK = (/msie/i.test(navigator.userAgent) && 
               !/opera/i.test(navigator.userAgent));   

// no no no!  Don't do this! 
if (IE_HACK) document.write('<script type="text/vbscript">\n\
     Function BinaryToArray(Binary)\n\
         Dim i\n\
         ReDim byteArray(LenB(Binary))\n\
         For i = 1 To LenB(Binary)\n\
             byteArray(i-1) = AscB(MidB(Binary, i, 1))\n\
         Next\n\
         BinaryToArray = byteArray\n\
     End Function\n\
</script>'); 

var xml = (window.XMLHttpRequest) 
    ? new XMLHttpRequest()      // Mozilla/Safari/IE7+
    : (window.ActiveXObject) 
      ? new ActiveXObject("MSXML2.XMLHTTP")  // IE6
      : null;  // Commodore 64?


xml.open("GET", url, true);
if (xml.overrideMimeType) {
    xml.overrideMimeType('text/plain; charset=x-user-defined');
} else {
    xml.setRequestHeader('Accept-Charset', 'x-user-defined');
}

xml.onreadystatechange = function() {
    if (xml.readyState == 4) {
        if (!binary) {
            callback(xml.responseText);
        } else if (IE_HACK) {
            // call a VBScript method to copy every single byte
            callback(BinaryToArray(xml.responseBody).toArray());
        } else {
            callback(getBuffer(xml.responseText));
        }
    }
};
xml.send('');

Is this really true? The best way? copying every byte? For a large binary stream that's not going to be very efficient.

There is also a possible technique using ADODB.Stream, which is a COM equivalent of a MemoryStream. See here for an example. It does not require VBScript but does require a separate COM object.

if (typeof (ActiveXObject) != "undefined" && typeof (httpRequest.responseBody) != "undefined") {
    // Convert httpRequest.responseBody byte stream to shift_jis encoded string
    var stream = new ActiveXObject("ADODB.Stream");
    stream.Type = 1; // adTypeBinary
    stream.Open ();
    stream.Write (httpRequest.responseBody);
    stream.Position = 0;
    stream.Type = 1; // adTypeBinary;
    stream.Read....          /// ???? what here
}

But that's not going to work well - ADODB.Stream is disabled on most machines these days.

In The IE8 developer tools - the IE equivalent of Firebug - I can see the responseBody is an array of bytes and I can even see the bytes themselves. The data is right there. I don't understand why I can't get to it.

Is it possible for me to read it with responseText?

hints? (other than defining a VBScript method)

Solution 1:

Yes, the answer I came up with for reading binary data via XHR in IE, is to use VBScript injection. This was distasteful to me at first, but, I look at it as just one more browser dependent bit of code. (The regular XHR and responseText works fine in other browsers; you may have to coerce the mime type with XMLHttpRequest.overrideMimeType(). This isn't available on IE).

This is how I got a thing that works like responseText in IE, even for binary data. First, inject some VBScript as a one-time thing, like this:

if(/msie/i.test(navigator.userAgent) && !/opera/i.test(navigator.userAgent)) {
    var IEBinaryToArray_ByteStr_Script =
    "<!-- IEBinaryToArray_ByteStr -->\r\n"+
    "<script type='text/vbscript' language='VBScript'>\r\n"+
    "Function IEBinaryToArray_ByteStr(Binary)\r\n"+
    "   IEBinaryToArray_ByteStr = CStr(Binary)\r\n"+
    "End Function\r\n"+
    "Function IEBinaryToArray_ByteStr_Last(Binary)\r\n"+
    "   Dim lastIndex\r\n"+
    "   lastIndex = LenB(Binary)\r\n"+
    "   if lastIndex mod 2 Then\r\n"+
    "       IEBinaryToArray_ByteStr_Last = Chr( AscB( MidB( Binary, lastIndex, 1 ) ) )\r\n"+
    "   Else\r\n"+
    "       IEBinaryToArray_ByteStr_Last = "+'""'+"\r\n"+
    "   End If\r\n"+
    "End Function\r\n"+
    "</script>\r\n";

    // inject VBScript
    document.write(IEBinaryToArray_ByteStr_Script);
}

The JS class I'm using that reads binary files exposes a single interesting method, readCharAt(i), which reads the character (a byte, really) at the i'th index. This is how I set it up:

// see doc on http://msdn.microsoft.com/en-us/library/ms535874(VS.85).aspx
function getXMLHttpRequest() 
{
    if (window.XMLHttpRequest) {
        return new window.XMLHttpRequest;
    }
    else {
        try {
            return new ActiveXObject("MSXML2.XMLHTTP"); 
        }
        catch(ex) {
            return null;
        }
    }
}

// this fn is invoked if IE
function IeBinFileReaderImpl(fileURL){
    this.req = getXMLHttpRequest();
    this.req.open("GET", fileURL, true);
    this.req.setRequestHeader("Accept-Charset", "x-user-defined");
    // my helper to convert from responseBody to a "responseText" like thing
    var convertResponseBodyToText = function (binary) {
        var byteMapping = {};
        for ( var i = 0; i < 256; i++ ) {
            for ( var j = 0; j < 256; j++ ) {
                byteMapping[ String.fromCharCode( i + j * 256 ) ] =
                    String.fromCharCode(i) + String.fromCharCode(j);
            }
        }
        // call into VBScript utility fns
        var rawBytes = IEBinaryToArray_ByteStr(binary);
        var lastChr = IEBinaryToArray_ByteStr_Last(binary);
        return rawBytes.replace(/[\s\S]/g,
                                function( match ) { return byteMapping[match]; }) + lastChr;
    };

    this.req.onreadystatechange = function(event){
        if (that.req.readyState == 4) {
            that.status = "Status: " + that.req.status;
            //that.httpStatus = that.req.status;
            if (that.req.status == 200) {
                // this doesn't work
                //fileContents = that.req.responseBody.toArray(); 

                // this doesn't work
                //fileContents = new VBArray(that.req.responseBody).toArray(); 

                // this works...
                var fileContents = convertResponseBodyToText(that.req.responseBody);

                fileSize = fileContents.length-1;
                if(that.fileSize < 0) throwException(_exception.FileLoadFailed);
                that.readByteAt = function(i){
                    return fileContents.charCodeAt(i) & 0xff;
                };
            }
            if (typeof callback == "function"){ callback(that);}
        }
    };
    this.req.send();
}

// this fn is invoked if non IE
function NormalBinFileReaderImpl(fileURL){
    this.req = new XMLHttpRequest();
    this.req.open('GET', fileURL, true);
    this.req.onreadystatechange = function(aEvt) {
        if (that.req.readyState == 4) {
            if(that.req.status == 200){
                var fileContents = that.req.responseText;
                fileSize = fileContents.length;

                that.readByteAt = function(i){
                    return fileContents.charCodeAt(i) & 0xff;
                }
                if (typeof callback == "function"){ callback(that);}
            }
            else
                throwException(_exception.FileLoadFailed);
        }
    };
    //XHR binary charset opt by Marcus Granado 2006 [http://mgran.blogspot.com] 
    this.req.overrideMimeType('text/plain; charset=x-user-defined');
    this.req.send(null);
}

The conversion code was provided by Miskun.

Very fast, works great.

I used this method to read and extract zip files from Javascript, and also in a class that reads and displays EPUB files in Javascript. Very reasonable performance. About half a second for a 500kb file.

Solution 2:

XMLHttpRequest.responseBody is a VBArray object containing the raw bytes. You can convert these objects to standard arrays using the toArray() function:

var data = xhr.responseBody.toArray();

Solution 3:

I would suggest two other (fast) options:

First, you can use ADODB.Recordset to convert the byte array into a string. I would guess that this object is more common that ADODB.Stream, which is often disabled for security reasons. This option is VERY fast, less than 30ms for a 500kB file.
Second, if the Recordset component is not accessible, there is a trick to access the byte array data from Javascript. Send your xhr.responseBody to VBScript, pass it through any VBScript string function such as CStr (takes no time), and return it to JS. You will get a weird string with bytes concatenated into 16-bit unicode (in reverse). You can then convert this string quickly into a usable bytestring through a regular expression with dictionary-based replacement. Takes about 1s for 500kB.

For comparison, the byte-by-byte conversion through loops takes several minutes for this same 500kB file, so it's a no-brainer :) Below the code I have been using, to insert into your header. Then call the function ieGetBytes with your xhr.responseBody.

<!--[if IE]>    
<script type="text/vbscript">

    'Best case scenario when the ADODB.Recordset object exists
    'We will do the existence test in Javascript (see after)
    'Extremely fast, about 25ms for a 500kB file
    Function ieGetBytesADO(byteArray)
        Dim recordset
        Set recordset = CreateObject("ADODB.Recordset")
        With recordset
            .Fields.Append "temp", 201, LenB(byteArray)
            .Open
            .AddNew
            .Fields("temp").AppendChunk byteArray
            .Update
        End With
        ieGetBytesADO = recordset("temp")
        recordset.Close
        Set recordset = Nothing
    End Function

    'Trick to return a Javascript-readable string from a VBScript byte array
    'Yet the string is not usable as such by Javascript, since the bytes
    'are merged into 16-bit unicode characters. Last character missing if odd length.
    Function ieRawBytes(byteArray)
        ieRawBytes = CStr(byteArray)
    End Function

    'Careful the last character is missing in case of odd file length
    'We Will call the ieLastByte function (below) from Javascript
    'Cannot merge directly within ieRawBytes as the final byte would be duplicated
    Function ieLastChr(byteArray)
        Dim lastIndex
        lastIndex = LenB(byteArray)
        if lastIndex mod 2 Then
            ieLastChr = Chr( AscB( MidB( byteArray, lastIndex, 1 ) ) )
        Else
            ieLastChr = ""
        End If
    End Function

</script>

<script type="text/javascript">
    try {   
        // best case scenario, the ADODB.Recordset object exists
        // we can use the VBScript ieGetBytes function to transform a byte array into a string
        var ieRecordset = new ActiveXObject('ADODB.Recordset');
        var ieGetBytes = function( byteArray ) {
            return ieGetBytesADO(byteArray);
        }
        ieRecordset = null;

    } catch(err) {
        // no ADODB.Recordset object, we will do the conversion quickly through a regular expression

        // initializes for once and for all the translation dictionary to speed up our regexp replacement function
        var ieByteMapping = {};
        for ( var i = 0; i < 256; i++ ) {
            for ( var j = 0; j < 256; j++ ) {
                ieByteMapping[ String.fromCharCode( i + j * 256 ) ] = String.fromCharCode(i) + String.fromCharCode(j);
            }
        }

        // since ADODB is not there, we replace the previous VBScript ieGetBytesADO function with a regExp-based function,
        // quite fast, about 1.3 seconds for 500kB (versus several minutes for byte-by-byte loops over the byte array)
        var ieGetBytes = function( byteArray ) {
            var rawBytes = ieRawBytes(byteArray),
                lastChr = ieLastChr(byteArray);

            return rawBytes.replace(/[\s\S]/g, function( match ) {
                return ieByteMapping[match]; }) + lastChr;
        }
    }
</script>
<![endif]-->

Convert pandas DataFrame to a nested dict

LinkedList put into Intent extra gets recast to ArrayList when retrieving in next activity

Generic parameter 'T' could not be inferred when using a guard statement to unwrap a generic function

Issues with pyenv-virtualenv: Python and PIP not changed when activating / deactivating virtual environment

unexpected result while using findIndex array method

Python comparison of enums?

Inside "row row-cols-*", width determination for "col" when there's sibling "col-*"

Using `itertools` to combine DataFrame columns

Laravel PDOException SQLSTATE[HY000] [1049] Unknown database 'forge'

Recursively flatten a deeply-nested mix of objects and arrays

How to find the lowest and highest date based on certain value in another column?

Enabling passphrase on an already existing private key