WiX ExePackage: Failed to send request to URL

Using WiX Burn v3.7.1224, I cannot acquire a remote payload through an ExePackage. Similar ExePackage elements usually work for me, so I suppose the problem is related to the specific URL I'm trying to download.

This specific "exe URL" is http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.02.02.exe

To be precise: the interactive page is at http://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-setup-3.02.02.exe and clicking the download anchor there "seems" to lead to the direct "exe URL" listed above. I say "seems" because I had to dig into the page source to find the final "exe URL", and this may be part of the problem.

Here is the WiX fragment of interest:

<Fragment>
    <util:RegistrySearch Id="TesseractLookup"
                         Variable="TESSERACT_REGVALUE"
                         Root="HKLM"
                         Key="SOFTWARE\Tesseract-OCR" 
                         Value="CurrentVersion" />   

    <PackageGroup Id="Tesseract">
        <ExePackage Compressed="no" 
                    PerMachine="yes" 
                    Permanent="yes" 
                    Vital="yes" 
                    Name="redist\tesseract-ocr-setup-3.02.02.exe"                    
                    InstallCondition="NOT TESSERACT_REGVALUE"
                    DetectCondition="TESSERACT_REGVALUE"
                    DownloadUrl="http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.02.02.exe">

            <RemotePayload Description="Tesseract-OCR - open source OCR engine" 
                           Hash="35C61604AAAC961C24CD28F959566B2E39244541" 
                           ProductName="Tesseract-OCR" 
                           Size="13525781"
                           Version="3.02.02.0" />
        </ExePackage>
    </PackageGroup>        
</Fragment>

The download succeeds in a matter of seconds with the browsers I tried (Firefox and Internet Explorer) and with a basic "wget" command, but it fails with Burn. All were given the same "exe URL". I even tried with the firewall and antivirus software disabled, just in case, but to no avail.

Do you have any hint about what might be happening?

Here are the relevant lines from the install log:

[27F8:1FE8][2013-03-07T08:36:46]w343: Prompt for source of package: tesseract_ocr_setup_3.02.02.exe, payload: tesseract_ocr_setup_3.02.02.exe, path: D:\soft\audiveris\dist\redist\tesseract-ocr-setup-3.02.02.exe
[27F8:1FE8][2013-03-07T08:36:46]i338: Acquiring package: tesseract_ocr_setup_3.02.02.exe, payload: tesseract_ocr_setup_3.02.02.exe, download from: http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.02.02.exe
[27F8:1FE8][2013-03-07T08:36:47]e000: Error 0x80070002: Failed to send request to URL: http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.02.02.exe
[27F8:1FE8][2013-03-07T08:36:47]e000: Error 0x80070002: Failed to connect to URL: http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.02.02.exe
[27F8:1FE8][2013-03-07T08:36:47]e000: Error 0x80070002: Failed to get size and time for URL: http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.02.02.exe
[27F8:1FE8][2013-03-07T08:36:47]e000: Error 0x80070002: Failed attempt to download URL: 'http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.02.02.exe' to: 'C:\Users\herve\AppData\Local\Temp\{7715fbb6-5bc5-442f-86a0-655fa082bd7d}\tesseract_ocr_setup_3.02.02.exe'
[27F8:1FE8][2013-03-07T08:36:47]e000: Error 0x80070002: Failed to acquire payload from: 'http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.02.02.exe' to working path: 'C:\Users\herve\AppData\Local\Temp\{7715fbb6-5bc5-442f-86a0-655fa082bd7d}\tesseract_ocr_setup_3.02.02.exe'

Solution 1:

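Following Rob's suggestion (thanks!), I installed Fiddler and investigated. The HTTP request that Burn sends is actually a HEAD request, and the server returns 404. That is consistent with error 0x80070002 in the log, which is the HRESULT form of "file not found".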

To see what a successful exchange looks like in Fiddler, I downloaded another package from another site, still with Burn: it is a HEAD request followed by a GET request. This makes sense. To my limited knowledge, HEAD is like GET except that the server returns only the headers and no content, so it can be used to learn things such as the file size and date before launching the actual transfer.
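To make the HEAD-then-GET pattern concrete, here is a minimal Python sketch that runs against a throwaway local server. It is not Burn's code, only an illustration of the sequence Fiddler showed. The handler, the `/setup.exe` path, and the 1024-byte payload are hypothetical stand-ins.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = b"x" * 1024  # hypothetical stand-in for the installer bytes


class PayloadHandler(BaseHTTPRequestHandler):
    """Well-behaved server: HEAD and GET send the same status and headers."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()  # headers only, no body

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(method, port, path):
    """Send one request and return (status, Content-Length header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path)
        response = conn.getresponse()
        return response.status, response.getheader("Content-Length"), response.read()
    finally:
        conn.close()


def demo_head_then_get():
    """Return (size announced by HEAD, number of bytes received by GET)."""
    server = HTTPServer(("127.0.0.1", 0), PayloadHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        _, announced, head_body = fetch("HEAD", port, "/setup.exe")
        assert head_body == b""  # a HEAD response carries no content
        _, _, body = fetch("GET", port, "/setup.exe")
        return int(announced), len(body)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo_head_then_get())
```

With a server that treats both methods consistently, the size announced by HEAD matches the number of bytes that GET delivers.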

Then, going back to the Tesseract site, I used Fiddler to compose a brand new HEAD request as follows:

HEAD http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.02.02.exe HTTP/1.1
Accept: */*
User-Agent: Burn
Host: tesseract-ocr.googlecode.com
Connection: Keep-Alive
Pragma: no-cache

The response was "HTTP/1.1 404 Not Found", just like the response to the initial Burn request.

I then composed a GET request (just replacing "HEAD" with "GET"):

GET http://tesseract-ocr.googlecode.com/files/tesseract-ocr-setup-3.02.02.exe HTTP/1.1
Accept: */*
User-Agent: Burn
Host: tesseract-ocr.googlecode.com
Connection: Keep-Alive
Pragma: no-cache

And this time I got "HTTP/1.1 200 OK", followed by the 13525781 bytes of content, which matches the Size declared in the RemotePayload element... BINGO!
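The mismatch between the two responses can be reproduced locally. The Python sketch below is only an illustration: `HeadRejectingHandler` is a hypothetical server that deliberately answers 404 to HEAD and 200 to GET, and the path is made up. It says nothing about why the Google Code server behaves this way; it only shows that the two methods are handled independently and compares their status codes for the same URL.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = b"x" * 1024  # hypothetical stand-in for the installer bytes


class HeadRejectingHandler(BaseHTTPRequestHandler):
    """Hypothetical server: serves GET normally but answers 404 to HEAD."""

    def do_HEAD(self):
        self.send_error(404)

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def status_for(method, port, path):
    """Send one request with Burn-like headers and return the HTTP status."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path, headers={"User-Agent": "Burn", "Accept": "*/*"})
        response = conn.getresponse()
        response.read()  # drain the body, if any
        return response.status
    finally:
        conn.close()


def compare_head_and_get(path="/files/setup.exe"):
    """Return a dict mapping each method to the status it received."""
    server = HTTPServer(("127.0.0.1", 0), HeadRejectingHandler)  # any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return {method: status_for(method, port, path) for method in ("HEAD", "GET")}
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(compare_head_and_get())
```

Pointing `status_for` at a real host instead of the local server would be a quick way to check whether a given download URL is likely to trip Burn, without running a full bundle.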

So now, we are left with:

  1. How can a server respond with 404 to a HEAD request but respond correctly to the equivalent GET request?

  2. If the Tesseract Google Code site has problems with HEAD requests, is there a way to tell Burn to skip the HEAD request and proceed directly with a GET request?

  3. Any other fix?