Upload file bigger than 40MB to Google App Engine?

I am creating a Google App Engine web app to "transform" files ranging from 10 KB to 50 MB.

Scenario:

  1. User opens http://fixdeck.appspot.com in web browser
  2. User clicks on "Browse", selects a file, and submits
  3. Servlet loads file as an InputStream
  4. Servlet transforms file
  5. Servlet saves file as an OutputStream
  6. The user's browser receives the transformed file and asks where to save it, directly as a response to the request in step 2

(For now I did not implement step 4, the servlet sends the file back without transforming it.)

Problem: It works for 15MB files but not for a 40MB file, which fails with: "Error: Request Entity Too Large. Your client issued a request that was too large."

Is there any workaround against this?

Source code: https://github.com/nicolas-raoul/transdeck
Rationale: http://code.google.com/p/ankidroid/issues/detail?id=697


Solution 1:

GAE has a hard limit of 32MB for HTTP requests and HTTP responses. That limits the size of uploads and downloads sent directly to or from a GAE app.

Revised Answer (Using Blobstore API.)

Google provides the Blobstore API for handling larger files in GAE (up to 2GB). The overview documentation provides complete sample code. Your web form uploads the file to the Blobstore. The Blobstore API then rewrites the POST back to your servlet, where you can do your transformation and save the transformed data back into the Blobstore (as a new blob).

Original Answer (Didn't Consider Blobstore as an option.)

For downloading, I think the only workaround on GAE would be to break the file up into multiple parts on the server, and then reassemble them after downloading. That's probably not doable using a straight browser implementation though.
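To make the split-and-reassemble idea concrete, here is a minimal sketch in plain Java. `PartSplitter` is a hypothetical helper, not part of any GAE API, and it works on in-memory byte arrays purely for illustration; a real app would stream the parts and choose a part size safely below the response limit.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a GAE API): splits a payload into parts
// that each fit under a size limit, and joins them back together.
final class PartSplitter {

    // Returns consecutive slices of data, each at most maxPartSize bytes.
    static List<byte[]> split(byte[] data, int maxPartSize) {
        if (maxPartSize <= 0) {
            throw new IllegalArgumentException("maxPartSize must be positive");
        }
        List<byte[]> parts = new ArrayList<>();
        int offset = 0;
        while (offset < data.length) {
            // The last part may be shorter than maxPartSize.
            int length = Math.min(maxPartSize, data.length - offset);
            parts.add(Arrays.copyOfRange(data, offset, offset + length));
            offset += length;
        }
        return parts;
    }

    // Concatenates the parts in list order to rebuild the original payload.
    static byte[] join(List<byte[]> parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] part : parts) {
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }
}
```

With a 32MB part size, the 40MB file from the question would come out as two parts. The client side still needs something that can fetch each part and call the equivalent of `join`, which is the piece a plain browser download does not give you.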

(As an alternative design, perhaps you could send the transformed file from GAE to an external download location (such as S3), where the browser could download it without the GAE limit restrictions. I don't believe GAE-initiated connections have the same request/response size limitations, but I'm not positive. Regardless, you would still be restricted by the 30-second maximum request time. To get around that, you'd have to look into GAE Backend instances and come up with some sort of asynchronous download strategy.)

For uploading larger files, I've read about the possibility of using the HTML5 File APIs to slice the file into multiple chunks for uploading, and then reconstructing it on the server. Example: http://www.html5rocks.com/en/tutorials/file/dndfiles/#toc-slicing-files . However, I don't know how practical a solution that really is, due to changing specifications and browser capabilities.
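The server-side half of that chunked upload could look roughly like the sketch below. `ChunkAssembler` is a hypothetical class, and it assumes the client sends a chunk index and the total chunk count with each request (neither is defined by the File API itself). It keeps chunks in memory only to show the reassembly logic; on GAE, consecutive requests may land on different instances, so the chunks would have to live in shared storage instead.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical server-side collector for chunks that may arrive in any order.
final class ChunkAssembler {
    private final int totalChunks;
    // TreeMap keeps chunks sorted by index, so assembly order is correct.
    private final Map<Integer, byte[]> chunks = new TreeMap<>();

    ChunkAssembler(int totalChunks) {
        if (totalChunks <= 0) {
            throw new IllegalArgumentException("totalChunks must be positive");
        }
        this.totalChunks = totalChunks;
    }

    // Stores one chunk; re-sending the same index overwrites the earlier copy.
    void addChunk(int index, byte[] data) {
        if (index < 0 || index >= totalChunks) {
            throw new IllegalArgumentException("chunk index out of range: " + index);
        }
        chunks.put(index, data.clone());
    }

    // True once every index from 0 to totalChunks - 1 has been received.
    boolean isComplete() {
        return chunks.size() == totalChunks;
    }

    // Concatenates all chunks in index order; fails if any are missing.
    byte[] assemble() {
        if (!isComplete()) {
            throw new IllegalStateException("missing chunks");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks.values()) {
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }
}
```

Each upload request would call `addChunk`, and the request that makes `isComplete()` return true would trigger the transformation on the result of `assemble()`.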

Solution 2:

You can use the blobstore to upload files as large as 2 gigabytes.