What is the fastest way to copy 400GB of files from an EC2 Elastic Block Store volume to S3?

There are several key factors that determine throughput from EC2 to S3:

  • File size - smaller files require more requests and more per-request overhead, so they transfer more slowly. The gain from increasing file size (when uploading from EC2) becomes negligible above about 256kB. (Transferring from a remote location, with higher latency, tends to keep showing appreciable improvements until somewhere between 1MiB and 2MiB.)
  • Number of parallel threads - a single upload thread usually has fairly low throughput, often below 5MiB/s. Throughput increases with the number of concurrent threads and tends to peak between 64 and 128 threads. Larger instances can sustain a greater number of concurrent threads (a rough way to measure this on your own instance is sketched after this list).
  • Instance size - as per the instance specifications, larger instances have more dedicated resources, including a larger (and less variable) allocation of network bandwidth and I/O in general (including reads from ephemeral/EBS disks, which are network attached). Typical values for each category are:
    • Very High: Theoretical: 10Gbps = 1250MB/s; Realistic: 8.8Gbps = 1100MB/s
    • High: Theoretical: 1Gbps = 125MB/s; Realistic: 750Mbps = 95MB/s
    • Moderate: Theoretical: 250Mbps; Realistic: 80Mbps = 10MB/s
    • Low: Theoretical: 100Mbps; Realistic: 10-15Mbps = 1-2MB/s
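
The per-thread and concurrency numbers above are easy to sanity-check yourself. Below is a rough sketch (not a rigorous benchmark) assuming boto3 is installed and AWS credentials are already configured; the bucket name is a placeholder and the thread counts are just sample points:

```python
# Rough throughput check: upload the same 1 MiB payload many times
# at different concurrency levels and report the aggregate rate.
# Assumes boto3 is installed, AWS credentials are configured, and
# "my-test-bucket" is replaced with a real bucket in the same region.
import os
import time
from concurrent.futures import ThreadPoolExecutor

import boto3

BUCKET = "my-test-bucket"        # placeholder bucket name
OBJECT_SIZE = 1024 * 1024        # 1 MiB - roughly where size stops mattering
OBJECTS_PER_RUN = 256

s3 = boto3.client("s3")          # boto3 clients are safe to share across threads
payload = os.urandom(OBJECT_SIZE)

def upload_one(i):
    s3.put_object(Bucket=BUCKET, Key=f"throughput-test/{i}", Body=payload)

for threads in (1, 8, 32, 64, 128):
    start = time.time()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        list(pool.map(upload_one, range(OBJECTS_PER_RUN)))
    elapsed = time.time() - start
    total_mib = OBJECTS_PER_RUN * OBJECT_SIZE / (1024 * 1024)
    print(f"{threads:>3} threads: {total_mib / elapsed:.1f} MiB/s aggregate")
```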

When transferring large amounts of data, it may be economical to use a cluster compute instance, since the gain in throughput (>10x) outweighs the difference in cost (2-3x).

While the above ideas are fairly intuitive (although the per-thread cap may not be), it is easy to find benchmarks backing them up. One particularly detailed one can be found here.

Using between 64 and 128 parallel (simultaneous) uploads of 1MB objects should saturate the 1Gbps uplink of an m1.xlarge and should even saturate the 10Gbps uplink of a cluster compute (cc1.4xlarge) instance.

While it is fairly easy to change instance size, the other two factors may be harder to manage.

  • File size is usually fixed - we cannot join files together on EC2 and have them split apart on S3 (so there isn't much we can do about small files). Large files, however, can be split on the EC2 side and reassembled on the S3 side using S3's multipart upload. This is typically advantageous for files larger than 100MB (see the boto3 sketch after this list).
  • Parallel threads are a bit harder to arrange. The simplest approach is to write a wrapper around an existing upload script that runs multiple copies of it at once; better approaches use the API directly to achieve the same thing. Keeping in mind that the key is parallel requests, it is not difficult to find several suitable scripts, for example:
    • s3cmd-modification - a fork of an early version of s3cmd that added this functionality, but hasn't been updated in several years.
    • s3-parallel-put - a reasonably recent Python script that works well
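
For illustration, here is a minimal sketch (not one of the scripts above) that combines both points using boto3: multipart uploads for large files, plus a thread pool to keep many files in flight at once. The bucket name, mount point, part size, and worker count are all placeholders to tune for your instance:

```python
# Minimal sketch: multipart uploads for large files plus many files
# uploaded in parallel. Bucket name and source directory are placeholders.
import os
from concurrent.futures import ThreadPoolExecutor

import boto3
from boto3.s3.transfer import TransferConfig

BUCKET = "my-backup-bucket"     # placeholder bucket
SOURCE_DIR = "/mnt/ebs-volume"  # placeholder EBS mount point

s3 = boto3.client("s3")

# Files above 100 MB are split into 100 MB parts, each uploaded with
# up to 10 concurrent part-uploads (S3 multipart upload).
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=100 * 1024 * 1024,
    max_concurrency=10,
)

def upload_file(path):
    key = os.path.relpath(path, SOURCE_DIR)
    s3.upload_file(path, BUCKET, key, Config=config)

# Walk the volume and upload many files at once; 64 workers is a
# starting point for a larger instance, use fewer on smaller ones.
paths = [
    os.path.join(root, name)
    for root, _, names in os.walk(SOURCE_DIR)
    for name in names
]
with ThreadPoolExecutor(max_workers=64) as pool:
    list(pool.map(upload_file, paths))
```

boto3's TransferConfig handles the splitting and reassembly of large files automatically, so the thread pool only has to decide how many files are uploaded concurrently.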

So, after a lot of testing, s3-parallel-put did the trick awesomely. It is clearly the solution if you need to upload a lot of files to S3. Thanks to cyberx86 for the comments.