AWS Lambda and S3 - uploaded pdf file is blank

I have a pretty simple function that uploads a PDF file to AWS S3 (https://codedestine.com/aws-s3-putobject-java/) using AWS Lambda with Amazon API Gateway.

I try to upload a PDF file that has 2 pages of text. After the upload, the PDF file (on AWS S3) has 2 blank pages.

This is the method I use to upload the PDF file to AWS S3.

public static void uploadFile2(MultipartFile mpFile, String fileName) throws IOException {

    // On Lambda, the temp directory (/tmp) is the writable location
    String dirPath = System.getProperty("java.io.tmpdir", "/tmp");
    File file = new File(dirPath, fileName);

    // try-with-resources closes the stream before the upload starts
    try (OutputStream ops = new FileOutputStream(file)) {
        ops.write(mpFile.getBytes());
    }

    s3client.putObject("fakebucketname", fileName, file);
}

Why is the uploaded PDF file blank?


It turns out that the following does the trick. It's all about encoding, thanks to the help of @KunLun. In my scenario, file is the multipart file (PDF) that is passed to AWS via a POST to the URL.

  • The server gets a file containing a byte like this -> 0010 (this is not interpreted correctly, because a standard byte has 8 bits).
  • So we encode it in Base64 (the intermediate result doesn't matter) and decode it again to get a standard byte -> 0000 0010 (now this is a standard byte and is interpreted correctly by AWS).
  • This source helped a lot as well --> https://www.javaworld.com/article/3240006/base64-encoding-and-decoding-in-java-8.html?page=2

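    // Encode the uploaded multipart file's bytes as Base64
    Base64.Encoder enc = Base64.getEncoder();
    byte[] encbytes = enc.encode(file.getBytes());

    // Debug output: print the Base64 text, broken up with spaces
    for (int i = 0; i < encbytes.length; i++)
    {
        System.out.printf("%c", (char) encbytes[i]);
        if (i != 0 && i % 4 == 0)
            System.out.print(' ');
    }

    // Decode back to raw bytes and wrap them in a stream for the upload
    Base64.Decoder dec = Base64.getDecoder();
    byte[] barray2 = dec.decode(encbytes);
    InputStream fis = new ByteArrayInputStream(barray2);

    // data is the ObjectMetadata for the upload (its setup is not shown here)
    PutObjectResult objectResult = s3client.putObject("xxx",
            file.getOriginalFilename(), fis, data);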
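
To look at the encode/decode step in isolation, here is a minimal sketch that uses only java.util.Base64. The class name Base64RoundTrip and the sample bytes are made up for illustration and are not part of the code above. It compares the decoded array with the input, which is a quick way to check that the bytes handed to putObject are the same bytes the server received.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64RoundTrip {

    // Encodes the input as Base64 and decodes it straight back to raw bytes.
    static byte[] roundTrip(byte[] input) {
        byte[] encoded = Base64.getEncoder().encode(input);
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // Hypothetical sample: a PDF-like header followed by non-ASCII bytes
        byte[] sample = {'%', 'P', 'D', 'F', '-', (byte) 0x02, (byte) 0xE2, (byte) 0xFF};
        byte[] result = roundTrip(sample);

        // Compare the decoded bytes with the original input
        System.out.println(Arrays.equals(sample, result)); // prints "true"
    }
}
```

If this comparison ever fails in your own handler, the damage happened before the bytes reached this step.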


Another very important piece: the API Gateway settings must be properly configured to support binary data types. In the AWS Console, go to API Gateway --> Settings and add the binary media types there (the attached photo shows what I included).
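
As a rough illustration of why binary support matters, the sketch below simulates a layer that treats a binary body as UTF-8 text and then turns it back into bytes. This is an assumption about what can go wrong between the client and the Lambda, not something confirmed for this exact setup. The class name BinaryAsTextDemo and the sample bytes are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryAsTextDemo {

    // Simulates a layer that decodes a binary body as UTF-8 text
    // and then re-encodes that text to bytes.
    static byte[] throughText(byte[] body) {
        String asText = new String(body, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample: ASCII header followed by bytes that are not valid UTF-8
        byte[] binary = {'%', 'P', 'D', 'F', (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        byte[] mangled = throughText(binary);

        // Invalid sequences are replaced during decoding, so the bytes no longer match
        System.out.println(Arrays.equals(binary, mangled)); // prints "false"

        // Pure ASCII content survives the same trip unchanged
        byte[] ascii = "%PDF-1.4".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.equals(ascii, throughText(ascii))); // prints "true"
    }
}
```

This would be consistent with a PDF whose text-based page structure survives while its binary content streams are damaged, which shows up as blank pages.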


You're using an output stream as input to the upload request. Just use File, and include the content type, for example:

File file = new File(fileName);
PutObjectRequest request = new PutObjectRequest("bucketname", "keyname", file);
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentType("application/pdf");
request.setMetadata(metadata);
s3Client.putObject(request);
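
The snippet above assumes the file already exists on disk. As a minimal sketch of that step, assuming the uploaded content is available as a byte array, the helper below writes it to a temporary file using only the standard library. TempFileWriter and writeToTemp are hypothetical names, not part of the AWS SDK.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TempFileWriter {

    // Writes the bytes to a new file in the default temp directory and returns it.
    static File writeToTemp(byte[] content, String suffix) throws IOException {
        Path tmp = Files.createTempFile("upload-", suffix);
        Files.write(tmp, content); // opens, writes, and closes the file
        return tmp.toFile();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample content standing in for the uploaded PDF bytes
        byte[] content = {'%', 'P', 'D', 'F', '-', (byte) 0xE2, (byte) 0xFF};
        File file = writeToTemp(content, ".pdf");

        // The size on disk should match the number of bytes received
        System.out.println(file.length() == content.length); // prints "true"

        Files.delete(file.toPath());
    }
}
```

The returned File can then be passed to PutObjectRequest as in the snippet above.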