How can I read UploadFile in FastAPI?

I have a FastAPI endpoint that receives a file, uploads it to S3, and then processes it. Everything works fine except for the processing step, which fails with this message:

  File "/usr/local/lib/python3.9/site-packages/starlette/datastructures.py", line 441, in read
    return self.file.read(size)
  File "/usr/local/lib/python3.9/tempfile.py", line 735, in read
    return self._file.read(*args)
ValueError: I/O operation on closed file.

My simplified code looks like this:

async def process(file: UploadFile):
    reader = csv.reader(iterdecode(file.file.read(), "utf-8"), dialect="excel")  # This fails!
    datarows = []
    for row in reader:
        datarows.append(row)
    return datarows

How can I read the contents of the uploaded file?

UPDATE

I managed to isolate the problem a bit more. Here's my simplified endpoint:

import boto3
from loguru import logger
from botocore.exceptions import ClientError


UPLOAD = True

@router.post("/")
async def upload(file: UploadFile = File(...)):
    if UPLOAD:
        # Upload the file
        s3_client = boto3.client("s3", endpoint_url="http://localstack:4566")
        try:
            s3_client.upload_fileobj(file.file, "local", "myfile.txt")
        except ClientError as e:
            logger.error(e)
    contents = await file.read()
    return JSONResponse({"message": "Success!"})

If UPLOAD is True, I get the error. If it's not, everything works fine. It seems boto3 is closing the file after uploading it. Is there any way I can reopen the file? Or send a copy to upload_fileobj?


Solution 1:

From the FastAPI documentation on File:

Import File and UploadFile from fastapi:

from fastapi import FastAPI, File, UploadFile

app = FastAPI()


@app.post("/files/")
async def create_file(file: bytes = File(...)):
    return {"file_size": len(file)}


@app.post("/uploadfile/")
async def create_upload_file(file: UploadFile = File(...)):
    return {"filename": file.filename}

From FastAPI UploadFile:

For example, inside of an async path operation function you can get the contents with:

contents = await myfile.read()

Applied to your code, you should have something like this (note that iterdecode comes from the codecs module and expects an iterable of byte chunks, while csv.reader expects an iterable of text lines):

import csv
from codecs import iterdecode

from fastapi import File, UploadFile


async def process(file: UploadFile = File(...)):
    content = await file.read()
    # Split into byte lines first: iterdecode then yields one decoded
    # text line at a time, which is what csv.reader expects
    reader = csv.reader(iterdecode(content.splitlines(), "utf-8"), dialect="excel")
    datarows = []
    for row in reader:
        datarows.append(row)
    return datarows
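A self-contained sketch of that parsing step (the sample CSV bytes stand in for the result of await file.read()):

```python
import csv
from codecs import iterdecode

# Sample bytes standing in for `await file.read()`
content = b"name,qty\napple,3\npear,5\n"

# Split the payload into byte lines, then decode each one:
# csv.reader receives one text line per iteration
reader = csv.reader(iterdecode(content.splitlines(), "utf-8"), dialect="excel")
datarows = list(reader)
print(datarows)  # [['name', 'qty'], ['apple', '3'], ['pear', '5']]
```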

Solution 2:

As per FastAPI's documentation, UploadFile uses Python's SpooledTemporaryFile, a "file stored in memory up to a maximum size limit, and after passing this limit it will be stored in disk". It "operates exactly as TemporaryFile", which "is destroyed as soon as it is closed (including an implicit close when the object is garbage collected)". It seems that, once the contents of the file have been read, the file gets closed, which, in turn, causes it to be deleted.
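That behaviour is easy to reproduce with SpooledTemporaryFile on its own; once the file is closed, any further read raises the same ValueError shown in the traceback above:

```python
from tempfile import SpooledTemporaryFile

f = SpooledTemporaryFile(max_size=1024)
f.write(b"hello")
f.seek(0)
print(f.read())  # b'hello'

f.close()  # the underlying storage is discarded here
try:
    f.read()
    msg = None
except ValueError as err:
    msg = str(err)
print(msg)  # I/O operation on closed file.
```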

Option 1 would be to read the file contents, as you already do (i.e., contents = await file.read()), and then upload these bytes to your server, instead of file object (if that's possible).
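For the second part of that option (sending a copy to upload_fileobj), wrapping the bytes in an io.BytesIO gives boto3 a disposable file object while your contents stay usable; alternatively, boto3's put_object accepts raw bytes directly as its Body argument. Below is a sketch of the copy idea, with a stub standing in for the boto3 call (the stub and sample bytes are illustrative assumptions):

```python
import io

def fake_upload_fileobj(fileobj):
    # Stand-in for s3_client.upload_fileobj: consumes and closes the
    # stream, mimicking the behaviour observed in the question
    size = len(fileobj.read())
    fileobj.close()
    return size

contents = b"name,qty\napple,3\n"  # what `await file.read()` would return

# Hand the upload a disposable copy; `contents` is untouched afterwards
uploaded = fake_upload_fileobj(io.BytesIO(contents))
print(uploaded, contents[:4])  # 17 b'name'
```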

Option 2 would be to copy the contents of the file into a NamedTemporaryFile, which, unlike TemporaryFile, "has a visible name in the file system" that "can be used to open the file". Additionally, it can remain accessible after it is closed by setting the delete parameter to False, thus allowing the file to be reopened when needed. Once you have finished with it, you can manually delete it using os.remove() or os.unlink(). Below is a working example (inspired by this answer):

import uvicorn
from fastapi import FastAPI, File, UploadFile
from tempfile import NamedTemporaryFile
import os


app = FastAPI()

@app.post("/uploadfile/")
async def upload_file(file: UploadFile = File(...)):
    contents = await file.read()

    file_copy = NamedTemporaryFile(delete=False)
    try:
        file_copy.write(contents)  # copy the received file data into a new temp file
        file_copy.seek(0)  # move to the beginning of the file
        print(file_copy.read(10))
        
        # Here, upload the file to your S3 service

    finally:
        file_copy.close()  # Remember to close any file instances before removing the temp file
        os.unlink(file_copy.name)  # unlink (remove) the file from the system's Temp folder
    
    # print(contents)  # Handle file contents as desired
    return {"filename": file.filename}
    

if __name__ == '__main__':
    uvicorn.run(app, host='127.0.0.1', port=8000)

Update

In case the file needs to be reopened before it is closed (e.g., while it is being uploaded to your server) and your platform does not allow that (as described here), use the following code instead (the temp file's path is accessed via file_copy.name):

@app.post("/uploadfile/")
async def upload_file(file: UploadFile = File(...)):
    contents = await file.read()

    file_copy = NamedTemporaryFile('wb', delete=False)
    f = None
    try:
        # The 'with' block ensures that the file closes and data are stored
        with file_copy as f:
            f.write(contents)
        
        # Here, upload the file to your S3 service
        # You can reopen the file as many times as desired. 
        f = open(file_copy.name, 'rb')
        print(f.read(10))

    finally:
        if f is not None:
            f.close() # Remember to close any file instances before removing the temp file
        os.unlink(file_copy.name)  # unlink (remove) the file from the system's Temp folder
    
    # Handle file contents as desired
    # print(contents)
    return {"filename": file.filename}
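
If uploads may be large, note that contents = await file.read() pulls the whole payload into memory; streaming the underlying file object into the temp copy with shutil.copyfileobj avoids that. Here is a sketch of just the copying step, with a BytesIO standing in for file.file (an illustrative assumption):

```python
import os
import shutil
from io import BytesIO
from tempfile import NamedTemporaryFile

source = BytesIO(b"x" * 1024)  # stands in for `file.file`

file_copy = NamedTemporaryFile(delete=False)
try:
    shutil.copyfileobj(source, file_copy)  # copies in chunks, not one big read()
    file_copy.seek(0)
    copied = len(file_copy.read())
    print(copied)  # 1024
finally:
    file_copy.close()
    os.unlink(file_copy.name)
```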