Why does PHP store uploaded files in a temporary location and what is the benefit?
Okay, I'm totally new to this field. While going through some tutorials, I found that when you upload files in PHP, it stores them in a temporary location first.
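$file_temp = $_FILES['file']['tmp_name'];
$file_name = basename($_FILES['file']['name']); // client-supplied name; basename() drops any path part
$file_loc = "Upload/" . $file_name;
move_uploaded_file($file_temp, $file_loc);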
Now, why doesn't PHP allow uploading files directly to the desired location? Why are they stored in a temporary location with a .tmp extension, and what benefit do we get from this strategy?
Solution 1:
Good question. The short answer is that PHP must process the entire HTTP request - filling out $_POST with data, and $_FILES as needed - before giving control to your script. Since your script doesn't gain control until after the processing, there's no way to tell PHP where to put that file data.
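To make that concrete, here is a minimal sketch of what the script sees once it finally gets control. The helper name describeUpload is made up for illustration; the field name 'file' is taken from the question. By the time this code runs, PHP has already parsed the request and, on success, written the file to tmp_name.

```php
<?php
// Hypothetical helper: summarise one $_FILES entry as PHP hands it to the script.
function describeUpload(array $entry): string
{
    // PHP fills in 'error' for every file field it parsed. UPLOAD_ERR_OK means
    // the data was fully received and stored at 'tmp_name' before the script started.
    $error = $entry['error'] ?? UPLOAD_ERR_NO_FILE;
    if ($error !== UPLOAD_ERR_OK) {
        return 'upload failed with error code ' . $error;
    }
    return sprintf('%d bytes already stored at %s', $entry['size'], $entry['tmp_name']);
}

// In a real request handler; on the command line $_FILES is simply empty.
if (isset($_FILES['file'])) {
    echo describeUpload($_FILES['file']), "\n";
}
```

The point is only that every key in the entry was set by PHP beforehand; the script never had a chance to influence where tmp_name points.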
But why does PHP do it this way? Well, let's look at an HTTP POST with file data:
POST /upload?upload_progress_id=12344 HTTP/1.1
Host: localhost:3000
Content-Length: 1325
Origin: http://localhost:3000
... other headers ...
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryePkpFF7tjBAqx29L

------WebKitFormBoundaryePkpFF7tjBAqx29L
Content-Disposition: form-data; name="MAX_FILE_SIZE"

100000
------WebKitFormBoundaryePkpFF7tjBAqx29L
Content-Disposition: form-data; name="uploadedfile"; filename="hello.o"
Content-Type: application/x-object

... contents of file goes here ...
------WebKitFormBoundaryePkpFF7tjBAqx29L--
Notice that the contents of the request are a multi-part encoded document, with form fields interspersed among file data. In this particular example, the form field occurs before the file data. However, it's possible - indeed likely - that form data occurs after file data.
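The ordering issue can be shown with a deliberately naive sketch. It is not how PHP's own parser works; listPartNames, the short boundary "XyZ" and the "description" field are all invented for this example. It only lists the part names in the order they appear in a body where a form field follows the file.

```php
<?php
// Hypothetical helper: list the part names of a multipart/form-data body, in order.
function listPartNames(string $body, string $boundary): array
{
    $names = [];
    // Every part follows a "--boundary" delimiter.
    foreach (explode('--' . $boundary, $body) as $part) {
        // The field name sits in the part's Content-Disposition header.
        if (preg_match('/Content-Disposition:\s*form-data;\s*name="([^"]*)"/i', $part, $m)) {
            $names[] = $m[1];
        }
    }
    return $names;
}

$boundary = 'XyZ';
$body = "--XyZ\r\n"
      . "Content-Disposition: form-data; name=\"uploadedfile\"; filename=\"hello.o\"\r\n"
      . "Content-Type: application/x-object\r\n\r\n"
      . "...file bytes...\r\n"
      . "--XyZ\r\n"
      . "Content-Disposition: form-data; name=\"description\"\r\n\r\n"
      . "a field that arrives after the file\r\n"
      . "--XyZ--\r\n";

print_r(listPartNames($body, $boundary));
```

Here the "description" field can only be found by reading past all of the file's bytes, which is why the whole body has to be consumed before $_POST is complete.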
So, in order to guarantee that PHP can give you all the $_POST data, PHP must process the entire request. So it might as well complete the $_FILES super-global while it's there.
Now, PHP could keep this file data in memory, but this might really be a bad idea. Think about what would happen if PHP needed to store a 100 MiB file a user uploaded. Suddenly, you've got a 100 MiB increase in the RSS of your Apache process, which is really not too good - Apache might be ulimit-ed to not have that much space, or Apache might get swapped, to your users' anguish. So, PHP does the next best thing: it puts the received data in a temporary file.
You might ask why PHP can't be told where to put the incoming file data in the first place, so that you wouldn't have to move it. Well, that's a bootstrapping problem: PHP hasn't handed control over to the script yet, so the script can't tell PHP where to put the file. Thus, PHP does the best it can: it puts the file data into a temporary file.
Now, you can keep this file data on a RAM disk for speed, if you want. This is a good approach if you don't mind the infrastructure cost (e.g., maintaining the RAM disk setup). But note this isn't like PHP holding it in RAM itself: in that scenario, the PHP container process (usually Apache or some other web server) must have the heap to hold the file (which it might not). In the RAM disk scenario, the memory is managed by the kernel.
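If you want to check where your setup currently puts uploads, something like the sketch below should do. The upload_tmp_dir setting is the php.ini directive for this; it has to be set in the server configuration (pointing it at a RAM-backed mount is a deployment decision), not from the script, for the same bootstrapping reason as above. The function name is made up.

```php
<?php
// Hypothetical helper: report the directory PHP uses for upload temp files.
function effectiveUploadTmpDir(): string
{
    // When upload_tmp_dir is not set, PHP falls back to the system temp directory.
    $configured = ini_get('upload_tmp_dir');
    if ($configured !== false && $configured !== '') {
        return $configured;
    }
    return sys_get_temp_dir();
}

echo effectiveUploadTmpDir(), "\n";
```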
Solution 2:
From What is the benefit of writing to a temp location, And then copying it to the intended destination?:
- On most platforms, file moves are atomic, but file writes are not (especially if you can't write all the data in one go). So if you have the typical producer/consumer pattern (one process produces files, the other watches a directory and picks up everything it finds), writing to a temp folder first and only then moving to the real location means the consumer can never see an unfinished file.
- If the process that writes the file dies halfway through, you have a broken file on your disk. If it's in a real location, you have to take care of cleaning it up yourself, but if it's in a temp location, the OS will take care of it. If the file happens to be created while a backup job is running, the job may pick up an incomplete file; temp directories are generally excluded from backups, so the file will only be included once moved to the final destination.
- The temp directory may be on a fast-but-volatile filesystem (e.g. a ramdisk), which can be beneficial for things like downloading several chunks of the same file in parallel, or doing in-place processing on the file with lots of seeks. Also, temp directories tend to cause more fragmentation than directories with less frequent reads, writes, and deletes, and keeping the temp directory on a separate partition can help keep fragmentation of the other partitions down.
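The write-then-move pattern from the first point could look roughly like this in PHP. This is a sketch under one big assumption: the temp directory and the destination are on the same filesystem, because a rename across filesystems turns into a copy and is no longer a single step. The function name, directory names and file name are invented for the demo.

```php
<?php
// Hypothetical helper: write $data under $tmpDir, then move it to $finalPath in one step.
// Assumes $tmpDir and the destination directory live on the same filesystem.
function publishAtomically(string $tmpDir, string $finalPath, string $data): bool
{
    $tmpPath = tempnam($tmpDir, 'part');
    if ($tmpPath === false) {
        return false;
    }
    // The write may take many system calls; no consumer watches $tmpDir, so that is fine.
    if (file_put_contents($tmpPath, $data) !== strlen($data)) {
        unlink($tmpPath); // remove the incomplete temp file
        return false;
    }
    // The move is the only moment the consumer's directory changes.
    if (!rename($tmpPath, $finalPath)) {
        unlink($tmpPath);
        return false;
    }
    return true;
}

// Demo with two scratch directories (names are illustrative).
$base = sys_get_temp_dir() . '/atomic-demo-' . getmypid();
foreach (['incoming', 'ready'] as $dir) {
    if (!is_dir("$base/$dir")) {
        mkdir("$base/$dir", 0700, true);
    }
}

$ok = publishAtomically("$base/incoming", "$base/ready/report.txt", "complete contents\n");
var_dump($ok);
```

A consumer polling the "ready" directory either sees no report.txt or the finished one, never a half-written file.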
Solution 3:
Two additional reasons:
- If you decide not to accept the file for some reason, you don't have to clean anything up: it's only in a temporary location, and PHP deletes the temporary file at the end of the request if it hasn't been moved or renamed.
- Security. Let's say PHP was set to upload to a web-accessible directory like /images. Someone could upload some sort of hacking file and then execute it. By putting files in a temporary directory first (which will usually not be web-accessible), PHP lets you examine the file before you accept it. For instance, you can re-process images to remove any comments that could contain PHP code.
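As a rough sketch of "examine first, then move": the check below looks at the temp file's actual contents and picks a server-generated name before anything lands in the public directory. It assumes the fileinfo extension is available; the whitelist, the helper name safeNameFor and the 'Upload/' target directory (mirroring the question's snippet) are illustrative choices, not a complete security policy.

```php
<?php
// Hypothetical whitelist: detected MIME type => extension to store the file under.
const ALLOWED_IMAGE_TYPES = ['image/png' => 'png', 'image/jpeg' => 'jpg', 'image/gif' => 'gif'];

// Return a safe, server-generated file name for the temp file, or null to reject it.
function safeNameFor(string $tmpPath): ?string
{
    // Sniff the type from the file contents; the client-supplied name and type are not trusted.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->file($tmpPath);
    if ($mime === false || !isset(ALLOWED_IMAGE_TYPES[$mime])) {
        return null;
    }
    return bin2hex(random_bytes(16)) . '.' . ALLOWED_IMAGE_TYPES[$mime];
}

// In a request handler (field name 'file' as in the question):
if (isset($_FILES['file']) && $_FILES['file']['error'] === UPLOAD_ERR_OK) {
    $name = safeNameFor($_FILES['file']['tmp_name']);
    if ($name !== null) {
        // move_uploaded_file() only moves files that PHP itself received as uploads.
        move_uploaded_file($_FILES['file']['tmp_name'], 'Upload/' . $name);
    }
    // A rejected file is just left alone; PHP removes the temp file when the request ends.
}
```

Because the stored name never comes from the client, an upload called "shell.php" can't end up under that name even if it passed the content check.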