Remember and Repopulate File Input [duplicate]
Note:
The answer(s) below reflect the state of legacy browsers in 2009. As of 2017, you can actually set the value of the file input element via JavaScript.
See the answer in this question for details as well as a demo:
How to set file input value programatically (i.e.: when drag-dropping files)?
I have a website that allows the user to upload a file multiple times for processing. At the moment I have a single file input but I want to be able to remember the user's choice and show it on the screen.
After a user selects a file, I want to remember their choice and redisplay the file input with that file pre-selected when the page reloads. All I need to know is how to remember and repopulate a file input.
I am also open to approaches that don't use a file input (if that is possible).
I am using jQuery.
Ok, you want to "Remember and Repopulate File Input", "remember their choice and redisplay the file input with the file pre-selected on reload of the page".
And in the comment to my previous answer you state that you're not really open to alternatives: "Sorry but no Flash and Applets, just javscript and/or file input, possibly drag and drop."
I noticed while browsing (quite some) duplicate questions (1, 2, 3, etc.), that virtually all other answers are along the lines of: "No you can't, that would be a security-issue", optionally followed by a simple conceptual or code example outlining the security-risk.
However, someone stubborn as a mule (not necessarily a bad thing up to a certain level) might perceive those answers as: "No, because I said so", which is indeed something different from: "No, and here are the specs that disallow it".
So this is my third and last attempt to answer your question (I guided you to the watering hole, I led you to the river, now I'm pushing you to the source, but I can't make you drink).
Edit 3:
What you want to do was actually once described/'suggested' in RFC1867 Section 3.4:
The VALUE attribute might be used with <INPUT TYPE=file> tags for a default file name. This use is probably platform dependent. It might be useful, however, in sequences of more than one transaction, e.g., to avoid having the user prompted for the same file name over and over again.
And indeed, the HTML 4.01 spec section 17.4.1 specifies that:
User agents may use the value of the value attribute as the initial file name.
(By 'User agents' they mean 'browsers').
Given the facts that javascript can both modify and submit a form (including a file-input) and one could use css to hide forms/form-elements (like the file-input), the above statements alone would make it possible to silently upload files from a user's computer without his intention/knowledge.
It is clearly extremely important that this is not possible, and as such, RFC1867 (quoted above) states in section 8, Security Considerations:
It is important that a user agent not send any file that the user has not explicitly asked to be sent. Thus, HTML interpreting agents are expected to confirm any default file names that might be suggested with <INPUT TYPE=file VALUE="yyyy">.
However, the only browser (that I'm aware of) that ever implemented this feature was Opera (some older versions): it accepted <input type="file" value="C:\foo\bar.txt"> or a value set by javascript (elm_input_file.value='c:\\foo\\bar.txt';).
When this file-box was unchanged upon form-submit, Opera would pop up a security window informing the user of what file(s) were about to be uploaded to what location (url/webserver).
Now one might argue that all other browsers were in violation of the spec, but that would be wrong, since the spec stated "may" (it did not say "must") ".. use value attribute as the initial file name".
And, if the browser doesn't accept setting the file-input value (aka, having that value just be 'read-only') then the browser also would not need to pop up such a 'scary' and 'difficult' security pop-up (that might not even serve its purpose if the user didn't understand it (and/or was 'conditioned' to always click 'OK')).
Let's fast-forward to HTML 5 then..
Here all this ambiguity is cleared up (yet it still takes some puzzling):
Under 4.10.7.1.18 File Upload state we can read in the bookkeeping details:
- The value IDL attribute is in mode filename.
...
- The element's value attribute must be omitted.
So, a file-input's value attribute must be omitted, yet it also operates in some kind of 'mode' called 'filename' which is described in 4.10.7.4 Common input element APIs:
The value IDL attribute allows scripts to manipulate the value of an input element. The attribute is in one of the following modes, which define its behavior:
skipping to this 'mode filename':
On getting, it must return the string "C:\fakepath\" followed by the filename of the first file in the list of selected files, if any, or the empty string if the list is empty. On setting, if the new value is the empty string, it must empty the list of selected files; otherwise, it must throw an InvalidStateError exception.
Let me repeat that: "it must throw an InvalidStateError exception" if one tries to set a file-input value to a string that is not empty !!! (But one can clear the input-field by setting its value to an empty string.)
Thus, currently and in the foreseeable HTML5 future (and in the past, except Opera), only the user can populate a file-input (via the browser or os-supplied 'file-chooser'). One can not (re-)populate the file-input to a file/directory with javascript or by setting the default value.
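A minimal sketch of what that rule looks like from script, assuming a browser that follows the HTML5 text quoted above. The helper names trySetFileInputValue and clearFileInput are made up for illustration, and input stands for any <input type="file"> element you already hold a reference to.

```javascript
// Try to assign a path to a file input and report what happened.
// A spec-compliant browser rejects any non-empty string with an
// InvalidStateError; some legacy browsers ignore the assignment silently.
function trySetFileInputValue(input, path) {
  try {
    input.value = path;
  } catch (err) {
    return { accepted: false, error: err.name };
  }
  // No exception was thrown: check whether the value really changed.
  return { accepted: input.value === path, error: null };
}

// Clearing is the one assignment the spec allows: the empty string
// empties the list of selected files.
function clearFileInput(input) {
  input.value = '';
  return input.value === '';
}
```

In a page you would call these with something like document.getElementById('inFile'); either way the script can only empty the selection, never choose a file on the user's behalf.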
Getting the filename/file-path
Now, suppose it was not impossible to (re-)populate a file-input with a default value, then obviously you'd need the full path: directory + filename(+ extension).
In the past, some browsers, most notably IE6 (up to IE8), did reveal the full path+filename as value: just a simple alert( elm_input_file.value ); etc. in javascript AND the browser also sent this full path+filename(+ extension) to the receiving server on form-submit.
Note: some browsers also have a 'file or fileName' attribute (usually sent to the server) but obviously this would not include a path..
That is a realistic security/privacy risk: a malicious website (owner/exploiter) could obtain the path to a user's home-directory (where personal stuff, accounts, cookies, user-portion of registry, history, favorites, desktop etc. is located in known constant locations) when the typical non-tech windows-user uploads his files from: C:\Documents and Settings\[UserName]\My Documents\My Pictures\kinky_stuff\image.ext.
I did not even talk about the risks while transmitting the data (even 'encrypted' via https) or 'safe' storage of this data!
As such, more and more alternative browsers were starting to follow one of the oldest proven security-measures: share information on a need-to-know basis.
And since the vast majority of websites do not need to know the file-path, those browsers only revealed the filename(+ extension).
By the time IE8 was released, MS decided to follow the competition and added a URLAction option, called “Include local directory path when uploading files”, which was set to 'disabled' for the general internet-zone (and 'enabled' in the trusted zone) by default.
This change caused some havoc (mostly in 'optimized for IE' environments) where all kinds of both custom code and proprietary 'controls' couldn't get the filename of files that were uploaded: they were hard-coded to expect a string containing a full path and extract the part after the last backslash (or forward slash if you were lucky...). 1, 2
Along came HTML5, and as you have read above, the 'mode filename' specifies:
On getting, it must return the string "C:\fakepath\" followed by the filename of the first file in the list of selected files, if any, or the empty string if the list is empty.
and they note that
This "fakepath" requirement is a sad accident of history
and
For historical reasons, the value IDL attribute prefixes the filename with the string "C:\fakepath\". Some legacy user agents actually included the full path (which was a security vulnerability). As a result of this, obtaining the filename from the value IDL attribute in a backwards-compatible way is non-trivial. The following function extracts the filename in a suitably compatible manner:
function extractFilename(path) {
  if (path.substr(0, 12) == "C:\\fakepath\\")
    return path.substr(12); // modern browser
  var x;
  x = path.lastIndexOf('/');
  if (x >= 0) // Unix-based path
    return path.substr(x+1);
  x = path.lastIndexOf('\\');
  if (x >= 0) // Windows-based path
    return path.substr(x+1);
  return path; // just the filename
}
Note: I think this function is stupid: the whole point is to always have a fake windows-path to parse.. So the first 'if' is not only useless but even invites a bug: imagine a user with an older browser that uploads a file from: c:\fakepath\Some folder\file.ext (as it would return: Some folder\file.ext)...
I would simply use:
function extractFilename(s){
  // returns string containing everything from the end of the string
  // that is not a back/forward slash or an empty string on error
  // so one can check if return_value===''
  return (typeof s==='string' && (s=s.match(/[^\\\/]+$/)) && s[0]) || '';
}
(as the HTML5 spec clearly intended).
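To make the behaviour concrete, here is a small usage sketch of the one-liner above, run against the kinds of values the different browser generations might hand back. The sample paths are invented for illustration; the function body is the same as above, repeated only so the snippet runs on its own.

```javascript
// Same one-liner as above, repeated so the snippet is self-contained.
function extractFilename(s) {
  return (typeof s === 'string' && (s = s.match(/[^\\\/]+$/)) && s[0]) || '';
}

// Hypothetical values of elm_input_file.value in different browsers.
var samples = [
  'C:\\fakepath\\report.pdf',                   // HTML5 "fakepath" form
  'C:\\Documents and Settings\\me\\report.pdf', // legacy full Windows path
  '/home/me/report.pdf',                        // legacy full Unix path
  'report.pdf',                                 // bare filename
  '',                                           // nothing selected
  null                                          // not a string at all
];

var names = samples.map(extractFilename);
console.log(names);
// → [ 'report.pdf', 'report.pdf', 'report.pdf', 'report.pdf', '', '' ]

// The case that trips up the spec's version is handled too:
console.log(extractFilename('c:\\fakepath\\Some folder\\file.ext'));
// → file.ext
```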
Let's recap (getting the path/file name):
- older browsers (and newer browsers where one could enable this as an option like IE>=8) will reveal a full windows/unix path
- less older browsers will not reveal any path, just a filename(+extension)
- current/future/HTML5-compliant browsers will always prepend the string C:\fakepath\ to the filename when getting the file-input's value
On top of that, they will only return the first filename (from a 'list of selected files') should the file-input accept multiple files and the user has selected multiple files.
Thus, in the recent past, currently and in the foreseeable HTML5 future one will usually only get the file-name.
That brings us to the last thing we need to examine, this 'list of selected files' / multiple-files, which leads us to the third part of the puzzle:
(HTML5) File API
First of all: the 'File API' should not be confused with the 'File System API'. Here is the abstract of the File System API:
This specification defines an API to navigate file system hierarchies, and defines a means by which a user agent may expose sandboxed sections of a user's local filesystem to web applications. It builds on [FILE-WRITER-ED], which in turn built on [FILE-API-ED], each adding a different kind of functionality.
The 'sandboxed sections of a user's local filesystem' already clearly indicates that one can't use this to get a hold of user-files outside of the sandbox (so not relevant to the question, although one could copy the user-selected file to the persistent local storage and re-upload that copy using AJAX etc. Useful as a 'retry' on failed upload.. But it wouldn't be a pointer to the original file that might have changed in the mean-time).
Even more important is the fact that only webkit (think older versions of chrome) implemented this feature, and the spec is most probably not going to survive: it is no longer actively maintained and has been abandoned for the moment, as it didn't get any significant traction.
Let's continue with the 'File API'; its abstract tells us:
This specification provides an API for representing file objects in web applications, as well as programmatically selecting them and accessing their data. This includes:
- A FileList interface, which represents an array of individually selected files from the underlying system. The user interface for selection can be invoked via <input type="file">, i.e. when the input element is in the File Upload state [HTML].
- A Blob interface, which represents immutable raw binary data, and allows access to ranges of bytes within the Blob object as a separate Blob.
- A File interface, which includes readonly informational attributes about a file such as its name and the date of the last modification (on disk) of the file.
- A FileReader interface, which provides methods to read a File or a Blob, and an event model to obtain the results of these reads.
- A URL scheme for use with binary data such as files, so that they can be referenced within web applications.
So, FileList can be populated by an input field in file-mode: <input type="file">.
That means that all of the above about the value-attribute still applies!
When an input field is in file-mode, it gets a read-only attribute files, which is an array-like FileList object that references the input-element's user-selected file(s) and is accessible through the FileList interface.
Did I mention that the files attribute of the type FileList is read-only (File API section 5.2)?
The HTMLInputElement interface [HTML] has a readonly attribute of type FileList...
Well, what about drag and drop?
From the mdn-documentation - Selecting files using drag and drop
The real magic happens in the drop() function:
function drop(e) {
  e.stopPropagation();
  e.preventDefault();

  var dt = e.dataTransfer;
  var files = dt.files;

  handleFiles(files);
}
Here, we retrieve the dataTransfer field from the event, then pull the file list out of it, passing that to handleFiles(). From this point on, handling the files is the same whether the user used the input element or drag and drop.
So, (just like the input-field type="file",) the event's dataTransfer attribute has an attribute files, which is an array-like FileList object, and we have just learned (above) that the FileList is read-only..
The FileList contains references to the file(s) that a user selected (or dropped on a drop-target) and some attributes. From the File API Section 7.2 File Attributes we can read:
name
The name of the file; on getting, this must return the name of the file as a string. There are numerous file name variations on different systems; this is merely the name of the file, without path information. On getting, if user agents cannot make this information available, they must return the empty string.
lastModifiedDate
The last modified date of the file. On getting, if user agents can make this information available, this must return a new Date[HTML] object initialized to the last modified date of the file. If the last modification date and time are not known, the attribute must return the current date and time as a Date object.
and there is a size attribute:
F.size is the same as the size of the fileBits Blob argument, which must be the immutable raw data of F.
Again, no path, just the read-only filename.
Thus:
- (elm_input||event.dataTransfer).files gives the FileList Object.
- (elm_input||event.dataTransfer).files.length gives the number of files.
- (elm_input||event.dataTransfer).files[0] is the first file selected.
- (elm_input||event.dataTransfer).files[0].name is the file-name of the first file selected (and this is the value that is returned from an input type="file").
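The list above can be wrapped in one small helper. This is only a sketch: listSelectedFiles is a hypothetical name, and it assumes it is handed either a file input element or a drop event, both of which expose an array-like files object as described.

```javascript
// Accepts a file input element or a drop event and returns a plain array
// of { name, size } records. Note what is missing: there is no path
// anywhere on a File object, only its name.
function listSelectedFiles(source) {
  // A drop event carries the files on event.dataTransfer,
  // an input element carries them directly.
  var holder = source.dataTransfer || source;
  var files = holder.files || [];
  // FileList is array-like, so Array.from can walk it.
  return Array.from(files, function (f) {
    return { name: f.name, size: f.size };
  });
}
```

In a page this might be called from a change handler as listSelectedFiles(this) or from a drop handler as listSelectedFiles(e). The result is a snapshot for display only; it cannot be fed back into a file input.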
What about this 'URL scheme for use with binary data such as files, so that they can be referenced within web applications', surely that can hold a private reference to a file that a user selected?
From the File API - A URL for Blob and File reference we can learn that:
This specification defines a scheme with URLs of the sort:
blob:550e8400-e29b-41d4-a716-446655440000#aboutABBA.
These are stored in a URL store (and browsers should even have their own mini HTTP-server aboard so one can use these urls in css, img src and even XMLHttpRequest).
One can create those Blob URLs with:
- var myBlobURL=window.URL.createFor(object); returns a Blob URL that is automatically revoked after its first use.
- var myBlobURL=window.URL.createObjectURL(object, flag_oneTimeOnly); returns a re-usable Blob URL (unless the flag_oneTimeOnly evaluates to true) and can be revoked with window.URL.revokeObjectURL(myBlobURL).
Bingo you might think... however... the URL Store is only maintained during a session (so it will survive a page-refresh, since it is still the same session) and lost when the document is unloaded.
From the MDN - Using object URLs:
The object URL is a string identifying the File object. Each time you call window.URL.createObjectURL(), a unique object URL is created, even if you've created an object URL for that file already. Each of these must be released. While they are released automatically when the document is unloaded, if your page uses them dynamically, you should release them explicitly by calling window.URL.revokeObjectURL()
That means that even when you store the Blob URL string in a cookie or persistent local storage, that string would be useless in a new session!
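Since the URL string is only good while the page that created it is alive, the sensible pattern is to create it, use it, and release it straight away. Below is a minimal sketch of that, assuming the one-argument form of URL.createObjectURL that current browsers expose. The name withObjectURL and the urlApi parameter are hypothetical; the parameter exists only so the helper can be exercised without a browser.

```javascript
// Create a Blob URL, hand it to a callback, and always revoke it afterwards.
// urlApi defaults to the global URL object; passing a stand-in is an
// assumption made here purely to keep the helper testable.
function withObjectURL(fileOrBlob, use, urlApi) {
  urlApi = urlApi || globalThis.URL;
  var blobURL = urlApi.createObjectURL(fileOrBlob);
  try {
    return use(blobURL);
  } finally {
    // Release the entry in the URL store even if the callback throws.
    urlApi.revokeObjectURL(blobURL);
  }
}
```

A typical call would be withObjectURL(elm_input.files[0], function (u) { previewImage.src = u; }), where previewImage is whatever element you want to point at the file. For an image you would normally delay the revoke until it has loaded, so treat this as a shape rather than a finished utility. Writing the URL string into a cookie instead gains nothing, for the reason given above.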
That should bring us to a full circle and the final conclusion:
It is not possible to (re-)populate an input-field or user-selected file (that is not in the browser's sandboxed 'Local storage' area).
(Unless you force your users to use an outdated version of Opera, or force your users to use IE and some activeX coding/modules (implementing a custom file-picker), etc.)
Some further reading:
http://www.cs.tut.fi/~jkorpela/forms/file.html
https://developer.mozilla.org/en-US/docs/Using_files_from_web_applications
http://www.html5rocks.com/en/tutorials/file/filesystem/
http://www.html5rocks.com/en/tutorials/file/dndfiles/
http://caniuse.com/filereader
JavaScript: The Definitive Guide - David Flanagan, chapter-22: The filesystem api
How to save the window.URL.createObjectURL() result for future use?
How long does a Blob persist?
How to resolve the C:\fakepath?
Create an input field on your form. When the user selects a file, copy the result to this field, something like:
jQuery('#inFile').change(
function(){ jQuery('#inCopy').val( jQuery('#inFile').val() ); }
);
Actually, the result is not copied exactly; instead it copies "C:\fakepath\SELECTED_FILE_NAME". While you are not allowed to set the value of a file input, you can set the value of the text input field: when the server renders the form, it can fill that field with the previous file name, without the "C:\fakepath\" prefix.
Now, when the server gets the form back, check the text input field. If it starts with "C:\fakepath\" then the user must have selected a new file, so upload their new selection. If it does not, then the user has opted for the previous selection, which should not be a problem since, according to the original question, the previous choice has been uploaded before and SHOULD (at least with appropriate programming, it COULD) still be on the server.
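The server-side check could look roughly like this. It is a sketch only: it assumes the server is also written in JavaScript, that the text field arrives as a plain string, and that the browser reports the HTML5 "fakepath" form (older browsers that send a bare filename or a real path would need extra handling). resolveUploadChoice and its return shape are invented for illustration.

```javascript
// Matches the HTML5 fake prefix; the forward-slash variant is accepted
// too, in case a client normalised the separators.
var FAKEPATH = /^C:[\\/]fakepath[\\/]/i;

// inCopyValue: what came back in the text field (#inCopy above).
// previousFileName: the name the server pre-filled when rendering the form.
function resolveUploadChoice(inCopyValue, previousFileName) {
  var value = typeof inCopyValue === 'string' ? inCopyValue : '';
  if (FAKEPATH.test(value)) {
    // The user picked a file this time: take the fresh upload.
    return { action: 'use-new-upload', fileName: value.replace(FAKEPATH, '') };
  }
  if (value !== '' && value === previousFileName) {
    // Field untouched: fall back to the copy already on the server.
    return { action: 'reuse-previous', fileName: previousFileName };
  }
  // Extra case not covered by the description above: nothing usable came back.
  return { action: 'nothing-selected', fileName: '' };
}
```

For example, resolveUploadChoice('C:\\fakepath\\new.csv', 'old.csv') yields a 'use-new-upload' result for new.csv, while resolveUploadChoice('old.csv', 'old.csv') yields 'reuse-previous'.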