Get modification time of remote file over HTTP in Bash script
I'm creating a simple Bash script to extract the file modification time/date of a remote file via HTTP.
Example file: http://example.com/bar/example.pdf
Can this be done without downloading the actual file? If not, what's the best alternative?
Solution 1:
To be honest, not directly.
You will have to fetch data from the remote site to get information about the file.
Usually this is done with a HEAD request, but some servers do not implement it correctly and deliver the whole file, just as they would for a GET request.
Assuming that you have curl installed:
curl -s -v -X HEAD http://foo.com/bar/baz.pdf 2>&1 | grep -i '^< Last-Modified:'
might give you what you want, but as noted above, it depends heavily on the server.
Solution 2:
The server response usually includes a Last-Modified field, which you can check without downloading the file. There is no need to use -X HEAD; curl has a dedicated option, -I, for that (the -s suppresses progress output):
curl -sI http://example.com/bar/example.pdf | grep -i Last-Modified
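For use inside a script, the one-liner above can be wrapped so that only the date value is returned. The following is a minimal sketch, not a definitive implementation: the function names extract_last_modified and remote_mtime are hypothetical, and adding -L (follow redirects) is an assumption about what is wanted. The parser matches the header name case-insensitively and strips the trailing carriage return that HTTP header lines carry.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read raw HTTP response headers on stdin and print the
# value of the last Last-Modified header seen (header name matched
# case-insensitively). Exits non-zero if no such header is present.
extract_last_modified() {
  tr -d '\r' | awk '
    tolower($0) ~ /^last-modified:/ {
      sub(/^[^:]*:[ \t]*/, "")   # drop the header name and leading blanks
      value = $0
    }
    END { if (value != "") print value; else exit 1 }
  '
}

# Hypothetical wrapper: send a HEAD request with curl (-I), silently (-s),
# following redirects (-L, an assumption), and print the header value.
remote_mtime() {
  curl -sIL "$1" | extract_last_modified
}

# Usage (requires network access):
#   remote_mtime http://example.com/bar/example.pdf
```

If a Unix timestamp is needed, GNU date can usually parse the printed value with date -d "$value" +%s; other date implementations (BSD, BusyBox) may not accept this format, so treat that step as an assumption about the target system.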
Also, in my case there is no curl installed (I'm writing a script for an embedded device), just wget. The way to do it with wget is:
wget --server-response --spider http://example.com/bar/example.pdf 2>&1 | grep -i Last-Modified
The --server-response option prints the headers, and the --spider option tells wget not to download pages, only to check that they exist.
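The wget variant can be wrapped in a function in much the same way. This is only a sketch under stated assumptions: parse_wget_headers and wget_last_modified are hypothetical names, and the parser assumes that wget prints the response headers indented on stderr, which is why leading whitespace is allowed and why 2>&1 is needed. It uses only sed and tail, which are typically available on small systems, though that should be checked on the target device.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read wget --server-response output on stdin and print
# the value of the last Last-Modified header (any letter case, optional
# leading indentation). Prints nothing if the header is absent.
parse_wget_headers() {
  tr -d '\r' |
    sed -n 's/^[[:space:]]*[Ll][Aa][Ss][Tt]-[Mm][Oo][Dd][Ii][Ff][Ii][Ee][Dd]:[[:space:]]*//p' |
    tail -n 1
}

# Hypothetical wrapper: ask wget for headers only (--spider) and parse them.
wget_last_modified() {
  wget --server-response --spider "$1" 2>&1 | parse_wget_headers
}

# Usage (requires network access):
#   mtime=$(wget_last_modified http://example.com/bar/example.pdf)
#   [ -n "$mtime" ] && echo "Last modified: $mtime"
```

Because the pipeline ends in tail, its exit status does not indicate whether the header was found, so the caller should test for an empty result as in the usage comment.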