Get modification time of remote file over HTTP in Bash script

I'm creating a simple Bash script to extract the file modification time/date of a remote file via HTTP.

Example file: http://example.com/bar/example.pdf

Can this be done without downloading the actual file? If not, what's the best alternative?


Solution 1:

To be honest, not directly.

You will have to fetch data from the remote site to get information about the file. Usually this is done with a HEAD request, which asks for the headers only, but some servers do not implement it correctly and deliver the whole file, just as they would for a GET request. Assuming that you have curl installed:

curl -s -v -X HEAD http://foo.com/bar/baz.pdf 2>&1 | grep '^< Last-Modified:'

might give you what you want, but as noted above, the result depends heavily on the server.

Solution 2:

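The server response usually includes a Last-Modified header, which you can check without downloading the file. There is no need for -X HEAD; curl has a dedicated option, -I (--head), for that. With -X HEAD, curl only changes the method name in the request and may still wait for a response body that never arrives. The -s option suppresses progress output: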

curl -sI http://example.com/bar/example.pdf | grep -i Last-Modified
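
If you need the value on its own inside a script, rather than the whole header line, something along these lines might work. It is only a sketch: the function names are made up for illustration, -L is added on the assumption that you want to follow redirects, and the epoch conversion assumes GNU date (its -d option parses HTTP-style dates; BSD and BusyBox date behave differently).

```shell
#!/usr/bin/env bash
# Sketch: read HTTP response headers on stdin and print the Last-Modified value.
# The function names are illustrative and not part of curl.
last_modified_from_headers() {
    # HTTP header lines end in CR LF, so strip the CR first.
    tr -d '\r' | awk '
        tolower($0) ~ /^last-modified:/ {
            sub(/^[^:]*:[ \t]*/, "")   # drop the header name and leading blanks
            value = $0                 # keep the last match (final response after redirects)
        }
        END { if (value != "") print value; else exit 1 }
    '
}

# Print the remote modification time as seconds since the Unix epoch.
# Assumes GNU date; returns non-zero if the server sent no Last-Modified header.
remote_mtime_epoch() {
    local header
    header=$(curl -sIL "$1" | last_modified_from_headers) || return 1
    date -u -d "$header" +%s
}
```

Calling remote_mtime_epoch http://example.com/bar/example.pdf would then print a single number, or fail if the header is missing. Header names are matched case-insensitively because HTTP/2 responses show them in lower case.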

Also, in my case curl is not installed (I'm writing a script for an embedded device), only wget. The wget equivalent is:

wget --server-response --spider http://example.com/bar/example.pdf 2>&1 | grep -i Last-Modified

The --server-response option prints the response headers, and --spider tells wget not to download the page but only to check that it exists.
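
To get just the date string out of the wget output, a small wrapper like the following might do. Treat it as a rough sketch: the function names are hypothetical, it assumes your wget build supports both options used above, and it assumes tr and awk are available on the device.

```shell
#!/usr/bin/env bash
# Sketch: extract the Last-Modified value from `wget --server-response` output.
# wget indents each header line, so leading blanks are allowed in the match.
parse_wget_last_modified() {
    tr -d '\r' | awk '
        tolower($0) ~ /^[ \t]*last-modified:/ {
            sub(/^[^:]*:[ \t]*/, "")   # drop indentation, header name and blanks
            value = $0                 # keep the last match (final response after redirects)
        }
        END { if (value != "") print value; else exit 1 }
    '
}

# wget writes the headers to stderr, hence the 2>&1 before the pipe.
# Returns non-zero if no Last-Modified header was found.
wget_last_modified() {
    wget --server-response --spider "$1" 2>&1 | parse_wget_last_modified
}
```

Anchoring the pattern at the start of the line and requiring the colon keeps it from picking up other wget messages that merely mention the header name, which a plain grep -i Last-Modified would also print.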