How to force wget to overwrite an existing file ignoring timestamp?
Solution 1:
If you specify the output file using the -O
option it will overwrite any existing file.
For example:
wget -O index.html bbc.co.uk
Run multiple times will keep over-writting index.html.
Solution 2:
wget
doesn't let you overwrite an existing file unless you explicitly name the output file on the command line with option -O
.
I'm a bit lazy and I don't want to type the output file name on the command line when it is already known from the downloaded file. Therefore, I use curl like this:
curl -O http://ftp.vim.org/vim/runtime/spell/fr.utf-8.spl
Be careful when downloading files like this from unsafe sites. The above command will write a file named as the connected web site wishes to name it (inside the current directory though). The final name may be hidden through redirections and php scripts or be obfuscated in the URL. You might end up overwriting a file you don't want to overwrite.
And if you ever find a file named ls
or any other enticing name in the current directory after using curl
that way, refrain from executing the downloaded file. It may be a trojan downloaded from a rogue or corrupted web site!
Solution 3:
wget --backups=1 google.com
renames original file with .1
suffix and writes new file to the intended filename.
Not exactly what was requested, but could be handy in some cases.
Solution 4:
-c
or --continue
From the manual:
If you use ‘-c’ on a non-empty file, and the server does not support continued downloading, Wget will restart the download from scratch and overwrite the existing file entirely.
Solution 5:
I like the -c option. I started with the man page then the web but I've searched for this several times. Like if you're relaying a webcam so the image needs to always be named image.jpg. Seems like it should be more clear in the man page.
I've been using this for a couple years to download things in the background, sometimes combined with "limit-rate = " in my wgetrc file
while true
do
wget -c -i url.txt && break
echo "Restarting wget"
sleep 2
done
Make a little file called url.txt and paste the file's URL into it. Set this script up in your path or maybe as an alias and run it. It keeps retrying the download until there's no error. Sometimes at the end it gets into a loop displaying
416 Requested Range Not Satisfiable
The file is already fully retrieved; nothing to do.
but that's harmless, just ctrl-c it. I think it's always gotten the file I wanted even if wget runs out of retries or the connection temporarily goes away. I've downloaded things for days at a time with it. A CD image on dialup, yes, always with wget.