BASH script: Downloading consecutive numbered files with wget
I have a web server that saves the log files of a web application with numbered names. An example file name would be:
dbsclog01s001.log
dbsclog01s002.log
dbsclog01s003.log
The last 3 digits are a counter, and it can sometimes go up to 100.
I usually open a web browser, browse to the file like:
http://someaddress.com/logs/dbsclog01s001.log
and save the files. This of course gets a bit annoying when you get 50 logs. I tried to come up with a BASH script that uses wget and passes
http://someaddress.com/logs/dbsclog01s*.log
but I am having problems with my script. Does anyone have a sample of how to do this?
thanks!
#!/bin/sh
# Usage: seq_wget url_format seq_start seq_end [wget_args...]
if [ $# -lt 3 ]; then
  echo "Usage: $0 url_format seq_start seq_end [wget_args]" >&2
  exit 1
fi
url_format=$1
seq_start=$2
seq_end=$3
shift 3
# Expand the printf format once per number in the sequence, then feed
# the resulting URL list to wget on stdin (-i-). Any remaining
# arguments are passed through to wget.
printf "$url_format\n" $(seq "$seq_start" "$seq_end") | wget -i- "$@"
Save the above as seq_wget, give it execute permission (chmod +x seq_wget), and then run, for example:
$ ./seq_wget http://someaddress.com/logs/dbsclog01s%03d.log 1 50
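If you want to see the URL list the script feeds to wget, you can run the printf/seq part on its own, with no download involved:

```shell
# %03d zero-pads the counter to three digits, matching names
# like dbsclog01s001.log
printf 'http://someaddress.com/logs/dbsclog01s%03d.log\n' $(seq 1 3)
# http://someaddress.com/logs/dbsclog01s001.log
# http://someaddress.com/logs/dbsclog01s002.log
# http://someaddress.com/logs/dbsclog01s003.log
```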
Or, if you have Bash 4.0, you could just type
$ wget http://someaddress.com/logs/dbsclog01s{001..050}.log
Or, if you have curl instead of wget, you could follow Dennis Williamson's answer.
curl seems to support ranges. From the man page:
URL    The URL syntax is protocol dependent. You'll find a detailed
       description in RFC 3986.

       You can specify multiple URLs or parts of URLs by writing part sets
       within braces as in:

         http://site.{one,two,three}.com

       or you can get sequences of alphanumeric series by using [] as in:

         ftp://ftp.numericals.com/file[1-100].txt
         ftp://ftp.numericals.com/file[001-100].txt   (with leading zeros)
         ftp://ftp.letters.com/file[a-z].txt

       No nesting of the sequences is supported at the moment, but you can
       use several ones next to each other:

         http://any.org/archive[1996-1999]/vol[1-4]/part{a,b,c}.html

       You can specify any amount of URLs on the command line. They will be
       fetched in a sequential manner in the specified order.

       Since curl 7.15.1 you can also specify step counter for the ranges,
       so that you can get every Nth number or letter:

         http://www.numericals.com/file[1-100:10].txt
         http://www.letters.com/file[a-z:2].txt
You may have noticed that it says "with leading zeros"!
You can use echo-style brace sequences in the wget URL to download a run of numbered files...
wget http://someaddress.com/logs/dbsclog01s00{1..3}.log
This also works with letters: {a..z} and {A..Z}.
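You can check what the shell will expand a brace sequence to by passing it to echo first; the expansion happens in the shell, before wget ever runs:

```shell
# Numeric sequence with a fixed zero prefix
echo dbsclog01s00{1..3}.log
# dbsclog01s001.log dbsclog01s002.log dbsclog01s003.log

# Letter sequence (file_a.log through file_c.log is assumed here
# just for illustration)
echo file_{a..c}.log
# file_a.log file_b.log file_c.log
```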
Not sure precisely what problems you were experiencing, but it sounds like a simple for loop in bash would do it for you.
# seq -f "%03g" zero-pads the counter so the URL matches names
# like dbsclog01s001.log (plain {1..999} would produce 1, 2, ...)
for i in $(seq -f "%03g" 1 999); do
  wget http://someaddress.com/logs/dbsclog01s$i.log -O your_local_output_dir_$i;
done
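One caveat with a bare {1..999} loop: it counts 1, 2, 3, ... without leading zeros, so the URLs won't match names like dbsclog01s001.log. GNU seq can generate zero-padded counters with a printf-style format (assuming your seq supports -f, as GNU coreutils does):

```shell
# Generate three-digit, zero-padded counters for use in the loop
seq -f "%03g" 1 3
# 001
# 002
# 003
```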