How to download a list of files from a file server?
Solution 1:
You can specify which file extensions wget will download when crawling pages:
wget -r -A zip,rpm,tar.gz www.site.com/startpage.html
This performs a recursive crawl and downloads only files with the .zip, .rpm, and .tar.gz extensions.
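The -A/--accept list is matched against filename suffixes. As a rough illustration of what that matching amounts to (a sketch of my own, not wget's actual code), consider:

```shell
#!/bin/sh
# Sketch of how an accept list like "zip,rpm,tar.gz" is applied:
# a filename is kept if it ends in one of the comma-separated suffixes.
# (Illustration only -- wget's real -A matching also supports wildcards.)
accepts() (
    name=$1 list=$2
    IFS=','                        # split the accept list on commas
    for suffix in $list; do
        case $name in
            *"$suffix") exit 0 ;;  # suffix matches: would be downloaded
        esac
    done
    exit 1                         # no suffix matched: would be skipped
)

for f in httpd-2.4.58.tar.gz README.html pkg.rpm tool.zip; do
    accepts "$f" "zip,rpm,tar.gz" && echo "download $f" || echo "skip $f"
done
```

In this sketch only README.html is skipped; the other three names end in an accepted suffix.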
Solution 2:
Supposing you really just want a list of the files on the server without fetching them (yet):
%> wget -r -np --spider http://www.apache.org/dist/httpd/binaries/ 2>&1 | awk -f filter.awk | uniq
where 'filter.awk' looks like this:
/^--.*-- http:\/\/.*[^\/]$/ { u=$3; }
/^Length: [[:digit:]]+/ { print u; }
Then you may have to filter out some entries like
"http://www.apache.org/dist/httpd/binaries/?C=N;O=D"
which are the column-sorting links of the server's directory index.
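Because filter.awk only pattern-matches wget's log lines, you can sanity-check it offline against a captured snippet. The sample log below is fabricated to match the patterns the script expects; the exact layout of wget's output varies between versions, so check it against a real run of your own wget first:

```shell
#!/bin/sh
# Recreate filter.awk from the answer above.
cat > filter.awk <<'EOF'
/^--.*-- http:\/\/.*[^\/]$/ { u=$3; }
/^Length: [[:digit:]]+/ { print u; }
EOF

# Fabricated --spider log: one file URL, then a directory URL
# (the latter ends in "/", so the first rule leaves u unchanged).
cat > spider.log <<'EOF'
--2015-01-01 12:00:00-- http://www.apache.org/dist/httpd/binaries/file.zip
Length: 123456
--2015-01-01 12:00:01-- http://www.apache.org/dist/httpd/binaries/
Length: 4567
EOF

# Both Length lines print u, but uniq collapses the duplicate,
# leaving only the file URL.
awk -f filter.awk spider.log | uniq
```

The sort links mentioned above can then be dropped with an extra `grep -v '?C='` at the end of the pipeline.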