Download files in batch with wget and modify each file with a Python script immediately after download
I want to download climate data from the CHELSA database.
One way to do so programmatically is to use wget, following their guidelines:
Download the file (envidatS3paths.txt), install wget, and then run the command:

```bash
wget --no-host-directories --force-directories --input-file=envidatS3paths.txt
```
However, for each file that is downloaded, I would like to perform an operation on it (basically, trimming the data, because the files are quite big).
I looked at the wget manual, but I could not find anything about running an intermediary script between downloads.
I could possibly run a second background command that finds any newly downloaded file and trims it, but I wonder whether the first approach would be more straightforward.
Solution 1:
You can run a loop over the input file and, for each URL, run `wget -O $new_file_name $url`.
Try something like this:

```bash
while read -r url; do
    wget -O "$(echo "$url" | sed 's/\//_/g').out" "$url"
done < envidatS3paths.txt
```
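Since the goal is to trim each file right after it arrives, the loop can simply chain a processing step onto each download. A sketch, where `trim.py` is a hypothetical placeholder for whatever trimming script you use:

```shell
# Derive a flat local filename from a URL by replacing slashes.
local_name() {
    echo "$1" | sed 's/\//_/g'
}

# Download each URL from the list file, then trim it immediately.
# trim.py is a placeholder for your own trimming step.
download_and_trim() {
    while read -r url; do
        out="$(local_name "$url").out"
        wget -O "$out" "$url" && python trim.py "$out"
    done < "$1"
}

# usage: download_and_trim envidatS3paths.txt
```

The `&&` ensures the trim only runs if the download succeeded, and each file is processed before the next download starts.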
Or in Python:

```python
import subprocess

with open('envidatS3paths.txt') as opened_file:
    for url in (line.strip() for line in opened_file):
        # -O names the local file after the last path component
        subprocess.run(['wget', '-O', url.rsplit('/', 1)[-1], url], check=True)
```
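Doing it in Python also makes it easy to call the trimming step in the same loop, so no separate watcher process is needed. A minimal sketch, where `trim` is a hypothetical placeholder for the actual cropping logic (e.g. with rasterio or GDAL for CHELSA's GeoTIFFs):

```python
import subprocess

def local_name(url):
    """Local filename for a URL: its last path component."""
    return url.rsplit('/', 1)[-1]

def trim(path):
    # Placeholder: replace with your actual trimming logic,
    # e.g. cropping the raster to your region of interest.
    pass

def download_and_trim(list_file):
    """Download each URL in list_file and trim it immediately."""
    with open(list_file) as f:
        for url in (line.strip() for line in f if line.strip()):
            filename = local_name(url)
            subprocess.run(['wget', '-O', filename, url], check=True)
            trim(filename)  # runs before the next download starts
```

`subprocess.run(..., check=True)` (unlike `Popen`) waits for each download to finish and raises if wget fails, so you never trim a half-downloaded file.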