Download files in batch with wget and modify each file with a Python script immediately after download
I want to download climate data from the CHELSA database.
One way to do so programmatically is to use wget, following their guidelines:
Download the file (envidatS3paths.txt), install wget, and then run the command:

```bash
wget --no-host-directories --force-directories --input-file=envidatS3paths.txt
```
However, for each file that is downloaded, I would like to perform an operation on it (basically, trimming the data, because the files are quite big).
I looked at the wget manual, but I could not find anything about running an intermediary script between downloads.
I could possibly run a second background command that finds any newly downloaded file and trims it, but I wonder whether the first approach would be more straightforward.
Solution 1:
You can run a loop over the input file and, for each URL, run `wget -O $new_file_name $url`.
Try something like this:

```bash
while read -r url; do
    wget -O "$(echo "$url" | sed 's/\//_/g').out" "$url"
done < envidatS3paths.txt
```
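Since the goal is to trim each file right after it arrives, the loop can simply chain a processing step onto each download. A sketch, where `trim.py` is a hypothetical placeholder for whatever trimming script you use:

```shell
# Derive a flat local filename from a URL by replacing slashes.
local_name() {
    echo "$1" | sed 's/\//_/g'
}

# Download each URL from the list file, then trim it immediately.
# trim.py is a placeholder for your own trimming step.
download_and_trim() {
    while read -r url; do
        out="$(local_name "$url").out"
        wget -O "$out" "$url" && python trim.py "$out"
    done < "$1"
}

# usage: download_and_trim envidatS3paths.txt
```

The `&&` ensures the trim only runs if the download succeeded, and each file is processed before the next download starts.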
Or in Python:

```python
import subprocess

with open('envidatS3paths.txt') as opened_file:
    for url in (line.strip() for line in opened_file):
        # -O names the local file after the last path component
        subprocess.run(['wget', '-O', url.rsplit('/', 1)[-1], url], check=True)
```
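Doing it in Python also makes it easy to call the trimming step in the same loop, so no separate watcher process is needed. A minimal sketch, where `trim` is a hypothetical placeholder for the actual cropping logic (e.g. with rasterio or GDAL for CHELSA's GeoTIFFs):

```python
import subprocess

def local_name(url):
    """Local filename for a URL: its last path component."""
    return url.rsplit('/', 1)[-1]

def trim(path):
    # Placeholder: replace with your actual trimming logic,
    # e.g. cropping the raster to your region of interest.
    pass

def download_and_trim(list_file):
    """Download each URL in list_file and trim it immediately."""
    with open(list_file) as f:
        for url in (line.strip() for line in f if line.strip()):
            filename = local_name(url)
            subprocess.run(['wget', '-O', filename, url], check=True)
            trim(filename)  # runs before the next download starts
```

`subprocess.run(..., check=True)` (unlike `Popen`) waits for each download to finish and raises if wget fails, so you never trim a half-downloaded file.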