Copy only files from a list into a new directory from a large directory structure

I have a huge collection of flower pictures because I'm preparing a book. I'll use only some of them; their file names are listed in a file called chosen.txt (actually a CSV). I need to pick out the chosen ones from a folder tree organized like this:

Pictures
├── Flowers 2020
│   ├── 1.- Spring 2020
│   ├── 2.- Summer 2020
│   ├── 3.- Autumn 2020
│   └── 4.- Winter 2020
└── Flowers 2021
    ├── 1.- Spring 2021
    ├── 2.- Summer 2021
    ├── 3.- Autumn 2021
    └── 4.- Winter 2021

10 directories, 5260 files

and copy every chosen flower picture to a new folder called Chosen-ones (for example). I hope I've explained clearly what I need.

Below are some lines of the CSV file chosen.txt:

DSC10233.jpg
DSC10276.jpg
DSC10288.jpg
DSC10399.jpg
DSC10448.jpg
DSC10489.jpg
DSC10492.jpg

Is it possible using Bash or maybe a Python script?

Thanks in advance.


Sure, you can do this with a very small Bash script.

I'm assuming that you have the list chosen.txt in your home directory, of which Pictures is a subdirectory. If this isn't the case, please adjust the paths accordingly.

First make the directory to copy the files into. I'm assuming you've just opened a terminal and you're in your home directory. You can move the directory later.

mkdir ChosenOnes

Now check if you can find the correct files using your list like this:

while read -r line; do find Pictures -name "$line" -ls; done < chosen.txt

If the result looks correct, you can copy the files by adjusting the command:

while read -r line; do find Pictures -name "$line" -exec cp -vt ChosenOnes {} \; ; done < chosen.txt

We can make that look a bit better:

#!/bin/bash
# read our list and
while read -r line; do
  # find the files in it and copy them to the new directory
  find Pictures -name "$line" -exec cp -vt ChosenOnes {} \;
done < chosen.txt

Explanation

  • while read -r line; do things; done < input-file: A while loop keeps doing something as long as a condition holds. Here we ask for our list to be read line by line. Each line is put into the variable line so that we can run some command(s) on it. When we're done with that line, the next line is read, until we run out of lines in the file.
  • find path -name "$line": The find command does a recursive search down from the given path (Pictures in our case). Here we use the -name test to find files whose names match the names in the list.
  • -ls: This find action lists the files that were found. It is useful for checking what's been found before taking any action.
  • -exec command {} \;: The -exec action runs the given command on each file that has been found (represented by {}); the \; marks the end of the command.
  • cp -vt: The -v option makes cp tell us what it's doing. The -t option specifies the destination directory (given immediately after it); without it, the destination is assumed to be the last argument.
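One thing the commands above don't tell you is whether a name in the list matched no file at all, or matched files in more than one season folder (in which case the copies would overwrite each other in ChosenOnes, because the file names are identical). Here is a minimal sketch of a pre-check for that. It assumes find and wc behave as on a typical Linux system and that the names contain no wildcard characters; check_matches is just a name I made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print every name in the list that does not match
# exactly one file below the given directory.
# Usage: check_matches LIST SOURCE_DIR
check_matches() (
  list=$1
  src=$2
  while IFS= read -r name; do
    [ -n "$name" ] || continue   # skip blank lines
    # count the regular files with this exact name
    count=$(find "$src" -type f -name "$name" | wc -l | tr -d ' ')
    [ "$count" -eq 1 ] || printf '%s: %s matches\n' "$name" "$count"
  done < "$list"
)
```

With the layout from the question you would run check_matches chosen.txt Pictures from your home directory. No output means every name matched exactly one file.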

Yes, there are several ways to do this on the command line.

Assuming that you have only a list of file names (without the path), you can use a while loop to read your "chosen.txt" file, find to find the files in your source folder, and another while loop to copy them.

Defining variables lets you re-use the long command without having to edit it.

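list="/path/to/chosen.txt"
source="/path/to/Pictures"
dest="/path/to/destination"
# read the file names from the list, one per line
cat "$list" \
| while IFS= read -r n; do
      # find every file with that name below the source folder
      find "$source" -name "$n" \
      | while IFS= read -r f; do
            # copy each match into the destination folder
            cp "$f" "$dest/"
        done
  done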

Or on a single line:

cat "$list" | while IFS= read -r n; do find "$source" -name "$n" | while IFS= read -r f; do cp "$f" "$dest/"; done; done
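Since the list is described as "actually a csv", it may have been saved with Windows line endings, in which case every name would carry an invisible trailing carriage return and find would match nothing. That is only a guess about your file, but here is a sketch of a more forgiving variant of the loop. It strips a trailing carriage return, skips blank lines, and warns about names it cannot find. It copies only the first match for each name (using -print -quit, which GNU and BSD find support), and copy_chosen is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: copy the first file found for each name in the list.
# Usage: copy_chosen LIST SOURCE_DIR DEST_DIR
copy_chosen() (
  list=$1
  src=$2
  dest=$3
  cr=$(printf '\r')              # a literal carriage return
  mkdir -p "$dest"
  # "|| [ -n "$name" ]" also handles a last line without a final newline
  while IFS= read -r name || [ -n "$name" ]; do
    name=${name%"$cr"}           # drop a trailing carriage return, if any
    [ -n "$name" ] || continue   # skip blank lines
    # stop searching at the first regular file with this name
    found=$(find "$src" -type f -name "$name" -print -quit)
    if [ -n "$found" ]; then
      cp -- "$found" "$dest/"
    else
      printf 'not found: %s\n' "$name" >&2
    fi
  done < "$list"
)
```

With the variables defined as above, you would call it as copy_chosen "$list" "$source" "$dest".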

An alternative, if your original .csv file also contains the paths to the files, would be to build a list from that. You could then use rsync with the --files-from= option to copy the files while preserving the original directory structure of the source.
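A rough sketch of that rsync variant, assuming the list holds one path per line, each relative to the source folder (for example "Flowers 2021/2.- Summer 2021/DSC10276.jpg"). The function name copy_with_paths is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: copy the files named in a list of relative paths,
# recreating their sub-folders below the destination.
# Usage: copy_with_paths LIST SOURCE_DIR DEST_DIR
copy_with_paths() {
  mkdir -p "$3"
  # -a preserves timestamps and permissions, -v lists what is copied;
  # --files-from reads the paths (relative to the source) from the list
  rsync -av --files-from="$1" "$2/" "$3/"
}
```

With the variables from above this would be called as copy_with_paths "$list" "$source" "$dest", and a picture listed as "Flowers 2021/2.- Summer 2021/DSC10276.jpg" would end up under the same sub-path inside the destination.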