Copy only files from a list into a new directory from a large directory structure
I have a huge collection of flower pictures, since I'm preparing a book. I'll use only some of them, listed in a file called chosen.txt (actually a CSV). I have to pick out the chosen ones from a folder tree organized this way:
Pictures
├── Flowers 2020
│ ├── 1.- Spring 2020
│ ├── 2.- Summer 2020
│ ├── 3.- Autumn 2020
│ └── 4.- Winter 2020
└── Flowers 2021
  ├── 1.- Spring 2021
  ├── 2.- Summer 2021
  ├── 3.- Autumn 2021
  └── 4.- Winter 2021
10 directories, 5260 files
and copy every chosen flower picture to a new folder called Chosen-ones (for example). I hope I've explained clearly what I need.
Below are some lines of the CSV file chosen.txt:
DSC10233.jpg
DSC10276.jpg
DSC10288.jpg
DSC10399.jpg
DSC10448.jpg
DSC10489.jpg
DSC10492.jpg
Is it possible using Bash or maybe a Python script?
Thanks in advance.
Sure, you can do this with a very small Bash script.
I'm assuming that you have the list chosen.txt in your home directory, of which Pictures is a subdirectory. If this isn't the case, please adjust the paths accordingly.
First make the directory to copy the files into. I'm assuming you've just opened a terminal and you're in your home directory. You can move the directory later.
mkdir ChosenOnes
Now check if you can find the correct files using your list like this:
while read -r line; do find Pictures -name "$line" -ls; done < chosen.txt
If the result looks correct, you can copy the files by adjusting the command:
while read -r line; do find Pictures -name "$line" -exec cp -vt ChosenOnes {} \; ; done < chosen.txt
We can make that look a bit better:
#!/bin/bash
# read our list line by line
while read -r line; do
    # find each listed file and copy it to the new directory
    find Pictures -name "$line" -exec cp -vt ChosenOnes {} \;
done < chosen.txt
Explanation
- while read -r line; do things; done < input-file
  A while loop keeps doing something as long as a condition holds. Here we ask for our list to be read line by line. Each line is put into the variable line so that we can run some command(s) on it. When we're done with our command(s) on that line, the next line is read, until we run out of lines in the file.
- find path -name "$line"
  The find command does a recursive search down from the given path (Pictures in our case). Here we use the -name option to find files matching the names in the list.
- -ls
  The find command has an option to list the files it has found. This is useful for checking what's been found before taking any action.
- -exec command {} \;
  The -exec option to find runs the given command on the files that have been found (represented by {}).
- cp -vt
  The -v option makes cp tell us what it's doing. The -t option specifies the destination (we give the destination directory immediately after it); otherwise the destination is assumed to be the last argument.
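To try the loop safely before pointing it at the real collection, the steps above can be sketched on a throwaway tree. This is only an illustration, not part of the answer itself: the temporary directory, the sample file names, and the missing.txt report are made up for the demo, and stripping carriage returns is an assumption in case the list was exported with Windows line endings. It relies on GNU find and GNU cp, as found on Ubuntu.

```shell
#!/bin/bash
# Build a throwaway tree (hypothetical names) so nothing real is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p "Pictures/Flowers 2020/1.- Spring 2020" \
         "Pictures/Flowers 2021/2.- Summer 2021" ChosenOnes
touch "Pictures/Flowers 2020/1.- Spring 2020/DSC10233.jpg" \
      "Pictures/Flowers 2021/2.- Summer 2021/DSC10276.jpg" \
      "Pictures/Flowers 2021/2.- Summer 2021/DSC99999.jpg"

# The list: two names that exist and one that does not, with CRLF endings.
printf 'DSC10233.jpg\r\nDSC10276.jpg\r\nDSC00000.jpg\r\n' > chosen.txt

: > missing.txt   # names from the list that were not found end up here

# Assumption: tr removes any Windows carriage returns before the loop sees them.
tr -d '\r' < chosen.txt | while IFS= read -r line; do
    [ -n "$line" ] || continue   # skip blank lines
    # -print -quit stops at the first match; empty output means "not found"
    if [ -n "$(find Pictures -name "$line" -print -quit)" ]; then
        find Pictures -name "$line" -exec cp -vt ChosenOnes {} \;
    else
        printf '%s\n' "$line" >> missing.txt
    fi
done
```

Afterwards ChosenOnes holds the two listed pictures, and missing.txt names the one entry that matched nothing, which makes typos in the list easy to spot.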
Yes, there are several ways to do this on the command line.
Assuming that you have only a list of file names (without the path), you can use a while loop to read your "chosen.txt" file, find to find the files in your source folder, and another while loop to copy them.
Defining variables lets you re-use the long command without having to edit it.
list="/path/to/chosen.txt"
source="/path/to/Pictures"
dest="/path/to/destination"
while IFS= read -r n; do
    find "$source" -name "$n" \
    | while IFS= read -r f; do
        cp "$f" "$dest/"
    done
done < "$list"
Or on a single line:
while IFS= read -r n; do find "$source" -name "$n" | while IFS= read -r f; do cp "$f" "$dest/"; done; done < "$list"
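The loops above start a fresh find, and so a fresh walk of the whole tree, for every name in the list, which can get slow with a long list. One possible alternative, sketched here under the assumption that GNU cp is available for the -t option, is to collect all the names into a single find expression and walk the tree once. The temporary directory and sample files are hypothetical and exist only to make the example runnable.

```shell
#!/bin/bash
# Hypothetical demo data in a temporary directory.
work=$(mktemp -d)
list="$work/chosen.txt"
source="$work/Pictures"
dest="$work/destination"
mkdir -p "$source/Flowers 2020/1.- Spring 2020" "$dest"
touch "$source/Flowers 2020/1.- Spring 2020/DSC10288.jpg" \
      "$source/Flowers 2020/1.- Spring 2020/DSC10399.jpg"
printf 'DSC10288.jpg\n' > "$list"

# Build "-name A -o -name B -o ..." in the positional parameters.
set --
while IFS= read -r n; do
    [ -n "$n" ] || continue                        # skip blank lines
    if [ "$#" -gt 0 ]; then set -- "$@" -o; fi     # join tests with OR
    set -- "$@" -name "$n"
done < "$list"

# One walk of the tree; "{} +" hands many files to each cp call.
if [ "$#" -gt 0 ]; then
    find "$source" -type f \( "$@" \) -exec cp -t "$dest" -- {} +
fi
```

Note that, as with the loops above, two pictures with the same name in different folders end up competing for one file name in the destination.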
An alternative, if your original .csv file also contains the path to the files, would be to build a list from that. Then you could use rsync with the --files-from= option to copy them while maintaining the original directory structure of the source.
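As a rough sketch of that rsync approach: the example below assumes a hypothetical list file, paths.txt, whose entries are paths relative to the source directory. The chosen.txt shown in the question holds bare file names, so this is not a drop-in replacement. The demo tree is created in a temporary directory, and the rsync call is skipped if rsync is not installed.

```shell
#!/bin/bash
# Hypothetical demo data in a temporary directory.
work=$(mktemp -d)
src="$work/Pictures"
dest="$work/destination"
mkdir -p "$src/Flowers 2021/1.- Spring 2021" "$dest"
touch "$src/Flowers 2021/1.- Spring 2021/DSC10448.jpg" \
      "$src/Flowers 2021/1.- Spring 2021/DSC10489.jpg"

# paths.txt (assumed name): one path per line, relative to $src.
printf '%s\n' 'Flowers 2021/1.- Spring 2021/DSC10448.jpg' > "$work/paths.txt"

if command -v rsync >/dev/null 2>&1; then
    # --files-from reads the list; the listed paths are recreated under $dest
    rsync -av --files-from="$work/paths.txt" "$src/" "$dest/"
else
    echo "rsync is not installed" >&2
fi
```

The copy lands at destination/Flowers 2021/1.- Spring 2021/DSC10448.jpg, so pictures with the same name in different season folders no longer collide.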