Shell scripting: Select folder based on part of file name
My project
I'm creating a bash shell script to execute from the Terminal. Its purpose is to archive lots and lots of project folders. Each folder follows a prescribed nomenclature: [YYYY.MM.DD] - Medium - Client - Project name - details--details - JobNumber. For example: [2006.02.01] - Print - Development - Appeal I - Kids Art Show Insert - D0601-11. These projects are currently in one folder. I want to sort them into folders by Client name. There are 7 (internal) clients, so I'm using the following shell script:
#!/bin/bash
# Go to the Completed Projects folder.
cd /Volumes/communications/Projects/Completed\ Projects/
# Find a folder with a specified string (e.g. "Academics") in its name.
# Move (not copy) the folder to its corresponding sub-folder of the Archived Projects folder. (e.g. /Academics)
for folder in *; do
    if [[ -d "$folder" ]]; then
        if [[ "$folder" == *Academics* ]]; then
            echo "Archiving $folder to Archived Projects → Academics..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Academics/
        elif [[ "$folder" == *Admissions* ]]; then
            echo "Archiving $folder to Archived Projects → Admissions..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Admissions/
        elif [[ "$folder" == *Alumni* ]]; then
            echo "Archiving $folder to Archived Projects → Alumni..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Alumni/
        elif [[ "$folder" == *Communications* ]]; then
            echo "Archiving $folder to Archived Projects → Communications..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Communications/
        elif [[ "$folder" == *Development* ]]; then
            echo "Archiving $folder to Archived Projects → Development..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Development/
        elif [[ "$folder" == *President* ]]; then
            echo "Archiving $folder to Archived Projects → President..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/President/
        elif [[ "$folder" == *Student\ Life* ]]; then
            echo "Archiving $folder to Archived Projects → Student Life..."
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Student\ Life/
        else
            # Folders that don't match the pattern prompt the user to move them by hand.
            echo "$folder does not have a Department name. Move it by hand."
        fi
    fi
done
My problem
My script would mis-parse and mis-file a project named [2006.03.01] - Print - Development - Academics and Accreditation - D0601-08. It would read "Academics" before it ever got to the conditional for the client "Development". As a result, it would be filed into "Academics". And I'd have to pick it back out by hand!
My system's advantage
My colleagues and I have been scrupulous about our nomenclature (described above). I know that the Client name falls in between the 2nd and 3rd hyphens.
My question
How can I leverage my system's advantage to solve my problem? I want this script to match only the part of the folder name that comes after the first two hyphens and before the third hyphen, i.e., I only want this script to search the Client "field" in the folder name. I keep thinking "regular expressions" but have no idea how to implement them.
Note: I prefer for a solution to augment my current script, rather than replace it. I arrived at it via @patrix on this site and his idea circumvented some errors.
There are several ways to get this done in bash and friends (you could really knock yourself out using sed or awk). A rather simple way is to use cut to get the name of the folder:
if [[ -d "$folder" ]]; then
    target=$(echo $(echo "$folder" | cut -d- -f 3))
    echo "Archiving $folder to Archived Projects → $target..."
    # Quote $target so that a client name containing a space (e.g. "Student Life") stays one argument.
    mv "$folder" /Volumes/communications/Projects/Archived\ Projects/"$target"/
fi
The $(echo $(echo ... )) is a lazy approach to get rid of the leading/trailing space (because cut doesn't support multi-char delimiters).
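If the cut pipeline and the nested echo feel roundabout, the shell's own parameter expansion can split on the full " - " separator, so there is no stray space to trim. This is only a sketch: the function name client_field is made up for illustration, and it assumes every folder name follows the nomenclature from the question.

```bash
#!/bin/bash
# Hypothetical helper (not from the original posts): print the Client field,
# i.e. the third " - "-separated field of a folder name.
client_field() {
    local rest="$1"
    rest=${rest#* - }              # drop the leading "[YYYY.MM.DD] - "
    rest=${rest#* - }              # drop "Medium - "
    printf '%s\n' "${rest%% - *}"  # keep everything before the next " - "
}

# Prints: Development
client_field "[2006.03.01] - Print - Development - Academics and Accreditation - D0601-08"
```

Inside the loop, target=$(client_field "$folder") could stand in for the cut line. Note that a name with fewer than three fields comes back partly or wholly unchanged, so a sanity check on the result is probably still worthwhile.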
If you want to knock yourself out with sed, you can use
target=$(echo "$folder" | sed -n 's/^[^\-]*-[^\-]*- \([^\-]*\) -.*/\1/p')
instead of cut. This only works if the target folder name doesn't contain a - itself.
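Since the question mentions regular expressions: bash can also match a regex itself with [[ ... =~ ... ]] and expose the captured group in BASH_REMATCH, which avoids the sed subprocess. The sketch below is an assumption-laden illustration, not a drop-in: the function name client_from_regex is hypothetical, and the pattern assumes the date is written as [digits and dots] and that neither the Medium nor the Client contains a hyphen (the same limitation as the sed version).

```bash
#!/bin/bash
# Hypothetical helper (not from the original posts): extract the Client field
# with bash's built-in regex matching instead of sed.
client_from_regex() {
    # Shape: "[date] - Medium - Client - ..."; the parentheses capture the Client.
    local re='^\[[0-9.]+\] - [^-]+ - ([^-]+) - '
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1   # the name does not follow the nomenclature
    fi
}
```

A non-zero return status gives the loop a natural place to report folders that have to be moved by hand, e.g. if target=$(client_from_regex "$folder"); then ... else ... fi.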
Instead of pattern matching you could also use a shell function to encapsulate most of the complexity.
#!/bin/bash
function checkAndMove() {
    # $1: folder name, $2: client name to look for
    if [[ "$1" == *"$2"* ]]; then
        echo "Archiving $1 to Archived Projects → $2..."
        mv "$1" /Volumes/communications/Projects/Archived\ Projects/"$2"/
    fi
}
cd /Volumes/communications/Projects/Completed\ Projects/
for folder in *; do
    if [[ -d "$folder" ]]; then
        checkAndMove "$folder" Academics
        checkAndMove "$folder" Admissions
        ...
    fi
done
How about using awk with the field separator option -F to split the folder name on the hyphens, and then taking the third field?
UPDATE
I have updated the code to use the result returned from awk as the destination folder, which saves a lot of code. It also now uses the separator " - ", as Ian C pointed out in the comments.
#!/bin/bash
# Go to the Completed Projects folder.
cd /Volumes/communications/Projects/Completed\ Projects/
# Find a folder with a specified string (e.g. "Academics") in its name.
# Move (not copy) the folder to its corresponding sub-folder of the Archived Projects folder. (e.g. /Academics)
for folder in *; do
    if [[ -d "$folder" ]]; then
        thirdfield=$(echo "$folder" | /usr/bin/awk -F ' - ' '{print $3}')
        echo "Archiving $folder to Archived Projects → $thirdfield..."
        mv "$folder" /Volumes/communications/Projects/Archived\ Projects/"$thirdfield"/"$folder"
    fi
done
I have also added /"$folder" at the end of the move so the folder itself is moved. You can change this, if that's not what you want, by removing the "$folder" from the end of the mv command.
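Either form of the mv command assumes the client folder already exists under Archived Projects; as far as I can tell, mv fails if it doesn't. A minimal sketch of one way to guard against that is to create the destination first. The function name archive_folder and its three arguments are hypothetical, chosen only for this illustration.

```bash
#!/bin/bash
# Hypothetical helper (not from the original posts): move one project folder
# into <archive_root>/<client>/, creating the client folder if it is missing.
archive_folder() {
    local folder="$1" archive_root="$2" client="$3"
    mkdir -p "$archive_root/$client" || return 1
    mv "$folder" "$archive_root/$client/"
}
```

In the loop this might be called as archive_folder "$folder" "/Volumes/communications/Projects/Archived Projects" "$thirdfield". Creating folders on the fly also means a mistyped client name silently gets its own folder, so it may be worth combining with a check against the known client names.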
You can also cross-check against an array of the 7 names so that only folders whose third field matches one of them are moved. (You can insert an else statement where needed.)
#!/bin/bash
# Go to the Completed Projects folder.
cd /Volumes/communications/Projects/Completed\ Projects/
# Find a folder with a specified string (e.g. "Academics") in its name.
# Move (not copy) the folder to its corresponding sub-folder of the Archived Projects folder. (e.g. /Academics)
# Array of names to check against
ArrayName=(Academics Admissions Alumni Communications Development President "Student Life")
for folder in *; do
    if [[ -d "$folder" ]]; then
        thirdfield=$(echo "$folder" | /usr/bin/awk -F ' - ' '{print $3}')
        for var in "${ArrayName[@]}"; do
            # Only move the folder if its key name exists in the array
            if [ "${var}" = "$thirdfield" ]; then
                echo "Archiving $folder to Archived Projects → $thirdfield..."
                mv "$folder" /Volumes/communications/Projects/Archived\ Projects/"$thirdfield"/"$folder"
            fi
        done
    fi
done
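For the "else" case, one possible shape is to wrap the lookup in a function that reports where a folder would go and returns a failure status when the third field is not a known client. This is a dry-run sketch under assumptions: the names clients and plan_move are hypothetical, the client list is taken from the question, and nothing is actually moved.

```bash
#!/bin/bash
# Hypothetical sketch (not from the original posts): decide where a folder
# would go, without moving anything.
clients=("Academics" "Admissions" "Alumni" "Communications" "Development" "President" "Student Life")

plan_move() {
    local folder="$1" field client
    # Third " - "-separated field of the folder name, as in the awk answer.
    field=$(printf '%s\n' "$folder" | awk -F ' - ' '{print $3}')
    for client in "${clients[@]}"; do
        if [ "$client" = "$field" ]; then
            echo "Archived Projects/$client/$folder"
            return 0
        fi
    done
    # The "else" case: the third field is not one of the known clients.
    echo "$folder does not have a known client name. Move it by hand." >&2
    return 1
}
```

Running plan_move over every folder before the real move gives a chance to spot names that break the nomenclature.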