Shell scripting: Select folder based on part of file name

My project

I'm creating a bash shell script to execute from the Terminal. Its purpose is to archive lots and lots of project folders. Each folder follows a prescribed nomenclature: [YYYY.MM.DD] - Medium - Client - Project name - details--details - JobNumber. For example: [2006.02.01] - Print - Development - Appeal I - Kids Art Show Insert - D0601-11. These projects are currently all in one folder. I want to sort them into folders by Client name. There are 7 (internal) clients, so I'm using the following shell script:

#!/bin/bash

# Go to the Completed Projects folder.
cd /Volumes/communications/Projects/Completed\ Projects/

# Find a folder with a specified string (e.g. "Academics") in its name.
# Move (not copy) the folder to its corresponding sub-folder of the Archived Projects folder. (e.g. /Academics)

for folder in *; do
    if [[ -d "$folder" ]]; then
        if [[ "$folder" == *Academics* ]]; then
            echo "Archiving $folder to Archived Projects → Academics...";
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Academics/
        elif [[ "$folder" == *Admissions* ]]; then
            echo "Archiving $folder to Archived Projects → Admissions...";
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Admissions/
        elif [[ "$folder" == *Alumni* ]]; then
            echo "Archiving $folder to Archived Projects → Alumni...";
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Alumni/
        elif [[ "$folder" == *Communications* ]]; then
            echo "Archiving $folder to Archived Projects → Communications...";
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Communications/
        elif [[ "$folder" == *Development* ]]; then
            echo "Archiving $folder to Archived Projects → Development...";
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Development/
        elif [[ "$folder" == *President* ]]; then
            echo "Archiving $folder to Archived Projects → President...";
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/President/
        elif [[ "$folder" == *Student\ Life* ]]; then
            echo "Archiving $folder to Archived Projects → Student Life...";
            mv "$folder" /Volumes/communications/Projects/Archived\ Projects/Student\ Life/
        else # Folders that don't match any pattern prompt the user to move them by hand.
            echo "$folder does not have a Department name. Move it by hand."
        fi
    fi
done

My problem

My script would mis-parse and mis-file a project named [2006.03.01] - Print - Development - Academics and Accreditation - D0601-08. It would match "Academics" before it ever got to the conditional for the client "Development". As a result, the project would be filed into "Academics". And I'd have to pick it back out by hand!

My system's advantage

My colleagues and I have been scrupulous about our nomenclature (described above). I know that the Client name falls in between the 2nd and 3rd hyphens.

My question

How can I leverage my system's advantage to solve my problem? I want this script to match only the part of the folder name that comes after the first two hyphens and before the third hyphen, i.e., I only want this script to search the Client "field" in the folder name. I keep thinking "regular expressions" but have no idea how to implement them.

Note: I prefer for a solution to augment my current script, rather than replace it. I arrived at it via @patrix on this site and his idea circumvented some errors.


There are several ways to get this done in bash and friends (you could really knock yourself out using sed or awk). A rather simple way is to use cut to get the name of the target folder:

if [[ -d "$folder" ]]; then
    target=$(echo $(echo "$folder" | cut -d- -f 3))
    echo "Archiving $folder to Archived Projects → $target...";
    # Quote $target so client names containing a space (e.g. "Student Life") stay one argument.
    mv "$folder" /Volumes/communications/Projects/Archived\ Projects/"$target"/
fi

The $(echo $(echo ... )) is a lazy approach to get rid of the leading/trailing space (because cut doesn't support multi-char delimiters).
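
To see what the nested echo does, here is a small sketch using the sample folder name from the question (the variable names raw and trimmed are only for illustration). The inner substitution is left unquoted on purpose, so the shell splits the text on whitespace and echo re-joins the words with single spaces. One caveat: unquoted text is also subject to glob expansion, so this trick is probably only safe for names without wildcard characters.

```shell
#!/bin/bash
# Sample folder name taken from the question.
folder='[2006.02.01] - Print - Development - Appeal I - Kids Art Show Insert - D0601-11'

# cut splits on the bare hyphen, so the field keeps its surrounding spaces.
raw=$(echo "$folder" | cut -d- -f 3)

# The unquoted inner $(...) is word-split; the outer echo re-joins the words.
trimmed=$(echo $(echo "$folder" | cut -d- -f 3))

# Brackets make the leading/trailing spaces visible.
printf '[%s]\n' "$raw" "$trimmed"
# prints:
# [ Development ]
# [Development]
```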


If you want to knock yourself out with sed you can use

    target=$(echo "$folder" | sed -n 's/^[^\-]*-[^\-]*- \([^\-]*\) -.*/\1/p')

instead of cut. This only works if the target folder name doesn't contain a - itself.
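
If a client name might contain a hyphen of its own, one possible workaround is to split on the full " - " separator using plain bash parameter expansion. This is only a sketch: the function name client_of is made up for illustration, and it assumes every folder name really has at least three " - " separated fields.

```shell
#!/bin/bash
# client_of is a hypothetical helper, not part of the original script.
# It prints the third " - " separated field of the name passed as $1.
client_of() {
    local rest=${1#*" - "}          # drop the date field (shortest match up to the first " - ")
    rest=${rest#*" - "}             # drop the medium field
    printf '%s\n' "${rest%%" - "*}" # keep everything before the next " - "
}

# Sample folder name taken from the question.
client_of '[2006.03.01] - Print - Development - Academics and Accreditation - D0601-08'
# prints: Development
```

In the loop this could be used as target=$(client_of "$folder"). A name without any " - " separator comes back unchanged, so it is worth checking the result against the list of known clients before moving anything.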


Instead of repeating the pattern match for every client, you could also use a shell function to encapsulate most of the complexity.

#!/bin/bash

function checkAndMove() {
    # $1 is the folder name, $2 is the client name; quoting keeps names with spaces intact.
    if [[ "$1" == *"$2"* ]]; then
        echo "Archiving $1 to Archived Projects → $2...";
        mv "$1" /Volumes/communications/Projects/Archived\ Projects/"$2"/
    fi
}

cd /Volumes/communications/Projects/Completed\ Projects/

for folder in *; do
    if [[ -d "$folder" ]]; then
        checkAndMove "$folder" Academics
        checkAndMove "$folder" Admissions
        ...
    fi
done

How about using awk with the field separator option -F to split the folder name on the hyphen, and then taking the third field?

UPDATE

I have updated the code to use the result returned by awk to build the destination path, which saves a lot of code. I also switched to the separator " - ", as Ian C pointed out in the comments.

#!/bin/bash

# Go to the Completed Projects folder.
cd /Volumes/communications/Projects/Completed\ Projects/

# Find a folder with a specified string (e.g. "Academics") in its name.
# Move (not copy) the folder to its corresponding sub-folder of the Archived Projects folder. (e.g. /Academics)

for folder in *; do
    if [[ -d "$folder" ]]; then
        thirdfield=$(echo "$folder" | /usr/bin/awk -F ' - ' '{print $3}')
        echo "Archiving $folder to Archived Projects → $thirdfield...";
        mv "$folder" /Volumes/communications/Projects/Archived\ Projects/"$thirdfield"/"$folder"
    fi
done

I have also added /"$folder" at the end of the mv destination so the folder itself is moved. If that's not what you want, remove the "$folder" from the end of the mv command.
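
To illustrate why the " - " separator matters for the awk call, here is a small comparison using the problem folder name from the question (the variable names with_hyphen and with_padded are only for illustration):

```shell
#!/bin/bash
# Problem folder name taken from the question.
folder='[2006.03.01] - Print - Development - Academics and Accreditation - D0601-08'

# Splitting on a bare hyphen keeps the surrounding spaces in the field.
with_hyphen=$(printf '%s\n' "$folder" | awk -F '-' '{print $3}')

# Splitting on " - " yields the bare client name.
with_padded=$(printf '%s\n' "$folder" | awk -F ' - ' '{print $3}')

# Brackets make the leading/trailing spaces visible.
printf '[%s]\n' "$with_hyphen" "$with_padded"
# prints:
# [ Development ]
# [Development]
```

With the padded separator the result can be used directly as a directory name, and "Academics" in the project title no longer influences the match.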


You can also cross-check against an array of the 7 names, so that only folders whose client matches one of them are moved. (You can insert an else statement where needed.)

#!/bin/bash

# Go to the Completed Projects folder.
cd /Volumes/communications/Projects/Completed\ Projects/

# Find a folder with a specified string (e.g. "Academics") in its name.
# Move (not copy) the folder to its corresponding sub-folder of the Archived Projects folder. (e.g. /Academics)

# Array of names to check against
ArrayName=(Academics Admissions Alumni Communications Development President "Student Life")

for folder in *; do
    if [[ -d "$folder" ]]; then
        thirdfield=$(echo "$folder" | /usr/bin/awk -F ' - ' '{print $3}')

        for var in "${ArrayName[@]}"; do
            # Only move the folder if its client name exists in the array
            if [ "${var}" = "$thirdfield" ]; then
                echo "Archiving $folder to Archived Projects → $thirdfield...";
                mv "$folder" /Volumes/communications/Projects/Archived\ Projects/"$thirdfield"/"$folder"
            fi
        done
    fi
done
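
As a rough sketch of where such an else branch could go, the loop can remember whether the client was found and report the folder otherwise. Everything named here is hypothetical: the function archive_by_client, its two arguments (source and destination directories, standing in for the Completed Projects and Archived Projects paths), and the mkdir -p call, which is an added assumption so the move does not depend on the client sub-folder already existing.

```shell
#!/bin/bash
# Known client names; "Student Life" is quoted so it stays one array element.
clients=(Academics Admissions Alumni Communications Development President "Student Life")

# Hypothetical helper: archive_by_client SRC DEST
# Moves each project folder in SRC into DEST/<client>/, or reports it if the
# third " - " separated field is not a known client.
archive_by_client() {
    local src=$1 dest=$2 path folder client known matched
    for path in "$src"/*/; do
        [[ -d "$path" ]] || continue          # skip if the glob matched nothing
        folder=$(basename "$path")
        client=$(printf '%s\n' "$folder" | awk -F ' - ' '{print $3}')
        matched=no
        for known in "${clients[@]}"; do
            if [[ "$known" == "$client" ]]; then
                matched=yes
                break
            fi
        done
        if [[ $matched == yes ]]; then
            echo "Archiving $folder to Archived Projects → $client..."
            mkdir -p "$dest/$client"          # assumption: create the client folder if missing
            mv "$src/$folder" "$dest/$client/"
        else
            echo "$folder does not have a known client name. Move it by hand."
        fi
    done
}
```

It could then be called as archive_by_client "/Volumes/communications/Projects/Completed Projects" "/Volumes/communications/Projects/Archived Projects", ideally after a trial run on a copy of a few folders.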