How to remove duplicate dirs from $PATH?

In some of my terminal windows I have duplicate entries in the PATH variable; something like this:

PATH=/a/b:/c/d:/a/b:/c/d:/e/f:/a/b

I guess the culprits are lines like this one in some of my scripts:

PATH=/a/b:$PATH

After sourcing this and that and this again, PATH becomes very long. This is the question:

Is there a bash command to clean up PATH and similar env variables? It ought to be a bash script because one cannot execute a utility and expect it to change the environment of the calling shell.

In the above example the cleaned up PATH should look like this:

PATH=/a/b:/c/d:/e/f

Solution 1:

It's better not to create the duplicates than to try and remove them afterwards. This is easily avoided with the technique I use in my .bashrc for adding my private bin/ directory:

[ "${PATH#*$HOME/bin:}" == "$PATH" ] && export PATH="$HOME/bin:$PATH"

I did this at a time when I was making updates to .bashrc and I wanted to rerun it without restarting the shell.

If you want to add a directory to the end of $PATH you need to use a leading colon:

[ "${PATH#*:$HOME/bin}" == "$PATH" ] && export PATH="$PATH:$HOME/bin"
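The two one-liners above use a substring match, so they can miss $HOME/bin when it sits at the very end of PATH (prepend form) or at the very start (append form), and they can be fooled by a longer directory name that happens to contain the same text. A minimal sketch of the same idea that compares whole entries by wrapping PATH in colons; the function names path_prepend and path_append are made up for illustration and are not part of the original answer:

```shell
# Hypothetical helpers (names are illustrative, not from the answer above).
# Prepend a directory to PATH unless it is already a complete entry.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                      # already present as a whole entry
        *) PATH="$1${PATH:+:$PATH}" ;;    # no stray colon if PATH was empty
    esac
}

# Append a directory to PATH unless it is already a complete entry.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;
        *) PATH="${PATH:+$PATH:}$1" ;;
    esac
}
```

In .bashrc this would be called as path_prepend "$HOME/bin", and running it again after re-sourcing the file leaves PATH unchanged.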

You can use parameter expansion to step through PATH and remove duplicates, but it would be a bit complex, and you would need to decide which occurrence of each duplicate should be kept. Something along the lines of:

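OLDPATH="$PATH"; NEWPATH=""; colon=""
# loop while OLDPATH still contains at least one colon
while [ "${OLDPATH#*:}" != "$OLDPATH" ]
do  entry="${OLDPATH%%:*}"; search=":${OLDPATH#*:}:"
    # keep the entry only if it does not appear again later in the list
    [ "${search#*:$entry:}" == "$search" ] && NEWPATH="$NEWPATH$colon$entry" && colon=:
    OLDPATH="${OLDPATH#*:}"
done
# append the final entry; $colon is still empty if nothing was kept, which avoids a leading colon
NEWPATH="$NEWPATH$colon$OLDPATH"
export PATH="$NEWPATH"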

I wrote this on the fly and have since tested it, so most errors should be gone, and it should be an adequate guide to what you would need to do. It keeps the last occurrence of any duplicates, which is where they would be if you used my script to avoid duplicates in the first place. In a script it would of course need to be run with the . (or source) command, so that it changes PATH in the current shell rather than in a child process.
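If you would rather keep the first occurrence of each directory, as in the cleaned-up example in the question, the same stepping-through idea can be sketched as a function. This is only a sketch: the name dedupe_path is made up, and it assumes the list contains no empty entries (a leading empty entry would be dropped).

```shell
# Hypothetical helper: print a colon-separated list with duplicates removed,
# keeping the first occurrence of each entry.
# Assumption: the list has no empty entries.
dedupe_path() {
    local old="$1:" new="" entry
    while [ -n "$old" ]; do
        entry="${old%%:*}"       # first remaining entry
        old="${old#*:}"          # drop it (and its colon) from the input
        case ":$new:" in
            *":$entry:"*) ;;     # already kept, skip it
            *) new="${new:+$new:}$entry" ;;
        esac
    done
    printf '%s\n' "$new"
}
```

It only prints the result, so the calling shell has to do the assignment itself, for example PATH=$(dedupe_path "$PATH"). The same call should work for other colon-separated variables.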

Solution 2:

I wrote this simple Python 3 script:

import os
import shlex

# grab $PATH and split it into its entries
# (os.environ['PATH'] holds only the value, so there is no 'PATH=' prefix to strip)
path = os.environ['PATH'].split(':')

# normalize all paths
path = map(os.path.normpath, path)

# remove duplicates via a dictionary (the first occurrence of each entry is kept)
clean = dict.fromkeys(path)

# combine back into one path
clean_path = ':'.join(clean.keys())

# dump to stdout, shell-quoted so that entries containing spaces survive eval
print(f"PATH={shlex.quote(clean_path)}")

I placed the script in $HOME/scripts, then added this line to the end of my .bashrc:

eval "$(python3 "$HOME/scripts/clean-path.py")"

As of Python 3.7, a dictionary is guaranteed to preserve insertion order, so clean_path will have directories in the same order as the original PATH, with the first occurrence of each directory kept.
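Whichever solution you use, it can be handy to check whether any duplicates remain. A quick sketch, assuming the standard tr, sort and uniq utilities are available; the function name path_duplicates is made up for illustration:

```shell
# Hypothetical helper: print each entry that occurs more than once in a
# colon-separated list; prints nothing when the list is already clean.
path_duplicates() {
    printf '%s\n' "$1" | tr ':' '\n' | sort | uniq -d
}
```

Calling path_duplicates "$PATH" before and after the cleanup shows which directories were repeated.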