How can you remove duplicates from bash history?

I save history from several sessions, so erasedups can't help here, because I'm using the following:

PROMPT_COMMAND="$PROMPT_COMMAND;history -a"

Is there an easy way to delete duplicates in history?


Duplicate lines that are already in .bash_history can be removed, keeping the most recent occurrence of each command, by running

nl ~/.bash_history | sort -k 2 -k 1,1nr | uniq -f 1 | sort -n | cut -f 2- > unduped_history

followed by

cp unduped_history ~/.bash_history
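
To see what each stage of that pipeline does, here is a rough Python equivalent. It is only an illustrative sketch: the function name dedupe_keep_last and the sample commands are made up for this example, and it works on a list of lines in memory rather than on the history file.

```python
def dedupe_keep_last(lines):
    """Keep only the last occurrence of each line, preserving order."""
    # nl: number every line (1-based), like `nl ~/.bash_history`.
    numbered = list(enumerate(lines, start=1))

    # sort -k 2 -k 1,1nr: group identical commands together and put the
    # highest line number (the most recent use) first within each group.
    numbered.sort(key=lambda pair: (pair[1], -pair[0]))

    # uniq -f 1: ignore the number and keep the first entry of each group.
    kept = []
    previous = None
    for number, command in numbered:
        if command != previous:
            kept.append((number, command))
        previous = command

    # sort -n: restore the original chronological order.
    kept.sort(key=lambda pair: pair[0])

    # cut: drop the line numbers again.
    return [command for _, command in kept]


history = ["ls", "cd /tmp", "ls", "git status", "cd /tmp"]
print(dedupe_keep_last(history))  # ['ls', 'git status', 'cd /tmp']
```

Each command survives once, at the position where it was last used.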

I would also recommend putting the following in your ~/.bashrc:

export HISTCONTROL=ignoreboth:erasedups
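
As a rough illustration of what these settings do, and why they cannot help on their own once several sessions append to the same file, here is a small Python model of bash's in-memory history list. It is a simplified sketch, not bash's implementation: add_command and the session names are hypothetical, and the model only covers the ignorespace, ignoredups (together ignoreboth) and erasedups rules as the bash manual describes them.

```python
def add_command(history, command, histcontrol="ignoreboth:erasedups"):
    """Model how HISTCONTROL filters a command entering one session's list."""
    options = set(histcontrol.split(":"))
    if "ignoreboth" in options:
        # ignoreboth is shorthand for ignorespace plus ignoredups.
        options |= {"ignorespace", "ignoredups"}

    # ignorespace: commands starting with a space are not saved.
    if "ignorespace" in options and command.startswith(" "):
        return history
    # ignoredups: a command equal to the previous entry is not saved.
    if "ignoredups" in options and history and history[-1] == command:
        return history
    # erasedups: remove all earlier matching entries before saving.
    if "erasedups" in options:
        history = [entry for entry in history if entry != command]
    return history + [command]


# Within a single session the list stays free of duplicates.
session = []
for cmd in ["ls", "make", "ls", " secret", "ls"]:
    session = add_command(session, cmd)
print(session)  # ['make', 'ls']

# Two sessions only know their own lists, so both accept "ls" and both
# append it to the shared file (a simplified stand-in for `history -a`).
sessions = {"A": [], "B": []}
shared_file = []
for name, cmd in [("A", "ls"), ("B", "ls"), ("A", "make"), ("B", "git pull")]:
    before = sessions[name]
    sessions[name] = add_command(before, cmd)
    if sessions[name] != before:
        shared_file.append(cmd)
print(shared_file)  # ['ls', 'ls', 'make', 'git pull']
```

The settings keep each session's own list clean, which is why the file still needs a separate cleanup pass.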

I've created a small Python script, deduplicate.py, for this. It works nicely for searching the command history with fzf:

#!/usr/bin/env python3

# Deduplicates the lines of a file, keeping only the most recent
# (last) occurrence of each line and preserving the original order.
# Doesn't change the file, just writes the result
# to standard out.

import sys

if len(sys.argv) >= 2:
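    seen = set()
    unique_lines = []

    file_name = sys.argv[1]

    with open(file_name, 'r') as fi:
        # Walk the file from the newest entry to the oldest so that
        # the last occurrence of each command is the one we keep.
        for line in reversed(list(fi)):
            # A set keeps the membership test fast for long histories.
            if line not in seen:
                seen.add(line)
                unique_lines.append(line)

    # unique_lines is newest-first here; print it in reverse so older
    # entries come first. That way, recently used commands in
    # bash_history are suggested first with fzf.
    for unique_line in reversed(unique_lines):
        print(unique_line, end='')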

else:
    print('Please provide an input file path', file=sys.stderr)
    sys.exit(1)

I call it hourly with this systemd service located at ~/.config/systemd/user/deduplicate_bash_history.service:

[Unit]
Description=Remove duplicate lines from ~/.bash_history. This happens when multiple instances of bash write to that file. We go through the file line by line storing lines in a list. If we encounter a line that's already in the list, we don't add it.

[Service]
# We run the command through bash -c because systemd itself does not expand ~ or handle redirection and &&
ExecStart=/bin/bash -c "~/scripts/deduplicate.py ~/.bash_history > ~/.bash_history_deduplicated && mv --force ~/.bash_history_deduplicated ~/.bash_history"
# We have to write to a temporary file first: with the direct method below, the shell truncates ~/.bash_history for the redirection before the script can read it, so the existing history would be lost:
#/bin/bash -c "~/scripts/deduplicate.py ~/.bash_history > ~/.bash_history"

and this timer at ~/.config/systemd/user/deduplicate_bash_history.timer:

[Unit]
Description=Run deduplicate_bash_history.service hourly to remove duplicate lines from ~/.bash_history.

[Timer]
OnCalendar=hourly

[Install]
WantedBy=timers.target

I activate the timer with

systemctl --user daemon-reload && systemctl --user enable --now deduplicate_bash_history.timer

and check that it shows up in the list of timers with

systemctl --user list-timers --all
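
The temporary-file step from the service above could also live in the script itself. The following is a minimal sketch under that assumption, not part of the original setup: dedupe_in_place is a hypothetical helper that writes the deduplicated lines to a temporary file in the same directory and then swaps it in with os.replace, so the history file is never truncated before it has been read.

```python
import os
import tempfile


def dedupe_in_place(path):
    """Rewrite `path`, keeping only the last occurrence of each line."""
    with open(path, "r") as source:
        lines = source.readlines()

    # dict keys keep insertion order, so walking the reversed list keeps
    # the newest occurrence of each line; reverse again for old-to-new order.
    unique_lines = list(dict.fromkeys(reversed(lines)))[::-1]

    # Write next to the target so os.replace stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".history_dedup_")
    try:
        with os.fdopen(fd, "w") as target:
            target.writelines(unique_lines)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    # Demo on a throwaway file rather than the real ~/.bash_history.
    with tempfile.TemporaryDirectory() as demo_dir:
        demo_file = os.path.join(demo_dir, "history")
        with open(demo_file, "w") as f:
            f.write("ls\ncd /tmp\nls\n")
        dedupe_in_place(demo_file)
        with open(demo_file) as f:
            print(f.read(), end="")
```

As with the mv approach, any command that a running shell appends between the read and the replace is lost, so this is best run when few sessions are active.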