How do I find and delete duplicate music tracks?
Solution 1:
You can use fdupes, as the answer to the question »How to find and delete duplicate files« suggested. Let me give an example:
mkdir -p "Music/Prefuse 73/One Word Extinguisher/"
dd if=/dev/urandom of=Music/Prefuse\ 73/One\ Word\ Extinguisher/07.Detchibe.mp3 bs=1023 count=2048
2048+0 records in
2048+0 records out
2095104 bytes (2.1 MB) copied, 0.379806 s, 5.5 MB/s
cp Music/Prefuse\ 73/One\ Word\ Extinguisher/07.Detchibe.mp3 Music/Prefuse\ 73/One\ Word\ Extinguisher/"07 - Detchibe.mp3"
fdupes -rd .
[1] ./Music/Prefuse 73/One Word Extinguisher/07.Detchibe.mp3
[2] ./Music/Prefuse 73/One Word Extinguisher/07 - Detchibe.mp3
Set 1 of 1, preserve files [1 - 2, all]:
First I created the directory as in your example. Then I made a file from random data and copied it to a second file. When I run fdupes -rd . the software finds the two identical files and asks which one to preserve; the other one is deleted.
If you have lots of files, you can use the option -1. fdupes will then print each set of duplicates on a single line. You can process them with xargs and other shell features.
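As a rough sketch of that kind of post-processing: the function below (list_extra_copies is a made-up name, not part of fdupes) reads fdupes's default output, assuming one path per line with a blank line after each duplicate set, and prints every path except the first of each set. It only prints, so nothing is deleted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print all but the first file of each duplicate set.
# Assumes fdupes's default output: one path per line, sets separated by a
# blank line.
list_extra_copies() {
    local first=1 line
    while IFS= read -r line; do
        if [ -z "$line" ]; then
            first=1            # blank line: the next path starts a new set
        elif [ "$first" -eq 1 ]; then
            first=0            # first path of a set: keep it, print nothing
        else
            printf '%s\n' "$line"
        fi
    done
}

# Usage (inspect the extra copies before deleting anything; -r is GNU xargs):
#   fdupes -r . | list_extra_copies | tr '\n' '\0' | xargs -0 -r ls -l
```

Converting newlines to NUL bytes before xargs -0 keeps paths with spaces intact; paths that themselves contain newlines would still break this sketch.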
Solution 2:
I found a somewhat simple command chain. Much thanks to @Oli.
fdupes -rf --quiet ~/Desktop/Dupes2/ | while IFS= read -r i; do [ -n "$i" ] && mv -- "$i" ~/Desktop/Dupes/ ; done
This uses fdupes to recursively (-r) find the dupes, omitting the first file of each set (-f). Bash reads the output line by line with read and hands each line to mv, which moves the duplicates to another directory. Note the quotes in the while loop: they handle spaces and other dodgy punctuation that fdupes will not handle for you (even with -1/--sameline).
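One thing the one-liner does not guard against is two duplicates from different folders sharing a file name, in which case the second mv would overwrite the first. A hedged sketch of a more cautious variant follows; move_listed is a hypothetical name, and it assumes the paths arrive one per line on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move each path read from stdin into a target directory,
# skipping blank lines and refusing to overwrite a file that is already there.
move_listed() {
    local target=$1 path dest
    mkdir -p -- "$target" || return 1
    while IFS= read -r path; do
        [ -n "$path" ] || continue          # ignore blank separator lines
        dest=$target/$(basename -- "$path")
        if [ -e "$dest" ]; then
            printf 'skipped (name already taken): %s\n' "$path" >&2
        else
            mv -- "$path" "$dest"
        fi
    done
}

# Usage, with the directories from the command above:
#   fdupes -rf --quiet ~/Desktop/Dupes2/ | move_listed ~/Desktop/Dupes
```

Skipped files stay where they were, so you can rename or move them by hand afterwards.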
Solution 3:
The answers to »Manually set track listen count in Banshee?« describe how to get at the database that Banshee uses to save all track information.
Once you're connected to the database, on the execute query tab, paste
select tweaked_track, count(*) from
(select replace(replace(replace(title, ' ', ''), '-', ''), '.', '') as tweaked_track
from coretracks)
group by tweaked_track
order by 2, 1 desc;
into the SQL string box, then click 'execute query'. This lists every track title, ignoring spaces, dashes, and periods, together with how many times it occurs. Any title with a count above 1 is a likely duplicate, and the sort order puts those at the end of the list. If there are other characters you want to ignore, add them to the query in the same pattern: add replace( before the first existing "replace", and after the last ")" on that line add , '[character you want removed]', '')
(I don't know how much you know about SQL; if you need more details, post a comment.)
This will give you a list of titles. You will have to actually do the delete yourself.
There may be a better way of doing this, but if there is, I don't know about it.
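If you would rather do the same comparison outside the database, for instance on a plain list of titles, the normalization can be sketched in shell. This is only an illustration: duplicate_titles is a made-up name, and it assumes one title per line on standard input. Like the query, it strips spaces, dashes, and periods, but it prints only the titles that occur more than once.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read one title per line, strip spaces, periods and
# dashes, and print "count title" for every normalized title seen more
# than once.
duplicate_titles() {
    tr -d ' .-' |
        awk '{ count[$0]++ }
             END { for (t in count) if (count[t] > 1) print count[t], t }' |
        sort
}

# Usage (titles.txt is an assumed file with one track title per line):
#   duplicate_titles < titles.txt
```

To ignore more characters, add them to the tr set, keeping the dash last so that tr does not read it as a range.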
Once you have a big list of files to be deleted (either from my method or from fdupes, as others have mentioned), put the list into a text file. Make sure one of the following is true:
Option #1: Each filename includes its full path. For example, the file might contain:
/home/doneill/music/weird_al/duped_file.mp3
/home/doneill/music/weird_al/another_dupe.mp3
/home/doneill/music/bach/baroque_dupe.mp3
Option #2: The filenames are relative paths, and the file with the list is saved in the folder they are relative to. For example, if your file list was saved in /home/doneill/music/, it would contain:
weird_al/duped_file.mp3
weird_al/another_dupe.mp3
bach/baroque_dupe.mp3
In either case, open a terminal window and change to the folder that contains the file with the list, for example cd /home/doneill/music/
Type in:
while IFS= read -r a; do echo "$a"; done < filelist.txt
(Replacing filelist.txt with the name of the file with the list). This should spit out a list of all the files you want to delete. Take a moment to double check the list. If it is right, type:
while IFS= read -r a; do rm -- "$a"; done < filelist.txt
This basically tells your computer: for each line in the file filelist.txt, remove a file with the name listed.
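The check-then-delete steps can also be wrapped in one small function. This is a sketch under assumptions: delete_listed and its -n (dry run) switch are invented for this example, and the list is expected to hold one path per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete the files named in a list, one path per line.
# With -n as the first argument it only prints what it would delete.
delete_listed() {
    local dry_run=0 path
    if [ "$1" = "-n" ]; then dry_run=1; shift; fi
    local list=$1
    # "|| [ -n ... ]" also handles a last line without a trailing newline
    while IFS= read -r path || [ -n "$path" ]; do
        [ -n "$path" ] || continue               # ignore blank lines
        if [ ! -f "$path" ]; then
            printf 'not a file, skipped: %s\n' "$path" >&2
        elif [ "$dry_run" -eq 1 ]; then
            printf 'would delete: %s\n' "$path"
        else
            rm -- "$path"
        fi
    done < "$list"
}

# Usage:
#   delete_listed -n filelist.txt   # review first
#   delete_listed filelist.txt      # then delete for real
```

Reading the list line by line and quoting "$path" keeps names with spaces in one piece, which matters for music files.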