Linux tools to find duplicate files?
I have a large and growing set of text files, which are all quite small (less than 100 bytes). I want to diff each possible pair of files and note which are duplicates. I could write a Python script to do this, but I'm wondering if there's an existing Linux command-line tool (or perhaps a simple combination of tools) that would do this?
Update (in response to mfinni's comment): The files are all in a single directory, so they all have different filenames. (But they all have a filename extension in common, making it easy to select them all with a wildcard.)
Solution 1:
There's fdupes. But I usually use a combination of find . -type f -exec md5sum '{}' \; | sort | uniq -d -w 32 (an MD5 hex digest is 32 characters, so -w 32 compares only the hash; -w 36 would also compare the separator and the first characters of the filename, causing duplicates with different names to be missed).
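Note that uniq -d prints just one representative line per group of duplicates. If you want to see every file in each duplicate group, here is a minimal sketch of the same approach, assuming GNU coreutils (the '*.txt' pattern is a stand-in for whatever extension your files share):

    #!/bin/sh
    # Hash every matching file, sort so identical hashes end up adjacent,
    # then print all lines whose first 32 characters (the MD5 digest) repeat.
    # --all-repeated=separate puts a blank line between duplicate groups.
    find . -type f -name '*.txt' -exec md5sum {} + \
        | sort \
        | uniq -w 32 --all-repeated=separate

Using -exec md5sum {} + batches many files into each md5sum invocation, which is noticeably faster than \; (one process per file) on a large set of small files.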
Solution 2:
Well, there's FSlint, which I haven't used for this particular case, but it should be able to handle it: http://en.flossmanuals.net/FSlint/Introduction
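FSlint is primarily a GUI, but its duplicate finder is also available as a standalone script, findup, so you can script it. A minimal sketch, assuming a typical install location (the path varies by distribution):

    # findup is FSlint's duplicate-file finder; adjust the path for your distro.
    /usr/share/fslint/fslint/findup /path/to/your/directory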