Create checksum sha256 of all files and directories?
Solution 1:
You can use find to locate all files in the directory tree and have it run sha256sum on each one. The following command creates checksums for the files in the current directory and its subdirectories.
find . -type f -exec sha256sum {} \;
I don't use the options -b (binary mode) and -t (text mode), but if you wish, you can use -b for all files. The only difference I notice is an asterisk in front of each file name.
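For a machine without find or sha256sum, the same walk can be sketched in Python. This is a minimal sketch rather than a replacement for the command above: the names sha256_of_file and checksum_lines are made up for illustration, and unlike sha256sum it does not escape unusual characters in file names.

```python
#!/usr/bin/env python3
"""Yield a SHA-256 checksum line for every regular file below a directory."""
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_lines(root="."):
    """Yield 'digest  path' lines for each regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Roughly mirror find's -type f: skip symlinks and special files.
            if os.path.isfile(path) and not os.path.islink(path):
                yield f"{sha256_of_file(path)}  {path}"
```

Looping over checksum_lines(".") and printing each line gives a listing in the same "digest, two spaces, path" layout, although the order of the lines depends on the traversal.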
Solution 2:
TL;DR
cd /path/to/working/directory
sha256sum <(find . -type f -exec sha256sum {} \; | sort)
Intro
A more complete answer than the one above, which also fixes the problem of find returning files in different orders on different systems.
Redirecting output to a file and comparing with diff
Firstly, you probably want to redirect the output to a file for comparison with diff. For this you would use
find . -type f -exec sha256sum {} \; > file1.lst
Then on your other system
find . -type f -exec sha256sum {} \; > file2.lst
rsync file2.lst user@host:/home/user/file2.lst
ssh user@host
diff file1.lst file2.lst # might not match due to order
Fixing order of files found with find by piping to sort
Here I am assuming you are doing something similar to what I required this for - copying files from one system to another over a network and verifying the integrity of those files.
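What I found was that the order in which find reports files can vary between two systems, even when the OS is "Debian" in both cases. find does not sort its results, so the order depends on how each filesystem happens to return directory entries.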
Therefore, one needs to sort the output in the text files.
sort file1.lst > file1sorted.lst
sort file2.lst > file2sorted.lst
diff file1.lst file2.lst # bad
diff file1sorted.lst file2sorted.lst # ok
You can do the find and the sort in one line, while redirecting the output to a file.
find . -type f -exec sha256sum {} \; | sort > file1.lst
Other sha/md5 sums
You might want a longer hash. To use the 512-bit version, run:
find . -type f -exec sha512sum {} \; | sort > file1.lst
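Alternatively, if 256 bits is overkill for what you are doing, use md5sum. MD5 is adequate for spotting accidental corruption, but it is not collision-resistant, so it should not be relied on to detect deliberate tampering.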
find . -type f -exec md5sum {} \; | sort > file1.lst
A complete 1 line command to compare 2 directories with 1 shasum output
Now, if you have many files and do not want to save the output to a file, you could simply shasum the output. To do this, use
sha256sum <(find . -type f -exec sha256sum {} \; | sort)
The pipe to sort is required to ensure the output is sorted before computing the final sha256sum. Without this, if find reports files in a different order, the overall checksum will differ even though the checksum of each individual file is correct.
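The same idea (checksum every file, sort the lines, then checksum the sorted listing) can be sketched in Python. This is only an illustration under stated assumptions: file_digest and tree_digest are hypothetical names, paths are assumed to be POSIX-style, and the result is not guaranteed to equal the output of the shell command above, because sort orders lines according to the locale while Python's sorted compares code points.

```python
import hashlib
import os


def file_digest(path):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digest(root):
    """Return one SHA-256 over the sorted 'digest  ./relative/path' lines."""
    lines = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                # Paths are made relative to root, like running find from inside it.
                rel = os.path.relpath(path, root)
                lines.append(f"{file_digest(path)}  ./{rel}\n")
    overall = hashlib.sha256()
    # Sorting removes any dependence on the order of the directory walk.
    for line in sorted(lines):
        overall.update(line.encode("utf-8", "surrogateescape"))
    return overall.hexdigest()
```

Two directory trees with identical relative paths and contents give the same tree_digest value regardless of the order in which the files were created or visited.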
Problem relating to diff output and paths used
You may have some path which looks like
/A/B/C/*
where * stands for the subdirectories and files you want to checksum. If A, B and C are directories that each contain only a single subfolder, it is easy to run the checksum command from the wrong level of the tree by accident, resulting in the following
sort1.txt
sha256sum1 ./A/B/C/file1
sort2.txt
sha256sum2 ./B/C/file1
Even if sha256sum1 = sha256sum2, diff will say the files are different, because the lines really do differ: the paths start from different base directories.
Here is a short Python 3 script that compares only the checksums, line by line, which solves this problem.
#!/usr/bin/env python3
# Compare two sorted sha256sum listings by checksum only, ignoring the paths.

file1_name = "sort1.txt"
file2_name = "sort2.txt"

with open(file1_name, 'r') as file1, open(file2_name, 'r') as file2:
    file1_lines = file1.readlines()
    file2_lines = file2.readlines()

if len(file1_lines) == len(file2_lines):
    print("line numbers ok")
    for line1, line2 in zip(file1_lines, file2_lines):
        # The checksum is the first space-separated field on each line.
        shasum1 = line1.split(' ')[0]
        shasum2 = line2.split(' ')[0]
        if shasum1 != shasum2:
            print("shasum error: ", line1.rstrip('\n'))
else:
    print("Error: file ", file1_name, " number of lines != ", file2_name, " number of lines")

print("done")
I initially wanted to write a shell script to do this, but I got bored trying to figure out how, so I went back to Python.
This makes me think that writing a Python script to do the entire thing would actually have been easier, except for the find command.
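The line-by-line check stops being informative as soon as one listing has an extra or missing file, because every later line is then shifted. A possible variation, sketched here with the hypothetical helpers read_checksums and compare_listings, compares the two listings as multisets of checksums, so neither the paths nor the line order matter. It assumes each line starts with the checksum followed by whitespace.

```python
from collections import Counter


def read_checksums(listing_path):
    """Return a Counter of the checksums found in a sha256sum-style listing."""
    counts = Counter()
    with open(listing_path, "r", encoding="utf-8", errors="surrogateescape") as handle:
        for line in handle:
            if line.strip():
                # The checksum is the first whitespace-separated field.
                counts[line.split(maxsplit=1)[0]] += 1
    return counts


def compare_listings(path_a, path_b):
    """Return (only_in_a, only_in_b): checksums without a partner in the other file."""
    a = read_checksums(path_a)
    b = read_checksums(path_b)
    # Counter subtraction keeps only positive counts, i.e. the unmatched entries.
    return a - b, b - a
```

If both returned counters are empty, every checksum in sort1.txt has a counterpart in sort2.txt and vice versa. The trade-off is that a file moved or renamed within the tree is not reported, since paths are ignored entirely.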
Solution 3:
Late answer, but for the sake of documentation...
The other answers suggest calling sha256sum via find and its -exec option. As a result, sha256sum is started once for each file, which adds significant process-creation overhead when there are many files.
A more efficient solution is to turn the find results into command-line arguments by piping them through xargs and calling sha256sum that way. xargs runs sha256sum once, or in large batches if there are too many file names to fit on one command line.
find /path/to/your/dir -type f | xargs sha256sum -b
If you have file names containing whitespace, use the -print0 flag in find and the -0 flag in xargs so that each name is terminated with \0 instead
find /path/to/your/dir -type f -print0 | xargs -0 sha256sum -b
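Why the \0 terminator helps can be illustrated with a small Python model. This is a rough sketch, not how xargs is implemented: the function names are invented for the example, real xargs also interprets quotes and backslashes when -0 is not given, and it sizes its batches by total command-line length rather than by a fixed count.

```python
def split_like_plain_xargs(data):
    """Rough model of plain xargs: split the input on blanks and newlines."""
    return data.split()


def split_nul_terminated(data):
    """Rough model of xargs -0: names end at NUL, all other characters are literal."""
    return [name for name in data.split("\0") if name]


def batches(names, size):
    """Group names into lists of at most `size`, one list per command invocation."""
    return [names[i:i + size] for i in range(0, len(names), size)]
```

With the input "./my file.txt" followed by a newline, the first function yields the two fragments "./my" and "file.txt", neither of which exists on disk, whereas the NUL-terminated form keeps the name in one piece.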