sha1sum for a directory of directories
sha1sum ./path/to/directory/* | sha1sum
The above was posted as a way to compute a sha1sum of a directory that contains files. This command fails if the directory includes more directories. Is there a way to recursively compute the sha1sum of a directory of directories universally (without custom-fitting an algorithm to the particular directory in question)?
Thanks to this SO post —
find . -type f \( -exec sha1sum "$PWD"/{} \; \) | awk '{print $1}' | sort | sha1sum
Warning: This code is untested! Edit this question if it's wrong and you can fix it; I'll approve your edit.
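Note that the pipeline above hashes only the sorted list of file hashes, so renaming or moving a file leaves the result unchanged. If paths should count too, a minimal sketch of a variant might look like this. It assumes GNU findutils and coreutils (for sort -z and xargs -r), and the function name dir_sha1 is made up for illustration:

```shell
#!/bin/bash
# dir_sha1 DIR: print one SHA-1 that covers every regular file under DIR,
# including each file's path relative to DIR. (Hypothetical helper name.)
dir_sha1() {
    [ -d "$1" ] || { echo "dir_sha1: not a directory: $1" >&2; return 1; }
    (
        cd "$1" &&
        # NUL-delimited names survive spaces and newlines; LC_ALL=C makes
        # the sort order independent of the system locale.
        find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha1sum
    ) | sha1sum | awk '{print $1}'
}
```

Calling dir_sha1 on two copies of the same tree should print the same hash, because the paths are taken relative to each directory.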
I generally like the find | xargs pattern, like so:
find /path/to/directory -type f -print0 | xargs -0 sha1sum
You have to use the "-print0" and "-0", in case there are spaces in file names.
However, this is very similar to the find -exec cmd {} \; pattern.
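For comparison, find can also batch file names itself with the -exec ... {} + form, which passes many names to each sha1sum call much as xargs does and needs no special handling for spaces. A small sketch of both approaches side by side; the function names are only illustrative:

```shell
#!/bin/bash
# Both helpers print "hash  path" lines for every regular file under DIR.

# find | xargs: names are NUL-delimited so spaces and newlines are safe.
hash_with_xargs() {
    find "$1" -type f -print0 | xargs -0 sha1sum
}

# find -exec ... {} +: find appends as many names as fit per invocation.
hash_with_exec() {
    find "$1" -type f -exec sha1sum {} +
}
```

Both should produce the same set of lines for the same directory; only the way the file names reach sha1sum differs.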
Discussion https://stackoverflow.com/questions/896808
INTRODUCTION
A few years ago, I wrote and presented (in this very thread) a script that can check the hash signatures of all individual files in the current directory structure and output it as a list in a text file.
Since then, I've refined this formula several times. I've decided to re-post my new and improved script here as a separate answer. It's written for sha256, but anyone still wanting to use sha1 can do a simple search and replace in gedit to swap sha256 with sha1. Personally, I haven't used sha1 for a couple of years and I would not recommend it, as it has become antiquated and Google has demonstrated how it can be compromised.
Here is what my new script does:
You can simply use the script by going to the directory you want to hash and inputting:
sha256rec
Alternatively, you can call this script from another directory by passing the target with the -t= or --target= switch:
sha256rec --target="/path/to/target/directory/you/want/hash"
The script detects whether you have write privileges in the current directory. If you do, the results are saved there. If you don't, or if the current directory is on a read-only file system (such as a CD-ROM), the results are saved to the current user's home directory.
The script detects whether some of the subdirectories are not accessible with the current user's privileges. If all are readable, no elevation of privilege takes place; if they aren't, the user's privileges are elevated to root.
Find is used to find all the files in the current directory structure (including all the subdirectories). Sort is used to make sure the results are output alphabetically. The resulting list is run through sha256sum and written to a text file.
Since writing the old script I've adopted a design philosophy that temp files are evil and should be avoided when possible, as they leave users open to snooping and tampering by malicious third parties. So all data in this new script is handled in variables until the very last minute, when the results are written out as a text file.
The resulting file itself is hashed, and its path and hash are printed in the terminal. I like to take pictures of these hashes with an old-school offline camera so I can be sure that the results file hasn't been tampered with when I refer to it at a later date.
Old result files are ignored in the tally. It makes comparing results easier.
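The core of the steps above can be sketched in a few lines. This is a simplified illustration, not the full script: it assumes GNU find, sort and xargs, leaves out the privilege elevation, and uses a hypothetical function name (hash_tree) and a shortened output file name.

```shell
#!/bin/bash
# hash_tree TARGET: hash every regular file under TARGET with sha256sum,
# sorted by path, and save the list in the current directory if it is
# writable, otherwise in $HOME. (Hypothetical name; simplified sketch.)
hash_tree() {
    local target=${1:-.} dest list out
    # Pick the destination: current directory if writable, else home.
    if [ -w "$PWD" ]; then dest=$PWD; else dest=$HOME; fi
    out="$dest/000_sha256sum_recurs_$(date +%d-%m-%Y_%H.%M.%S).txt"
    # Keep the list in a variable, not a temp file; skip old result files.
    list=$(cd "$target" &&
        find . -type f ! -name '000_sha256sum_*' -print0 |
        LC_ALL=C sort -z | xargs -0 -r sha256sum) || return 1
    printf '%s\n' "$list" > "$out"
    # Hash the result file itself so it can be checked later.
    sha256sum "$out"
}
```

Running hash_tree with no argument hashes the current directory; the single line it prints is the hash and path of the result file.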
Here is an example of the terminal output when running my script:
kernelcrunch@ubuntu:/usr/src/linux-headers-4.13.0-16-generic$ sha256rec
=======================================================================
sha256rec:
=======================================================================
Current Folder : /usr/src/linux-headers-4.13.0-16-generic
Target Folder : /usr/src/linux-headers-4.13.0-16-generic
Output File : /home/kernelcrunch/000_sha256sum_recurs_linux-headers-4.13.0-16-generic_d_22-04-2018_t_02.17.txt
Seems you're currently in either a Read-Only system or a root owned directory as a regular user. You can find the hash results in your home folder.
f3ddb06212622c375c6bcc11bd629ce38f6c48b7474054ca6f569ded4b4af9d8 /home/kernelcrunch/000_sha256sum_recurs_linux-headers-4.13.0-16-generic_d_22-04-2018_t_02.17.txt
Operation Length: 10 Seconds.
=======================================================================
kernelcrunch@ubuntu:/usr/src/linux-headers-4.13.0-16-generic$
Here is a snippet of the output that can be found in 000_sha256sum_recurs_linux-headers-4.13.0-16-generic_d_22-04-2018_t_02.17.txt:
79c3f378a42bd225642220cc1e4801deb35c046475bb069a96870ad773082805 ./.9491.d
2e336c69cde866c6f01a3495048d0ebc2871dd9c4cb5d647be029e0205d15ce6 ./.config
174f23ff7a7fba897bfb7cf17e9a501bcecacf7ef0c0d5cf030414c1e257d4e3 ./.config.old
389d83f546b250304a9a01bb3072ff79f9d9e380c8a2106cadbf714a872afe33 ./.missing-syscalls.d
035dc77da819101cb9889b4e515023dddd2c953f00d2653b87c6196a6560903e ./Module.symvers
b28054d7995233e6d003ceb9ed119a0b3354f5ccf77b8d687fc0353ae3c5bfb8 ./arch/x86/include/generated/asm/.syscalls_32.h.cmd
01cf821170e3e6e592e36a96e8628377151c762ac2ee3210c96004bfaef22f5f ./arch/x86/include/generated/asm/.syscalls_64.h.cmd
111efa83187c58a74a9b0170fd496b497b0682d109a7c240c17e2ffcc734f4f4 ./arch/x86/include/generated/asm/.unistd_32_ia32.h.cmd
fcba4e8abf9e95472c31708555db844ac43c87260fb0ba706b6f519404bf9aba ./arch/x86/include/generated/asm/.unistd_64_x32.h.cmd
3264438a54cbf7e62b05d38a93c5df8fe4202ac782a5d83ed202cba9eee71139 ./arch/x86/include/generated/asm/.xen-hypercalls.h.cmd
4bd7a45837da7de379b87242efe562ce06bf9d8ab8f636c205bb5ef384c8f759 ./arch/x86/include/generated/asm/clkdev.h
0d96461abd23bbf2da522822948455413a345f9ef8ac7a7f81c6126584b3c964 ./arch/x86/include/generated/asm/dma-contiguous.h
b1a54c24a12ce2c0f283661121974436cdb09ae91822497458072f5f97447c5d ./arch/x86/include/generated/asm/early_ioremap.h
dd864107295503e102ea339e0fd4496204c697bdd5c1b1a35864dfefe504a990 ./arch/x86/include/generated/asm/mcs_spinlock.h
782ce66804d000472b3c601978fa9bd98dcf3b2750d608c684dc52dd1aa0eb7e ./arch/x86/include/generated/asm/mm-arch-hooks.h
cd9913197f90cd06e55b19be1e02746655b5e52e388f13ec29032294c2f75897 ./arch/x86/include/generated/asm/syscalls_32.h
758ce35908e8cfeec956f57a206d8064a83a49298e47d47b7e9a7d37b5d96d59 ./arch/x86/include/generated/asm/syscalls_64.h
1147ca3a8443d9ccbdf9cd1f4b9b633f0b77f0559b83ec5e4fa594eadb2548be ./arch/x86/include/generated/asm/unistd_32_ia32.h
ca5223fbf8f03613a6b000e20eb275d9b8081c8059bc540481a303ce722d42f3 ./arch/x86/include/generated/asm/unistd_64_x32.h
31703052c0d2ab8fe14b4e5dfcc45fcbd5feb5016b0a729b6ba92caa52b069e2 ./arch/x86/include/generated/asm/xen-hypercalls.h
c085ff1b6e9d06faa3fc6a55f69f9065c54098d206827deec7fe0a59d316fc99 ./arch/x86/include/generated/uapi/asm/.unistd_32.h.cmd
7929c16d349845cebb9e303e0ff15f67d924cac42940d0f7271584f1346635fc ./arch/x86/include/generated/uapi/asm/.unistd_64.h.cmd
9aa492c5a75f5547f8d1dc454bef78189b8f262d1c4b00323a577907f138a63e ./arch/x86/include/generated/uapi/asm/.unistd_x32.h.cmd
f568e151bbbb5d51fd531604a4a5ca9f17004142cd38ce019f0d5c661d32e36b ./arch/x86/include/generated/uapi/asm/unistd_32.h
c45cf378498aa06b808bb9ccf5c3c4518e26501667f06c907a385671c60f14ae ./arch/x86/include/generated/uapi/asm/unistd_64.h
a0088d8d86d7fd96798faa32aa427ed87743d3a0db76605b153d5124845161e2 ./arch/x86/include/generated/uapi/asm/unistd_x32.h
e757eb6420dffa6b24b7aa38ca57e6d6f0bfa7d6f3ea23bbc08789c7e31d15fa ./arch/x86/kernel/.asm-offsets.s.cmd
f9e703e4f148d370d445c2f8c95f4a1b1ccde28c149cff2db5067c949a63d542 ./arch/x86/kernel/asm-offsets.s
7971fb3e0cc3a3564302b9a3e1ad188d2a00b653189968bbc155d42c70ce6fbf ./arch/x86/purgatory/.entry64.o.cmd
8352d79fe81d2cf694880f428e283d79fd4b498cea5a425644da25a9641be26b ./arch/x86/purgatory/.kexec-purgatory.c.cmd
37f3edbee777e955ba3b402098cb6c07500cf9dc7e1d44737f772ac222e6eb3e ./arch/x86/purgatory/.purgatory.o.cmd
bb8b895cbd2611b69e2f46c2565b4c2e63a85afb56cff946a555f2d277ee99b2 ./arch/x86/purgatory/.purgatory.ro.cmd
bcc2365c9d3d027f1469806eb4f77b0f3ede6eb0855ea0fcd28aa65884046a54 ./arch/x86/purgatory/.setup-x86_64.o.cmd
872229f334fdcc8562e31b9f6581008c1571ac91f12889cd0ff413590585155a ./arch/x86/purgatory/.sha256.o.cmd
6fb0cbef120aadee282f7bc3b5ea2f912980f16712281f8f7b65901005194422 ./arch/x86/purgatory/.stack.o.cmd
cd1b61063ae3cf45ee0c58b2c55039f3eac5f67a5154726d288b4708c4d43deb ./arch/x86/purgatory/.string.o.cmd
e5826f0216fd590972bbc8162dd175f87f9f7140c8101505d8ca5849c850ec91 ./arch/x86/purgatory/entry64.o
(it goes on for another 7000+ lines like this but you get the idea)
INSTALLATION
Open a terminal and input the following commands:
cd /usr/bin
sudo su
echo '#!/bin/bash'> /usr/bin/sha256rec
chmod +x /usr/bin/sha256rec
touch /usr/bin/sha256rec
nano /usr/bin/sha256rec
In nano, use Shift+Ctrl+V to paste, Ctrl+O and Enter to save, and Ctrl+X to exit. Paste my script in there:
(paste after the #!/bin/bash)
#FUNCTIONS OR FUNCTYOU?
function s_readonly { err=$(date +%s%N); cd "$1"; mkdir $err 2> /tmp/$err; rmdir $err 2>/dev/null; echo $(cat /tmp/$err|grep -i "Read-only file system"|wc -l);shred -n 0 -uz /tmp/$err; }
function w_denied { echo $(err=$(date +%s%N); cd "$1"; mkdir $err 2> /tmp/$err; rmdir $err 2>/dev/null; cat /tmp/$err|grep -i "Permission denied"|wc -l;shred -n 0 -uz /tmp/$err); }
function r_denied { echo $(err=$(date +%s%N); cd "$1" >/dev/null 2> /tmp/$err; find . >/dev/null 2>> /tmp/$err; cat /tmp/$err|grep -i "Permission denied"|wc -l;shred -n 0 -uz /tmp/$err); }
function rando_name { rando=$(echo $(date +%s%N)|sha256sum|awk '{print $1}'); rando=${rando::$(shuf -i 30-77 -n 1)}; echo $rando;}
function ms0 { ms0=$(($(date +%s%N)/1000000)); }; function mstot { echo $(($(($(date +%s%N)/1000000))-$ms0));}
function s0 { s0=$(date +%s); }; function stot { echo $(($(date +%s)-$s0));}
s0
#CHECK IF A TARGET DIR WAS SPECIFIED (-t= or --target= switch)
if [ ! -z "$1" ]; then arg1="$1"; arg1_3=${arg1::3}; arg1_9=${arg1::9};fi
if [ "$arg1_3" = "-t=" -o "$arg1_9" = "--target=" ]; then
switch=$(echo $arg1|awk -F '=' '{print $1}')
switch_chr=$((${#switch}+1))
target=${arg1:$switch_chr}
current=$(pwd)
cd "$target"
arg1="" #<- cancels the not path in the find line
else
current=$(pwd)
target=$(pwd)
fi
echo -e "=======================================================================\
\nsha256rec: \
\n=======================================================================\
\nCurrent Folder : $current \
\nTarget Folder : $target"
#GETS DEFAULT_USER, ASSUMES YOU'RE USER 1000, IF 1000 DOESN'T EXIST SEARCHES 999, THEN 1001, 1002
default_user=$(awk -v val=1000 -F ":" '$3==val{print $1}' /etc/passwd)
if [ -z "$default_user" ]; then default_user=$(awk -v val=999 -F ":" '$3==val{print $1}' /etc/passwd); fi
if [ -z "$default_user" ]; then default_user=$(awk -v val=1001 -F ":" '$3==val{print $1}' /etc/passwd); fi
if [ -z "$default_user" ]; then default_user=$(awk -v val=1002 -F ":" '$3==val{print $1}' /etc/passwd); fi
if [ "$(users | wc -l)" = "1" ]; then USER=$(users|awk '{print $1}'); else USER=$default_user;fi #not perfect but meh...
#running rando_name in this very specific spot between USER detection and Permission detection somehow interferes with the detection functions...
#the rando function placed underneath the user detection is somehow turning c=$current from the dir path to whatever rando_name puts out.
#FIGURE OUT WHERE TO PUT HASH LIST
hash_file="000_sha256sum_recurs_${target##*/}_d_$(date +%d-%m-20%y)_t_$(date +%H.%M).txt"
if [ $(s_readonly "$current") -gt 0 -o $(w_denied "$current") -gt 0 ]; then
if [ "$(whoami)" != root ]; then
dest="/home/$(whoami)"
echo -e "Output File : $dest/$hash_file\n\n"
echo "Seems you're currently in either a Read-Only system or a root owned directory as a regular user. You can find the hash results in your home folder."
else
dest="/home/$USER"
echo -e "Output File : $dest/$hash_file\n\n"
echo "Seems you're currently in a Read-Only system. You can find the hash results in $USER's home folder."
fi
else
dest="$current"
echo -e "Output File : $dest/$hash_file\n\n"
echo "Results will be saved here."
fi
#CAN REGULAR USER ACCESS TARGET DIR? ARE ALL ITS SUBDIRS READABLE?
if [ $(r_denied "$target") -gt 0 ]; then sudo=sudo; echo "Some folders were not readable as a regular user. User elevation will be required.";fi
#PERFORM RECURSIVE HASHING
command=$($sudo find . -type f -not -type l -not -path "$arg1" -not -path "$2" -not -path "$3" -not -path "$4" -not -path "$5" -not -path "$6" -not -path "$7" -not -path "$8" -not -path "$9" |grep -v "\./000_sha"|sort|awk "{print \"$sudo sha256sum \\\"\"\$0}"|awk '{print $0"\""}'|tr '\n' ';')
eval $command > "$dest/$hash_file"
sha256sum "$dest/$hash_file"
echo "Operation Length: $(stot) Seconds."
echo -e "======================================================================="
if [ "$target" != "$current" ]; then cd "$current";fi
exit
#||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
#||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
#||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
#||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
When you exit from nano, be sure to exit the elevated status by entering:
exit
FINAL THOUGHTS
This will only work if you have bash installed. I've used some syntax for substring manipulation that does not work with sh, dash, ksh, or zsh. You can still use any of the other shells as your daily driver, but bash needs to be installed.
Outputted lists can be compared with a variety of tools, such as (in the terminal) diff and sdiff, and (graphical) diffuse, kdiff and winmerge.
My script sorts the output by path to make it easier for humans to read. I've noticed the sort command working differently across distros. For example, in one distro CAPITAL letters took priority over lowercase letters and in another they did not. This affects the line order of the output files and could make them difficult to compare. It should not present any issues if you always use the script in the same distro, but it may if the hash lists were generated in two different environments. This is easily remedied by sorting the hash files an additional time so that the lines become ordered by hash rather than by path:
cat 000_sha256sum_oldhashlist|sort> ./old
cat 000_sha256sum_newhashlist|sort> ./new
sha256sum ./old ./new; diff ./old ./new
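The differing sort order between distros most likely comes from the locale: sort collates according to LC_ALL / LC_COLLATE, and systems ship different defaults. Forcing the C locale, which compares raw bytes, should give the same path order everywhere, so the lists stay sorted by path and still line up. A small sketch with a hypothetical helper name:

```shell
#!/bin/bash
# sort_by_path FILE: sort a "hash  path" list by path, byte-wise, so the
# order does not depend on the system locale. (Hypothetical helper name.)
sort_by_path() {
    LC_ALL=C sort -k 2 "$1"
}
```

In the C locale, capital letters sort before lowercase ones, so ./B comes before ./a on every system.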
Another trick might be to use tar to hash the file contents & metadata:
tar -cf - ./path/to/directory | sha1sum
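Keep in mind that a plain tar stream also encodes modification times, ownership and the order in which entries were read, so two copies with identical contents may hash differently. A sketch that normalizes those fields, assuming GNU tar 1.28 or later (needed for --sort); the function name tar_sha1 and the fixed timestamp are arbitrary choices for illustration:

```shell
#!/bin/bash
# tar_sha1 DIR: hash a normalized tar stream of DIR.
# Assumes GNU tar >= 1.28. File modes are still part of the stream.
tar_sha1() {
    tar --sort=name --mtime='@0' \
        --owner=0 --group=0 --numeric-owner \
        -C "$1" -cf - . | sha1sum | awk '{print $1}'
}
```

Using -C with "." keeps the stored paths relative, so the directory's own location does not influence the hash.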
UPDATE: It's been a few years since I've posted this reply and in the meantime I've rewritten and improved the script I've presented here several times. I've decided to repost the new script as a brand new answer. I would highly recommend it over this one.
INTRODUCTION
I've observed that the order in which the find command outputs the found elements within a directory varies between identical directories on different partitions. If you're comparing the hashes of the same directory, you don't have to worry about that, but if you're getting the hashes to ensure that no files were missed or corrupted during a copy, you need to include an additional line for sorting the content of the directory and its elements. For example, Matthew Bohnsack's answer is quite elegant:
find ./path/to/directory/ -type f -print0 | xargs -0 sha1sum
But if you're using it to compare a copied directory to its original, you would send the output to a txt file, which you would compare to the outputted list from the other directory using Kompare or WinMerge, or by simply getting the hash of each list. The thing is, as the order in which find outputs the content may vary from one directory to another, Kompare will signal many differences because the hashes weren't calculated in the same order. Not a big deal for small directories, but quite annoying if you're dealing with 30000 files. Therefore, you have to do the extra step of sorting the output to make it easier to compare the hash lists between the two directories.
find ./path/to/directory/ -type f -print0 | xargs -0 sha1sum > sha1sum_list_unsorted.txt
sort sha1sum_list_unsorted.txt > sha1sum_list_sorted.txt
This sorts the output so that files with the same hash end up on the same lines when running the differencing program (provided that no files are missing from the new directory).
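When the goal is only to verify a copy, another option is to let sha1sum do the comparison itself with its -c (check) mode instead of diffing two lists. This is a different technique from the sorted lists above. A sketch, assuming GNU coreutils and findutils; verify_copy is a hypothetical name, and note that this reports missing or changed files but not extra files that exist only in the copy:

```shell
#!/bin/bash
# verify_copy ORIGINAL COPY: hash ORIGINAL, then check COPY against the list.
# Returns non-zero if any file in COPY is missing or differs.
verify_copy() {
    local list
    list=$(cd "$1" && find . -type f -print0 | xargs -0 -r sha1sum) || return 1
    # --quiet prints only the files that fail verification.
    (cd "$2" && printf '%s\n' "$list" | sha1sum -c --quiet)
}
```

Because the check is done per file name, the order in which find lists the files does not matter here.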
AND ONTO THE SCRIPT...
Here's a script that I wrote. It does the same thing that the find/xargs answer does, but it sorts the files before getting the sha1sum (keeping them in the same directory). The first command finds all the files within the directory recursively. The next one sorts the results alphabetically. The following two take the sorted content and wrap each path in a sha1sum call and quotation marks, making a big shell script that calculates each file's hash, one at a time, and outputs it to content_sha1sum.txt.
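#!/bin/bash
# List every regular file below the current directory (skip the list itself).
find . -type f ! -path ./content.txt > content.txt
# Sort the list so the hashes are always computed in the same order.
sort content.txt > content_sorted.txt
# Turn each path into a line of the form: sha1sum "path"
awk '{print "sha1sum \""$0}' content_sorted.txt > temp.txt
awk '{print $0"\""}' temp.txt > get_sha1.sh
# Run the generated script and save the hashes.
chmod +x get_sha1.sh
./get_sha1.sh > content_sha1sum.txt
# Clean up the intermediate files and open the result.
rm content.txt
rm content_sorted.txt
rm temp.txt
rm get_sha1.sh
xdg-open content_sha1sum.txt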
Hope this helps.
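One caveat with generating a script from file names is that names containing double quotes, dollar signs or backticks will break the quoting. A hedged sketch of the same one-file-at-a-time idea without the generated script, assuming bash and GNU sort (for -z); hash_sorted is a hypothetical name:

```shell
#!/bin/bash
# hash_sorted: hash every regular file below the current directory, one at
# a time, in sorted order. The output file is skipped in case it lives here.
hash_sorted() {
    local f
    find . -type f ! -path ./content_sha1sum.txt -print0 |
    LC_ALL=C sort -z |
    while IFS= read -r -d '' f; do
        # "$f" is passed as data, never re-parsed by the shell.
        sha1sum "$f"
    done
}
```

It could be used as hash_sorted > content_sha1sum.txt from inside the directory to be hashed.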