With a time-sorted list, how to insert a checksum for each file?
Time stamp example:
20211018_14:54:54.0596445490_Mon
On Ubuntu 20.04.3 the command below works. It displays
- directories,
- files and
- hidden files,
each with permissions, Time_Day, f or d, and path/fileName:
find . -printf "%M %TY%Tm%Td_%TT_%Ta %Y%p\n" |sort -k2 ;
The output is sorted by time (column 2), with the most recent files at the bottom. Four example lines:
-rw-r--r-- 20211001_13:02:16.0000000000_Fri f./Bash/awkCommnads.txt
-rw-r--r-- 20211013_06:22:12.0000000000_Wed f./.HiddenFile_1.txt
drwxr-xr-x 20211018_14:51:42.1712136500_Mon d.
drwxr-xr-x 20211018_14:54:54.0596445490_Mon d./Bash
Said differently:
how do I get the 32-character md5sum checksum for each file
on the left side of the above list, which is sorted by time?
Example:
123456789T123456789w123456789Y12 -rw-r--r-- 20211001_13:02:16.0000000000_Fri f./Bash/awkCommnads.txt
Once md5sum works, the next step is sha512sum.
Tip for testing:
setterm -linewrap off ; find . -printf "%M %TY%Tm%Td_%TT_%Ta %Y%p\n" |sort -k2 ; tput smam ;
This prints one line per record, with no line wrap. The general pattern is:
setterm -linewrap off ; Commands... ; tput smam ;
Here tput smam ; turns line wrap back on.
Again:
with a time-sorted list, how to insert a checksum for each file?
--
Solution 1:
You need to run md5sum for each file (or rather for each regular file). You run arbitrary commands from find with -exec. The problem is that find . -type f -exec md5sum {} \; will print (i.e. md5sum will print) also the pathname and a trailing newline, and sometimes a leading backslash. You need to get rid of them before you proceed to -printf.
A straightforward way is with cut and tr. To execute md5sum … | cut … | tr … inside find, you need a shell there. Executing many processes per file is costly. You cannot use -exec … {} + because you need md5sum and -printf to take turns. We will save a few processes if we manage to make the shell do the job of cut and tr.
For a single file (e.g. /etc/fstab) you can print its md5sum in the desired format, still without cut or tr, like this:
sh -c '
exec 2>/dev/null
sum="$(md5sum <"$1")" || sum="????????????????????????????????"
printf "%s " "${sum%% *}"
' sh /etc/fstab
With performance in mind we hope printf is a builtin in your sh. The above command is designed to show question marks if md5sum fails. Useful links:
- What is the second sh in sh -c 'some shell code' sh?
- Parameter expansion and quotes within quotes.
- About ${sum%% *}.
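As a quick illustration of the ${sum%% *} expansion used above, here is a minimal sketch with a sample value shaped like the output of md5sum reading standard input. The variable names hash, placeholder and kept are only for this example.

```shell
# Sample value shaped like what md5sum prints when it reads stdin.
sum='b1946ac92492d2347c6235b8d2611184  -'

# %% removes the longest suffix matching the pattern " *",
# i.e. everything from the first space to the end.
hash=${sum%% *}
printf '%s\n' "$hash"

# The question-mark placeholder contains no space, so nothing matches
# and it passes through unchanged.
placeholder='????????????????????????????????'
kept=${placeholder%% *}
printf '%s\n' "$kept"
```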
Now let's build the command into your find, like this:
find . -exec sh -c '
exec 2>/dev/null
sum="$(md5sum <"$1")" || sum="????????????????????????????????"
printf "%s " "${sum%% *}"
' find-sh {} \; -printf '%M %TY%Tm%Td_%TT_%Ta %Y%p\n' | sort -k3
Notes:
- sort -k2 became sort -k3, because the checksum is now the first column and the timestamp has moved to column 3.
- The command will try to run md5sum for files of any type. E.g. for a directory md5sum will fail and you will get question marks; this is acceptable. On the other hand, for a fifo md5sum may endlessly wait for data; this you don't want. Consider restricting find to regular files (simply add -type f as the first test) or fixing the command, so our -exec happens for regular files only:
  find . \( -type f -exec sh -c '
  exec 2>/dev/null
  sum="$(md5sum <"$1")" || sum="????????????????????????????????"
  printf "%s " "${sum%% *}"
  ' find-sh {} \; -o -printf '-------------------------------- ' \) \
  -printf '%M %TY%Tm%Td_%TT_%Ta %Y%p\n' | sort -k3
  The sequences of ? or - characters are of the length of any md5sum, so they align nicely.
- Newlines in pathnames will confuse your sort. If you can, use null-terminated strings. E.g. with GNU sort in my Kubuntu:
  find … -printf '…\0' | sort -z … | tr '\0' '\n'