Bash script: Conditionally delete older files while keeping latest copies
Note: Though there's an answer from jeff-schaller, it depends on zsh; so I would like to get an answer based on Bash.
I would like to create a Bash script to conditionally delete older files from a backup directory.
There are 2 conditions for 2 distinct file backups:
1, Keep the latest copy of Edge_Profile_*.tgz, and delete the rest of Edge_Profile_*.tgz only if they are older than 5 days.
2, Keep the latest copy of Firefox_Profile_*.tgz, and delete the rest of Firefox_Profile_*.tgz, no matter how old they are.
Here's how I have modified this AskUbuntu answer: https://askubuntu.com/a/933098/928088
Script:
#!/bin/bash
LOG_FILE="/home/admn/Cleanup.log"
TEMP_LOG="/tmp/Temp_Cleanup.log"
mv $LOG_FILE $TEMP_LOG
{
    cd /home/admn/Downloads/Test;
    echo "Cleanup log:" `date`
    find /home/admn/Downloads/Test/Edge_Profile_*.tgz -type f \( -mtime +5 -printf 'Y\t' -o -printf 'N\t' \) -printf '%A@\t%p\0' |
    sort -zk2,2 | head -zn -1 | while read -r -d '' flag _ file; do \
        case "$flag" in
            'Y') echo rm "$file"
            ;;
            *) echo "skipping $file (too new)"
            ;;
        esac;
    done
    echo
    find /home/admn/Downloads/Test/Firefox_Profile_*.tgz -type f \( -printf 'Y\t' -o -printf 'N\t' \) -printf '%A@\t%p\0' |
    sort -zk2,2 | head -zn -1 | while read -r -d '' flag _ file; do \
        case "$flag" in
            'Y') echo rm "$file"
            ;;
            *) echo "skipping $file (too new)"
            ;;
        esac
    done
} &>> $LOG_FILE
cat $TEMP_LOG >>$LOG_FILE
exit;
Output in the logfile with echo:
/usr/local/scripts/cleanup.sh
rm /home/admn/Downloads/Test/Edge_Profile_Jul_06_2021_00-35.tgz
rm /home/admn/Downloads/Test/Edge_Profile_Jul_07_2021_00-35.tgz
....
skipping /home/admn/Downloads/Test/Edge_Profile_Jul_12_2021_00-35.tgz (too new)
skipping /home/admn/Downloads/Test/Edge_Profile_Jul_13_2021_00-35.tgz (too new)
....
rm /home/admn/Downloads/Test/Firefox_Profile_Jul_01_2021_00-35.tgz
rm /home/admn/Downloads/Test/Firefox_Profile_Jul_02_2021_00-35.tgz
....
Output in the logfile while actually running the script, without echo:
/home/admn/Downloads/cleanup.sh: line 24: skipping /home/admn/Downloads/Test/Edge_Profile_Jul_12_2021_00-35.tgz (too new): No such file or directory
/home/admn/Downloads/cleanup.sh: line 24: skipping /home/admn/Downloads/Test/Edge_Profile_Jul_13_2021_00-35.tgz (too new): No such file or directory
....
Total files in the directory: 20 files
1, Edge_Profile_*.tgz: From July 06 to July 17: 12 files
2, Firefox_Profile_*.tgz: From July 01 to July 08: 8 files
The issues:
(1) I think the script is kind of working, but I'm not sure, as I've modified most of it without knowing what's going on.
(2) Output to logfile:
I would prefer the exact same output in the logfile that I get with echo, except just the filenames (not with full path), like:
rm Edge_Profile_Jul_11_2021_00-35.tgz
skipping Edge_Profile_Jul_12_2021_00-35.tgz (too new)
OS: Ubuntu MATE 21.04
Thanks a lot.
Solution 1:
Manipulating files based on their modification times is much easier in a shell that lets you access them directly. zsh is one such shell. Simply run sudo apt install zsh to install it. Since your files appear to be in one directory, this answer is non-recursive. Demonstrating the simpler case first:
- To keep the latest copy of Firefox_Profile_*.tgz and delete the rest of them no matter how old they are:
echo would rm -v -- Firefox_Profile_*.tgz(.om[2,-1])
Remove the echo would portion if it looks correct. This uses a glob (wildcard) qualifier inside the parentheses to do three things:
  - select only plain files (not directories, sockets, etc.) with .
  - order (sort) the files by their modification time, newest to oldest, with om
  - select a slice of the resulting list starting from the second element to the end -- skipping the first (newest) file, with [2,-1]
If there are no matching files, zsh will stop and complain with "zsh: no matches found", and will not execute the rm.
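Since the question asks for Bash, here is a rough sketch of the same "plain files, newest first, skip the first" idea without zsh. It assumes GNU find and GNU coreutils (as shipped with Ubuntu) for -printf and the -z options, and the function name prune_all_but_newest is made up for illustration. It sorts on %T@ (modification time) and prints %f (the bare file name), so the log lines carry no directory. It only echoes what it would remove.

```bash
#!/bin/bash
# Dry run: print an "rm NAME" line for every regular file in directory $1
# that matches glob $2, except the most recently modified one.
# prune_all_but_newest is a hypothetical helper name, not part of any tool.
prune_all_but_newest() {
    local dir=$1 pattern=$2 name
    find "$dir" -maxdepth 1 -type f -name "$pattern" -printf '%T@\t%f\0' |
        sort -z -rn |        # newest first, by modification time
        tail -z -n +2 |      # skip the first (newest) record
        cut -z -f2- |        # drop the timestamp column, keep the name
        while IFS= read -r -d '' name; do
            echo "rm $name"  # dry run; to delete, use: rm -v -- "$dir/$name"
        done
}

# Example call (directory taken from the question):
# prune_all_but_newest /home/admn/Downloads/Test 'Firefox_Profile_*.tgz'
```

If no file matches, find prints nothing and the loop simply does not run, so nothing is removed.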
- To keep the latest copy of Edge_Profile_*.tgz and delete the rest of them only if they are older than 5 days, first we grab the latest one:
newest=(Edge_Profile_*.tgz(.om[1]))
... and then we get the ones that are older than five days:
older=(Edge_Profile_*.tgz(.m+5))
The new part here is the +5 on the m qualifier. That selects files that are older than 5 days. After that, we make sure the newest one isn't in the list to remove:
remove=("${(@)older:|newest}")
The new part here is the array subtraction symbol :|; it is documented in the Parameter Expansion section of the zsh manual. It selects the elements of "older" that are not in "newest". Finally, we remove that list of files:
echo would rm -v -- "${remove[@]}"
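For a Bash-only variant of the Edge rule, the following is a minimal sketch, assuming GNU find and GNU coreutils (for -printf and the -z options). The function name prune_old_except_newest and its three arguments are made up for illustration. It sets the newest file aside first, then marks each remaining file Y or N with find's -mtime test, and prints the log lines in the format the question asks for (bare file names via %f). Note that -mtime +5 counts whole 24-hour periods, so it matches files at least 6 full days old.

```bash
#!/bin/bash
# Dry run: for regular files in directory $1 matching glob $2, leave the
# newest one alone; of the rest, print "rm NAME" if older than $3 days,
# otherwise print "skipping NAME (too new)".
# prune_old_except_newest is a hypothetical helper name.
prune_old_except_newest() {
    local dir=$1 pattern=$2 days=$3 flag name
    find "$dir" -maxdepth 1 -type f -name "$pattern" \
        -printf '%T@\t' \
        \( -mtime +"$days" -printf 'Y\t' -o -printf 'N\t' \) \
        -printf '%f\0' |
        sort -z -rn |      # newest first, by modification time
        tail -z -n +2 |    # the newest file is never a candidate
        while IFS=$'\t' read -r -d '' _ flag name; do
            case $flag in
                Y) echo "rm $name" ;;  # to delete: rm -v -- "$dir/$name"
                *) echo "skipping $name (too new)" ;;
            esac
        done
}

# Example call (directory taken from the question):
# prune_old_except_newest /home/admn/Downloads/Test 'Edge_Profile_*.tgz' 5
```

Each record carries the timestamp, the flag and the name, so a single sorted pass can both drop the newest file and decide about the others.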