Program to Queue Files for Copy/Move/Delete in Linux?
I just wrote this simple script, which I called 'cpw', to solve this problem.
You use it just as you would use cp... the only difference is that on startup it builds an array of any cpw processes that are already running, and waits for them to finish before passing its arguments on to cp. In this way, it behaves like a self-organizing queue.
You can keep adding background cpw jobs, but they won't step on each other; they'll execute one at a time.
I'm sure others can suggest improvements.
#!/bin/bash
# Build an array of all cpw processes for this user that aren't this one.
cpwpids=($(ps -ef | grep "$USER" | grep 'cpw' | grep -v grep | grep -v $$ | awk '{ print $2 }'))
cpwcnt=${#cpwpids[@]} # number of elements in the above array
cnt=$cpwcnt           # counter to be decremented each pass
while [ "$cnt" -gt 0 ]
do
    cnt=$cpwcnt
    for i in "${cpwpids[@]}" # check whether each pid has died yet
    do
        if ! ps --pid "$i" >/dev/null
        then
            let "cnt -= 1"
        fi
    done
    sleep 2
done
cp -v "$@" >> /tmp/cpw.log # log what was copied
Usage example:
$ cpw -R /src/tree /dest/tree &
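Because the script appends cp's verbose output to /tmp/cpw.log, you can follow the queue's progress with, for example:
$ tail -f /tmp/cpw.log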
In my experience, running a few copies simultaneously on Linux doesn't really reduce overall throughput. My measurement of throughput is based on rsync's -P argument. My particular case is separately copying several folders full of large files off a USB hard drive at the same time.
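For instance (the source and destination paths here are just placeholders), you can start two transfers in parallel and compare the per-file rates that -P reports:
$ rsync -aP /mnt/usb/folder1/ /data/folder1/ &
$ rsync -aP /mnt/usb/folder2/ /data/folder2/ &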
So unless you're copying a lot of things at once, you should be fine.
Since the script given by Josh Arenberg might have some deadlocking issues (which I have not experienced so far, but also have not investigated), I wrote up something of my own. It should not have any deadlock problems, and it works for any shell command, not just cp.
Contents of ~/bin/q
#!/bin/bash

# Wait for any number of PIDs to finish.
anywait(){
    for pid in "$@"; do
        # kill -0 only checks whether the process still exists.
        while kill -0 "$pid" >/dev/null 2>&1; do
            sleep 0.5
        done
    done
}

PIDFILE=~/.q.pid

# Open PIDFILE on file descriptor 9 and acquire the lock.
exec 9>>"$PIDFILE"
flock -w2 9 || { echo "ERROR: flock() failed." >&2; exit 1; }

# Read the previous instance's PID from PIDFILE and write our own PID to it.
OLDPID=$(<"$PIDFILE")
echo $$ > "$PIDFILE"

# Release the lock.
flock -u 9

# Wait for OLDPID (unquoted on purpose: if it is empty, anywait gets no arguments).
anywait $OLDPID

# Do stuff.
"$@"

# Afterwards: clean up (if PIDFILE still contains our own PID, truncate it).
flock -w2 9 || { echo "ERROR: flock() failed." >&2; exit 1; }
if [ "$(<"$PIDFILE")" == "$$" ]; then
    truncate -s0 "$PIDFILE"
fi
flock -u 9
It creates a chain of processes, each waiting for the previous one. If a process in the middle of the chain crashes while waiting (unlikely but not impossible), the chain is broken and both parts run in parallel. The same happens if one of the processes is killed.
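You can watch the chain being built by peeking at the PID file while jobs are queued; it always holds the PID of the most recently started instance (the sleep commands below are just stand-ins for real work):
$ q sleep 30 &
$ q sleep 30 &
$ cat ~/.q.pid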
Usage like this:
q $COMMAND $ARGS
or even
q $COMMAND $ARGS; $ANOTHER_COMMAND $MORE_ARGS
Test e.g. by typing
q sleep 10 &
q echo blubb &
and finding that blubb is printed only after the 10 seconds have passed.
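Applied to the original question, queuing copies would look something like this (the paths are just placeholders, echoing the cpw example above):
$ q cp -Rv /src/tree /dest/tree &
$ q cp -Rv /src/tree2 /dest/tree2 &
The second cp only starts once the first one has finished.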