Program to Queue Files for Copy/Move/Delete in Linux?
I just wrote this simple script, which I called 'cpw', to solve this problem.
You use it just as you would use cp... the only difference is that on startup it builds an array of any cpw processes that are already running, and waits for them to finish before passing its arguments on to cp. In this way, it behaves like a self-organizing queue.
You can keep adding background cpw jobs, but they won't step on each other; they'll execute one at a time.
I'm sure others can suggest improvements.
#!/bin/bash
# Build an array of all cpw processes for this user that aren't this one.
cpwpids=($(ps -ef | grep "$USER" | grep 'cpw' | grep -v grep | grep -v $$ | awk '{ print $2 }'))
cpwcnt=${#cpwpids[@]} # number of elements in the above array
cnt=$cpwcnt           # counter to be decremented each pass
while [ "$cnt" -gt 0 ]
do
    cnt=$cpwcnt
    for i in "${cpwpids[@]}" # check whether each pid has died yet
    do
        if ! ps --pid "$i" >/dev/null
        then
            let "cnt -= 1"
        fi
    done
    sleep 2
done
cp -v "$@" >> /tmp/cpw.log # log what was copied
Usage example:
$ cpw -R /src/tree /dest/tree &
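Because the script appends cp's verbose output to /tmp/cpw.log, you can follow the queue's progress with, for example:
$ tail -f /tmp/cpw.log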
In my experience, running a few copies simultaneously on Linux doesn't really reduce overall throughput. My measurement of throughput is based on rsync's -P argument. My particular case is separately copying several folders full of large files off a USB hard drive at the same time.
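For instance (the source and destination paths here are just placeholders), you can start two transfers in parallel and compare the per-file rates that -P reports:
$ rsync -aP /mnt/usb/folder1/ /data/folder1/ &
$ rsync -aP /mnt/usb/folder2/ /data/folder2/ &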
So unless you're copying a lot of things at once, you should be fine.
Since the script given by Josh Arenberg might have some deadlocking issues (which I have not experienced so far, but also have not investigated), I wrote up something of my own. It should not have any deadlock problems, and it works for any shell command, not just cp.
Contents of ~/bin/q
#!/bin/bash

# Wait for any number of PIDs to finish.
anywait(){
    for pid in "$@"; do
        # kill -0 only checks whether the process still exists.
        while kill -0 "$pid" >/dev/null 2>&1; do
            sleep 0.5
        done
    done
}

PIDFILE=~/.q.pid

# Open PIDFILE on file descriptor 9 and acquire the lock.
exec 9>>"$PIDFILE"
flock -w2 9 || { echo "ERROR: flock() failed." >&2; exit 1; }

# Read the previous instance's PID from PIDFILE and write our own PID to it.
OLDPID=$(<"$PIDFILE")
echo $$ > "$PIDFILE"

# Release the lock.
flock -u 9

# Wait for OLDPID (unquoted on purpose: if it is empty, anywait gets no arguments).
anywait $OLDPID

# Do stuff.
"$@"

# Afterwards: clean up (if PIDFILE still contains our own PID, truncate it).
flock -w2 9 || { echo "ERROR: flock() failed." >&2; exit 1; }
if [ "$(<"$PIDFILE")" == "$$" ]; then
    truncate -s0 "$PIDFILE"
fi
flock -u 9
It creates a chain of processes, each waiting for the previous one. If a process in the middle of the chain crashes while waiting (unlikely but not impossible), the chain is broken and both parts run in parallel. The same happens if one of the processes is killed.
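You can watch the chain being built by peeking at the PID file while jobs are queued; it always holds the PID of the most recently started instance (the sleep commands below are just stand-ins for real work):
$ q sleep 30 &
$ q sleep 30 &
$ cat ~/.q.pid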
Usage like this:
q $COMMAND $ARGS
or even
q $COMMAND $ARGS; $ANOTHER_COMMAND $MORE_ARGS
Test e.g. by typing
q sleep 10 &
q echo blubb &
and finding that blubb is printed only after the 10 seconds have passed.
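Applied to the original question, queuing copies would look something like this (the paths are just placeholders, echoing the cpw example above):
$ q cp -Rv /src/tree /dest/tree &
$ q cp -Rv /src/tree2 /dest/tree2 &
The second cp only starts once the first one has finished.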