Running commands in parallel with a limit on the number of simultaneous commands

GNU Parallel is made for this.

seq 1 1000 | parallel -j20 do_something

It can even run jobs on remote computers. Here's an example of re-encoding MP3s to OGG using server2 and the local computer, running one job per CPU core:

parallel --trc {.}.ogg -j+0 -S server2,: \
     'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg' ::: *.mp3

Watch an intro video to GNU Parallel here:

http://www.youtube.com/watch?v=OpaiGYxkSuQ


Not a bash solution, but you can use a Makefile, possibly with -l to avoid exceeding some maximum load average.

NJOBS = 1000

jobs = $(shell seq 1 $(NJOBS))

.PHONY: all $(jobs)

all: $(jobs)

$(jobs):
	do_something $@

Then to start 20 jobs at a time do

$ make -j20

or to start as many jobs as possible without exceeding a load of 5

$ make -j -l5

One simple idea:

Check whether i is a multiple of 20 and run the wait shell builtin before the next do_something, so each batch of background jobs finishes before you start more.
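A minimal bash sketch of that idea (do_something here is a placeholder for your real command):

```shell
#!/bin/bash
# Placeholder for the real command; replace with whatever each task should do.
do_something() { :; }

for i in $(seq 1 1000); do
    do_something "$i" &          # launch each task in the background
    if (( i % 20 == 0 )); then
        wait                     # every 20th job, wait for the whole batch
    fi
done
wait                             # wait for the final partial batch
```

The drawback is that each batch runs only as fast as its slowest job, so you will often have fewer than 20 jobs actually running; GNU Parallel and make -j keep the pipeline full by starting a new job as soon as any one finishes.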


Posting the script from the question with formatting:

#!/bin/bash

NUM=$1; shift

if [ -z "$NUM" ]; then
    echo "Usage: parallel <number_of_tasks> command"
    echo "    Sets environment variable i from 1 to number_of_tasks"
    echo "    Defaults to 20 processes at a time, use like \"MAKEOPTS='-j5' parallel ...\" to override."
    echo "Example: parallel 100 'echo \$i; sleep \`echo \$RANDOM/6553 | bc -l\`'"
    exit 1
fi

export CMD="$*"

: "${MAKEOPTS:=-j20}"

cat << EOF | make -f - -s $MAKEOPTS
jobs=\$(shell seq 1 $NUM)
.PHONY: all \${jobs}

all: \${jobs}

\${jobs}:
        i=\$@ sh -c "\$\$CMD"
EOF

Note that the recipe line (the one starting with "i=") must be indented with a tab character, not the 8 spaces shown here — make requires recipe lines to begin with a tab.