Run several jobs in parallel, efficiently
OS: CentOS
I have some 30,000 jobs (or scripts) to run. Each job takes 3-5 minutes. I have 48 CPUs (nproc = 48). I can use 40 CPUs to run 40 jobs in parallel. Please suggest a script or tool that can handle all 30,000 jobs by running 40 of them in parallel at a time.
What I had done:
I created 40 different folders and ran the jobs in parallel by creating a shell script for each directory.
I want to know better ways to handle this kind of job next time.
Solution 1:
As Mark Setchell says: GNU Parallel.
find scripts/ -type f | parallel
If you insist on keeping 8 CPUs free:
find scripts/ -type f | parallel -j-8
But usually it is more efficient simply to use nice, as that will give you all 48 cores when no one else needs them:
find scripts/ -type f | nice -n 15 parallel
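For a batch of 30,000 jobs it is also worth keeping a log so that an interrupted run can be restarted. GNU Parallel's --joblog and --resume options do this; the jobs.log filename below is just an example:

```shell
# Record each job's runtime and exit status in jobs.log; if the run is
# killed, re-running the same command skips the jobs already completed
find scripts/ -type f | parallel -j40 --joblog jobs.log --resume
```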
To learn more:
- Watch the intro video for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
- Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.
Solution 2:
I have used Redis to do this sort of thing - it is very simple to install and the CLI is easy to use.
I mainly used LPUSH to push all the jobs onto a "queue" in Redis and BRPOP to do a blocking remove of a job from the queue. So you would LPUSH 30,000 jobs (or script names or parameters) at the start, then start 40 processes in the background (1 per CPU), and each process would sit in a loop doing BRPOP to get a job, run it, and do the next.
You can add layers of sophistication to log completed jobs in another "queue".
Here is a little demonstration of what to do...
First, start a Redis server on any machine in your network:
./redis-server &    # start Redis server in background
Or, you could add this to your system startup if you use it all the time.
Now push 3 jobs onto a queue called jobs:
./redis-cli    # start Redis command line interface
redis 127.0.0.1:6379> lpush jobs "job1"
(integer) 1
redis 127.0.0.1:6379> lpush jobs "job2"
(integer) 2
redis 127.0.0.1:6379> lpush jobs "job3"
(integer) 3
See how many jobs there are in the queue:
redis 127.0.0.1:6379> llen jobs
(integer) 3
Wait with an infinite timeout for a job:
redis 127.0.0.1:6379> brpop jobs 0
1) "jobs"
2) "job1"
redis 127.0.0.1:6379> brpop jobs 0
1) "jobs"
2) "job2"
redis 127.0.0.1:6379> brpop jobs 0
1) "jobs"
2) "job3"
This last one will wait a LONG time, as there are no jobs in the queue:
redis 127.0.0.1:6379> brpop jobs 0
Of course, this is readily scriptable:
Put 30,000 jobs in queue:
for ((i=0;i<30000;i++)) ; do
    echo "lpush jobs job$i" | redis-cli
done
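With 30,000 jobs, starting one redis-cli per LPUSH is slow because each one opens a new connection. redis-cli's pipe mode (--pipe) streams all the commands over a single connection - a sketch, assuming the server is on localhost:

```shell
# Generate all 30,000 LPUSH commands and send them in one connection
for ((i=0;i<30000;i++)) ; do
    echo "lpush jobs job$i"
done | redis-cli --pipe
```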
If your Redis server is on a remote host, just use:
redis-cli -h <HOSTNAME>
Here's how to check progress:
echo "llen jobs" | redis-cli
(integer) 30000
Or, more simply maybe:
redis-cli llen jobs
(integer) 30000
And you could start 40 jobs like this:
#!/bin/bash
for ((i=0;i<40;i++)) ; do
    ./Keep1ProcessorBusy $i &
done
And then Keep1ProcessorBusy
would be something like this:
#!/bin/bash
# Endless loop picking up jobs and processing them
while :
do
    # BRPOP prints the queue name and then the job; keep only the job
    job=$(redis-cli brpop jobs 0 | tail -1)
    # Set processor affinity here too if you want to force it, using the $1 parameter we were called with
    $job
done
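If you do want the processor affinity mentioned in that comment, taskset (from util-linux) can pin a command to a given CPU. Inside the loop you would run the job as below, with the worker's $1 parameter as the CPU number (hard-coded to 0 here so the snippet stands alone):

```shell
# Pin a job to one CPU; in the worker, cpu would be the $1 it was started with
cpu=0
job="echo hello from CPU $cpu"
taskset -c "$cpu" bash -c "$job"
```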
Of course, the actual script or job you want run could also be stored in Redis.
As a totally different option, you could look at GNU Parallel. And also remember that you can run the output of find through xargs with the -P option to parallelise stuff.
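A minimal sketch of that xargs approach, using the same scripts/ directory as in the other answer (-n 1 hands each worker one script, -P 40 keeps 40 running):

```shell
# Run the scripts 40 at a time; -print0/-0 keep odd filenames safe
find scripts/ -type f -print0 | xargs -0 -n 1 -P 40 bash
```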