How to reserve complete nodes on Sun Grid Engine?

How do you use SGE to reserve complete nodes on a cluster?

I don't want 2 processors from one machine, 3 from another, and so on. I have a cluster of quad-core machines, and I want to reserve 4 complete machines, each with 4 slots. I cannot simply request 16 slots, because that does not guarantee 4 slots on each of 4 machines.

Changing the allocation rule to FILL_UP isn't enough: if no machines are completely idle, SGE will simply "fill up" the least loaded machines as much as possible rather than waiting for 4 idle machines before scheduling the task.

Is there any way I can do this? Is there a better place to ask this question?


SGE is weird with this, and I haven't found a good way to do it in the general case. One thing you can do, if you know the memory size of the nodes, is to qsub while requesting an amount of memory close to a node's full capacity. This effectively ensures the job grabs a machine with nothing else running on it.
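For example, assuming 32 GB nodes (a made-up figure) and the standard mem_free load value, something like this would only match hosts that currently have nearly all their memory free (myjob.sh is a placeholder):

# assumes 32 GB nodes; mem_free is a standard SGE load value,
# so this only schedules on hosts with at least 30G currently free
qsub -l mem_free=30G myjob.sh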


I think I found a way, though it probably doesn't work on old SGE versions like mine. It seems newer versions of SGE have exclusive scheduling built in.

https://web.archive.org/web/20101027190030/http://wikis.sun.com/display/gridengine62u3/Configuring+Exclusive+Scheduling

Another possibility I've considered, though it is quite error prone, is to use qlogin instead of qsub and manually reserve 4 slots on each desired quad-core machine. Needless to say, automating this is neither easy nor fun.
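A sketch of that manual approach, assuming an smp parallel environment exists, that your qlogin accepts -pe requests, and using the built-in hostname complex (shortcut h) to pin each session. node01 is a placeholder; you'd repeat this once per machine:

# hypothetical: hold 4 slots on a specific node interactively;
# repeat for node02, node03, node04
qlogin -pe smp 4 -l h=node01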

Lastly, maybe this is a situation where hostgroups can be used: for example, creating a hostgroup with 4 quad-core machines in it and then qsubbing to this specific subset of a queue, requesting a number of processors equal to the group's total. Unfortunately this is effectively hardcoding, with plenty of drawbacks, e.g. having to wait for people to vacate that particular hostgroup, and needing reconfiguration if you want to switch from 4 machines to 8.
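A rough sketch of what I mean, where @quads, the queue name all.q, the mpi PE, and myjob.sh are all placeholders for whatever your site uses:

# create a hostgroup containing the 4 quad-core machines (requires admin rights)
qconf -ahgrp @quads

# then submit against just that subset of the queue, asking for all 16 slots
qsub -q all.q@@quads -pe mpi 16 myjob.sh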


It seems there is a hidden command-line resource request for this:

-l excl=true

But you have to configure it into your SGE or Open Grid Scheduler first, by adding it to the list of complex values (qconf -mc) and enabling it on each individual host (qconf -me hostname).

See this link for details: http://web.archive.org/web/20130706011021/http://docs.oracle.com/cd/E24901_01/doc.62/e21978/management.htm#autoId61

In summary:

type:

qconf -mc

and add the line:

exclusive    excl      BOOL      EXCL   YES          YES          0        1000

then:

qconf -me <host_name>

and edit the complex_values line to read:

complex_values        exclusive=true

If the host already has complex_values entries in there, just comma-separate them.
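Once that is in place, a submission along these lines should get a whole node to itself. The mpi PE and myjob.sh are just illustrative; the -l excl=true request is what actually grants exclusivity:

# request exclusive use of whichever node the job lands on
qsub -l excl=true -pe mpi 4 myjob.sh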


I'm trying to do almost exactly the same thing and am looking for ideas. I think a pe_hostsfile is the best option, but I'm not an administrator of our SGE system and no host files are configured, so I need a quick workaround. I just checked out the Configuring Exclusive Scheduling link and see that it also requires manager rights...

I think a wrapper script could do it. I wrote a bash one-liner to figure out the number of available cores left on a machine (below). Our grid is heterogeneous: one node has 24 cores, some have 8, and the majority have only 4, which makes things a little awkward.

Here's that bash one-liner anyway.

# free cores on this host: NCPU (column 3) minus the load average rounded up (column 4)
n_processors=$(qhost | awk -v name="$(hostname)" '$1 == name { print int($3) - int($4 + 0.99) }')

The problem now is how to get this bash variable into an SGE startup script preprocessing directive. Maybe I'll just provide the arg below in my shell script, since the pvm environment ships with SGE. That doesn't mean it's configured, though...

#$ -pe pvm 4-24
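One way around the directive problem, since command-line options override embedded #$ directives anyway: compute the value in a wrapper and pass it straight to qsub. A sketch, where myjob.sh is a placeholder:

#!/bin/bash
# figure out how many cores are currently free on this host
n_processors=$(qhost | awk -v name="$(hostname)" '$1 == name { print int($3) - int($4 + 0.99) }')
# pass the slot count on the command line; this overrides any #$ -pe directive in the script
qsub -pe pvm "$n_processors" myjob.sh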

Sun's page on Managing Parallel Environments is pretty helpful, although again, the instructions are mostly aimed at administrators.


We set the allocation rule to the number of slots available on each node (in this case, 4). This means jobs can only request a multiple of 4 CPUs, but it achieves the desired result: 16 CPUs are allocated as 4 nodes with 4 CPUs each.
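A minimal sketch of such a PE definition, assuming the name mpi_4 (create with qconf -ap mpi_4 or edit with qconf -mp mpi_4; only the relevant fields are shown):

# hypothetical PE with a fixed allocation rule of 4 slots per host
pe_name            mpi_4
slots              999
allocation_rule    4
control_slaves     TRUE
job_is_first_task  FALSE

A request like qsub -pe mpi_4 16 myjob.sh then comes back as exactly 4 machines with 4 slots each.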