How to set up SGE for CUDA devices?

Solution 1:

The strategy is actually fairly simple.

Using qconf -mc you can create a complex resource called gpu (or whatever you wish to name it). The resource definition should look something like:

#name               shortcut   type        relop   requestable consumable default  urgency     
#----------------------------------------------------------------------------------------------
gpu                 gpu        INT         <=      YES         YES        0        0

Then you should edit your exec host definitions with qconf -me to set the number of GPUs on exec hosts that have them:

hostname              node001
load_scaling          NONE
complex_values        gpu=2
user_lists            NONE
xuser_lists           NONE
projects              NONE
xprojects             NONE
usage_scaling         NONE
report_variables      NONE
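
If you have many GPU hosts, editing each one interactively gets tedious. The following is a minimal sketch of a non-interactive alternative using qconf -aattr. The host names and GPU counts are hypothetical, and it assumes the gpu complex from above exists and is not yet set on those hosts. By default it only prints the commands; set APPLY=1 to actually run them.

```shell
#!/bin/sh
# Hypothetical host:gpu-count pairs; replace with your own nodes.
HOSTS="node001:2 node002:4"

# Print (or, with APPLY=1, run) one qconf command per host:gpu-count pair.
set_gpu_counts() {
    for entry in $1; do
        host=${entry%%:*}     # text before the first colon
        count=${entry##*:}    # text after the last colon
        if [ "${APPLY:-0}" = 1 ]; then
            qconf -aattr exechost complex_values "gpu=$count" "$host"
        else
            echo "qconf -aattr exechost complex_values gpu=$count $host"
        fi
    done
}

set_gpu_counts "$HOSTS"
```

Running it without APPLY=1 first lets you review the generated commands before touching the cluster configuration.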

Now that you've set up your exec hosts, you can request GPU resources when submitting jobs, e.g. qsub -l gpu=1, and Grid Engine will keep track of how many GPUs are available.
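
The request can also be embedded in the job script itself instead of being passed on the command line. A minimal sketch, assuming the gpu complex defined above; the job name and the CUDA binary are placeholders, not anything from a real setup:

```shell
#!/bin/sh
# Embedded qsub options (lines starting with "#$"):
# job name (hypothetical), run from the submission directory,
# and request one unit of the "gpu" consumable.
#$ -N gpu_job
#$ -cwd
#$ -l gpu=1

# Print a one-line summary. JOB_ID is set by Grid Engine at run time;
# it falls back to "none" when the script is run outside the scheduler.
describe_job() {
    printf 'job=%s host=%s\n' "${JOB_ID:-none}" "$(uname -n)"
}

describe_job
# ./my_cuda_app    # hypothetical CUDA binary; replace with your own program
```

Submit it with plain qsub followed by the script name; the -l gpu=1 request is picked up from the script.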

If more than one GPU job may run on a node at the same time, you may want to put your GPUs into exclusive mode so that jobs cannot share a device. You can do this with the nvidia-smi utility.
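
As a rough sketch of the exclusive-mode step: the helper below builds the nvidia-smi call for one GPU index. The function name and the DRY_RUN switch are made up for illustration; the real command needs root and an installed NVIDIA driver, so the sketch only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Put one GPU (by index) into EXCLUSIVE_PROCESS compute mode, which allows
# only a single process to use the device at a time.
# With DRY_RUN unset or 1, the command is printed instead of executed.
set_exclusive() {
    gpu_index=$1
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "nvidia-smi -i $gpu_index -c EXCLUSIVE_PROCESS"
    else
        nvidia-smi -i "$gpu_index" -c EXCLUSIVE_PROCESS
    fi
}

# Example for a node with two GPUs (indices 0 and 1).
for i in 0 1; do
    set_exclusive "$i"
done
```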

Solution 2:

Open Grid Engine added GPU load sensor support in the 2011.11 release, without the need for nvidia-smi. The output of the nvidia-smi application may (and does) change between driver releases, so approaches that parse its output are not recommended.

If you have the GE2011.11 source tree, look for: dist/gpu/gpu_sensor.c

To compile the load sensor (the CUDA toolkit must be installed on the system):

% cc gpu_sensor.c -lnvidia-ml

If you just want to see the status reported by the load sensor interactively, compile with -DSTANDALONE:

% cc -DSTANDALONE gpu_sensor.c -lnvidia-ml

To use the load sensor in a Grid Engine cluster, you will just need to follow the standard load sensor setup procedure:

http://gridscheduler.sourceforge.net/howto/loadsensor.html
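
For orientation, here is a minimal shell sketch of the load sensor protocol that the howto describes: the sensor waits for a line on stdin, answers with a begin/end report, and exits on "quit". This is not the gpu_sensor.c from the source tree. The resource name gpu_count and the device-node probe are assumptions for illustration, and the complex would have to be defined with qconf -mc first.

```shell
#!/bin/sh
# Hypothetical probe: count /dev/nvidia<N> device nodes.
# Replace with a real measurement (gpu_sensor.c uses NVML instead).
probe_gpu_count() {
    count=0
    for dev in /dev/nvidia[0-9]*; do
        [ -e "$dev" ] && count=$((count + 1))
    done
    echo "$count"
}

# Load sensor loop: block on stdin; for each line received, emit one
# report framed by "begin" and "end"; stop when "quit" is received.
sensor_loop() {
    host=$(uname -n)   # assumed to match the host name Grid Engine uses
    while read -r line; do
        [ "$line" = "quit" ] && return 0
        echo "begin"
        echo "$host:gpu_count:$(probe_gpu_count)"
        echo "end"
    done
}

# Only start the loop when explicitly asked, so the file can be sourced.
if [ "${RUN_SENSOR:-0}" = 1 ]; then
    sensor_loop
fi
```

To try it by hand, run the script with RUN_SENSOR=1, press Enter to get a report, and type quit to exit.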

Sources:

  1. http://marc.info/?l=npaci-rocks-discussion&m=132872224919575&w=2

Solution 3:

If you have multiple GPUs per host and want jobs to request a GPU while the Grid Engine scheduler selects a free one, you can configure an RSMAP (resource map) complex instead of an INT. This lets you specify the number of GPUs as well as their names on a specific host in the host configuration. You can also set it up as a HOST consumable, so that the number of GPU devices requested with -l gpu=2 is 2 per host, independent of the slots you request (even if the parallel job was granted, e.g., 8 slots on different hosts).

qconf -mc
    #name               shortcut   type        relop   requestable consumable default  urgency
    #----------------------------------------------------------------------------------------------
    gpu                 gpu        RSMAP       <=      YES         HOST       0        0

In the execution host configuration you can initialize your resources with ids/names (here simply GPU1 and GPU2).

qconf -me yourhost
hostname              yourhost
load_scaling          NONE
complex_values        gpu=2(GPU1 GPU2)

Then, when a job requests -l gpu=1, the Univa Grid Engine scheduler will select GPU2 if GPU1 is already in use by a different job. You can see the actual selection in the qstat -j output. The job learns which GPU was selected by reading the $SGE_HGR_gpu environment variable, which in this case contains the chosen id/name "GPU2". This can be used to access the right GPU without collisions.
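
Inside the job script, the granted id can be turned into something CUDA understands. A minimal sketch, under two assumptions that are not from the answer above: the ids are named GPU<n> with GPU1 corresponding to CUDA device 0, and multiple granted ids arrive separated by spaces. The fallback to GPU1 is only there so the snippet runs outside the scheduler.

```shell
#!/bin/sh
# Convert granted RSMAP ids such as "GPU1 GPU2" into a comma-separated
# list of zero-based CUDA device indices ("0,1").
# Assumption: ids are named GPU<n>, and GPU1 maps to CUDA device 0.
ids_to_cuda_devices() {
    result=""
    for id in $1; do
        n=${id#GPU}          # strip the "GPU" prefix, leaving the number
        idx=$((n - 1))       # shift to zero-based indexing
        result="${result:+$result,}$idx"
    done
    echo "$result"
}

# SGE_HGR_gpu is set by the scheduler for the job.
CUDA_VISIBLE_DEVICES=$(ids_to_cuda_devices "${SGE_HGR_gpu:-GPU1}")
export CUDA_VISIBLE_DEVICES
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
```

If you name the ids in complex_values after the device indices directly, the conversion step becomes unnecessary.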

If you have a multi-socket host you can even attach a GPU directly to some CPU cores near the GPU (near the PCIe bus) in order to speed up communication between GPU and CPUs. This is possible by attaching a topology mask in the execution host configuration.

qconf -me yourhost
hostname              yourhost
load_scaling          NONE
complex_values        gpu=2(GPU1:SCCCCScccc GPU2:SccccSCCCC)

Now when the UGE scheduler selects GPU2, it automatically binds the job to all 4 cores (C) of the second socket (S), so the job is not allowed to run on the first socket. This does not even require the -binding qsub parameter.
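
To make the mask notation concrete, here is a small sketch that lists which cores a mask selects, reading it as described above: S starts a socket, an uppercase C is a core the job is bound to, and a lowercase c is a core it is kept off. The function name is made up, and the core numbers are simply positions within the mask, which may not match the operating system's core ids.

```shell
#!/bin/sh
# Print the zero-based positions of the cores marked "C" in a topology
# mask such as "SccccSCCCC". "S" is skipped, "c" counts as a core that
# is not selected.
allowed_cores() {
    rest=$1
    core=0
    out=""
    while [ -n "$rest" ]; do
        ch=${rest%"${rest#?}"}    # first character of the remaining mask
        rest=${rest#?}            # drop that character
        case $ch in
            C) out="${out:+$out }$core"; core=$((core + 1)) ;;
            c) core=$((core + 1)) ;;
        esac
    done
    echo "$out"
}

allowed_cores "SCCCCScccc"   # mask attached to GPU1 above
allowed_cores "SccccSCCCC"   # mask attached to GPU2 above
```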

You can find more configuration examples at www.gridengine.eu.

Note that all these features are only available in Univa Grid Engine (8.1.0/8.1.3 and higher), and not in SGE 6.2u5 or other Grid Engine versions (such as OGE, Son of Grid Engine, etc.). You can try it out by downloading the free version, limited to 48 cores, from univa.com.