Are there abstraction layers for cluster resource schedulers?

I'm writing an application that could potentially run on any cluster resource scheduler (SGE, LSF, or SLURM, to name a few), using only very basic functionality.

Does a framework or abstraction layer exist for interacting with such tools in a product-agnostic manner?


Solution 1:

The DRMAA API is supported by all of the major resource schedulers, either directly or via an add-on library. The v1 API is supported by most products, but it is quite limited in scope: it basically handles only job submission and exposes only a common subset of functionality. The v2 API provides functions for job control and monitoring but, as far as I know, has not been widely adopted yet.
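To give an idea of what the v1 API looks like in practice, here is a minimal sketch using the drmaa-python bindings. It assumes the `drmaa` package is installed and that `DRMAA_LIBRARY_PATH` points at your scheduler's DRMAA library. The helper name `run_job` is my own, not part of the API. It submits one command and blocks until the job finishes.

```python
def run_job(session, command, args):
    """Submit one job through an initialized DRMAA v1 session and wait for it.

    `session` is expected to behave like drmaa.Session (assumption: the
    drmaa-python bindings); `run_job` itself is a hypothetical helper.
    """
    jt = session.createJobTemplate()
    try:
        jt.remoteCommand = command   # executable to run on the cluster
        jt.args = list(args)         # its command-line arguments
        job_id = session.runJob(jt)
    finally:
        # The template is only needed for submission, so release it here.
        session.deleteJobTemplate(jt)
    # Block until the job ends; JobInfo carries the exit status.
    info = session.wait(job_id, session.TIMEOUT_WAIT_FOREVER)
    return job_id, info.exitStatus


# Typical use on a machine with a DRMAA-enabled scheduler:
#
#     import drmaa
#     with drmaa.Session() as s:
#         job_id, status = run_job(s, "/bin/sleep", ["10"])
```

The same code should run unchanged against any scheduler that ships a DRMAA v1 library, which is the whole point, but anything scheduler-specific (queues, resource requests) has to go through the native specification string and is no longer portable.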

Solution 2:

No abstraction-layer software has been adopted into the mainstream of distributed computing, mostly because most clusters do not share users and resources with one another. There are some exceptions: for instance, some universities and academic institutions use Condor to take advantage of desktop machines spread throughout a campus, but it isn't particularly well suited to some types of jobs.

Delving a little deeper, though, schedulers can get quite involved if you're using something other than a PBS variant, and even among those schedulers some weird inconsistencies can arise when you try to take advantage of certain features.
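If you only need very basic submission, a thin wrapper of your own is often enough. The sketch below is an illustration, not an established library: the function name `build_submit_command` and the scheduler keys are made up, and it only covers giving a job a name and a command for the three schedulers mentioned in the question. Even at this level the differences show up (for example, `sbatch` expects a script, so the command has to be wrapped).

```python
import shlex


def build_submit_command(scheduler, job_name, command):
    """Return the argv list that would submit `command` (a list of strings).

    Hypothetical helper: only job name and command are handled.
    """
    if scheduler == "sge":
        # -b y makes qsub treat the command as a binary instead of a script.
        return ["qsub", "-N", job_name, "-b", "y"] + list(command)
    if scheduler == "lsf":
        return ["bsub", "-J", job_name] + list(command)
    if scheduler == "slurm":
        # sbatch wants a batch script; --wrap generates one around a command line.
        return ["sbatch", "--job-name=" + job_name,
                "--wrap=" + shlex.join(command)]
    raise ValueError("unsupported scheduler: " + scheduler)


# The resulting list can be handed to subprocess.run(...) on a submit host.
print(build_submit_command("slurm", "demo", ["echo", "hello"]))
```

Anything beyond this (resource requests, array jobs, dependencies) is where the inconsistencies mentioned above start to bite, so expect the per-scheduler branches to grow quickly.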