

Chapter 14. Interoperation with NQS


The Network Queuing System (NQS) is a UNIX batch queuing facility that allows users to queue batch jobs to individual UNIX hosts from remote systems. Many users have been using NQS for years.

This chapter describes how LSF works with existing NQS systems. If you are not going to use LSF to interoperate with NQS, you do not need to read this chapter.

For sites that have been using NQS for years, LSF provides a set of NQS-compatible commands that let users submit jobs to LSF using the NQS command syntax. Examples of NQS compatibility commands in LSF include qsub, qstat, and qdel.

While it is desirable to run LSF on all hosts for transparent resource sharing, this is not always possible. Some of the computing resources may be under separate administrative control, or LSF may not currently be available for some of the hosts.

One example is a site that uses a Cray supercomputer. The supercomputer is often not under the control of the workstation system administrators, yet users on the workstation cluster still want to run jobs on it. LSF allows users to submit and control jobs on the Cray system using the same LSF interface they use for jobs on the local cluster.

LSF queues can be configured to forward jobs to remote NQS queues. Users can submit jobs, send signals to jobs, check status of jobs, and delete jobs that are forwarded to the remote NQS. Although running on an NQS server outside the LSF cluster, jobs are still managed by LSF Batch in almost the same way as jobs running inside the LSF cluster.

Choosing an LSF Batch Queue

To submit jobs to hosts where NQS is running, you first need to find out which LSF Batch queues are configured to forward jobs to NQS hosts. The bqueues -l command lists detailed information about all LSF Batch queues. Queues that have the `NQS DESTINATION QUEUES' parameter defined will forward jobs to remote NQS hosts. Below is an example of the output from the bqueues command that describes such a queue:

% bqueues -l cray
QUEUE: cray
-- For jobs to be sent to the Cray supercomputer.

PARAMETERS/STATISTICS
PRIO NICE STATUS       MAX JL/U JL/P NJOBS PEND RUN SSUSP USUSP RSV
30   15   Open:Active  5   -    -    1     0    1   0     0     0

SCHEDULING PARAMETERS
           r15s   r1m  r15m   ut    pg    io   ls   it    tmp    swp  mem
loadSched   -      -    -     -     -     -    -     -     -      -    -
loadStop    -      -    -     -     -     -    -     -     -      -    -

USERS:  all users
NQS DESTINATION QUEUES:nqs_queue@crayhost.company.com

Note that `nqs_queue' in the above output is the name of the NQS queue on the specified host.
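
The check described above can be scripted. The following is a minimal Python sketch that scans the text printed by bqueues -l and reports which queues forward jobs to NQS. The function name nqs_destinations is hypothetical, the parser assumes the output layout shown in the example above, and it assumes that multiple destinations, if present, are separated by whitespace.

```python
from typing import Dict, List

# Sample text modelled on the bqueues -l output shown above.
# The second queue ("normal") is a made-up queue with no NQS destination.
SAMPLE = """\
QUEUE: cray
-- For jobs to be sent to the Cray supercomputer.

USERS:  all users
NQS DESTINATION QUEUES:nqs_queue@crayhost.company.com

QUEUE: normal
-- Hypothetical queue that runs jobs inside the LSF cluster.

USERS:  all users
"""


def nqs_destinations(bqueues_output: str) -> Dict[str, List[str]]:
    """Map each LSF queue name to its NQS destination queues.

    Queues without an `NQS DESTINATION QUEUES' line are left out.
    """
    result: Dict[str, List[str]] = {}
    current_queue = None
    for raw_line in bqueues_output.splitlines():
        line = raw_line.strip()
        if line.startswith("QUEUE:"):
            # Start of a new queue description.
            current_queue = line.split(":", 1)[1].strip()
        elif line.startswith("NQS DESTINATION QUEUES:") and current_queue:
            # Assumption: destinations are whitespace-separated.
            result[current_queue] = line.split(":", 1)[1].split()
    return result
```

Calling nqs_destinations(SAMPLE) returns a dictionary with a single entry for the cray queue. In practice the input text would come from running bqueues -l and capturing its standard output.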

Submitting a Job from LSF to NQS

Submitting a job to run on an NQS host is the same as submitting an ordinary LSF job, except that only those options that reflect common functionality of both LSF and NQS can be used. This is because some NQS options do not make sense in the LSF context, and many LSF options are not supported by NQS. Options must be specified as LSF options; they are automatically translated when the job is forwarded to NQS. See the LSF bsub(1) manual page and the NQS qsub(1) manual page for more information on the options supported by LSF and NQS.
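
As an illustration, a submission to an NQS-forwarding queue can be assembled as an ordinary bsub command line. The sketch below is a hedged example, not a definitive recipe: the helper names build_bsub_command and submit are hypothetical, the queue name cray comes from the bqueues example earlier in this chapter, and myjob stands for whatever command you want to run. Only the bsub options -q, -o, and -e are used; whether any further option is supported on the NQS side must be checked against the manual pages.

```python
import subprocess
from typing import List, Optional, Sequence


def build_bsub_command(
    queue: str,
    command: Sequence[str],
    out_file: Optional[str] = None,
    err_file: Optional[str] = None,
) -> List[str]:
    """Build the argument list for submitting `command` to an LSF queue."""
    argv = ["bsub", "-q", queue]
    if out_file is not None:
        argv += ["-o", out_file]   # file that receives the job's stdout
    if err_file is not None:
        argv += ["-e", err_file]   # file that receives the job's stderr
    argv += list(command)
    return argv


def submit(queue: str, command: Sequence[str], **files: Optional[str]) -> None:
    """Run bsub; requires the LSF commands to be on PATH."""
    subprocess.run(build_bsub_command(queue, command, **files), check=True)
```

For example, build_bsub_command("cray", ["myjob"], out_file="myjob.out") produces the same argument list as typing bsub -q cray -o myjob.out myjob at the shell prompt.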

Controlling Jobs Running on NQS

Job information from NQS is translated by LSF and reported by LSF Batch commands. Any signals supported by both LSF and NQS may be sent to a specified job.
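
Because forwarded jobs are handled through the usual LSF Batch commands, checking and signalling them looks the same as for local jobs. The sketch below builds command lines for bjobs and bkill. It is an assumption-laden illustration: the helper names are hypothetical, the job ID 1234 used in the usage note is made up, and the -s option of bkill should be verified against the bkill(1) manual page for your LSF version. Which signals are accepted depends on what both LSF and NQS support.

```python
from typing import List, Optional


def build_status_command(job_id: int) -> List[str]:
    """Argument list for querying one job with bjobs."""
    return ["bjobs", str(job_id)]


def build_signal_command(job_id: int, signal_name: Optional[str] = None) -> List[str]:
    """Argument list for bkill.

    With no signal name, bkill deletes the job. With a signal name,
    the named signal is requested through -s (assumed to be available
    in the installed LSF version).
    """
    argv = ["bkill"]
    if signal_name is not None:
        argv += ["-s", signal_name]
    argv.append(str(job_id))
    return argv
```

For a hypothetical job 1234, build_status_command(1234) corresponds to typing bjobs 1234, and build_signal_command(1234) corresponds to bkill 1234. Either list can be passed to subprocess.run on a host where the LSF commands are installed.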

Forwarding of Output Files

The stdout and stderr output of the job is always transferred from the NQS host back to the LSF cluster. If neither the bsub -o option nor the -e option is specified, the output of the job is mailed to the user. If either the -o or the -e option is specified, the output received from the NQS server is stored in the specified files.



doc@platform.com

Copyright © 1994-1997 Platform Computing Corporation.
All rights reserved.