

Chapter 8. Running Interactive Jobs


Interactive jobs communicate with the user in real time. Programs like vi use a text-based terminal interface; Computer Aided Design and desktop publishing applications usually use a graphic user interface (GUI).

LSF can run interactive jobs on any host in the cluster transparently. All interaction through a terminal or GUI, including keyboard signals such as CTRL-C, works the same as if the job were running directly on the user's host.

LSF supports interactive jobs in two ways: through the basic LSF tools for running jobs on remote hosts, and through LSF Batch. This chapter contains detailed descriptions of both.

Note:
Interactive jobs are currently not supported on Windows NT.

Shared Files and User IDs

When LSF runs a task on a remote host, the task uses standard UNIX system calls to access files and devices. The user must have an account on the remote host. All operations on the remote host are done with the user's access permissions.

Tasks that read and write files access the files on the remote host. For load sharing to be transparent, your files should be available on all hosts in the cluster using a file sharing mechanism such as NFS or the Andrew File System (AFS). When your files are available on all hosts in the cluster, you can run your tasks on any host without worrying about how your task will access files.

LSF can operate correctly in cases where these conditions are not met, but the results may not be what you expect. For example, the /tmp directory is usually private on each host. If you copy a file into /tmp on a remote host, you can only read that file on the same remote host.

LSF can also be used when files are not available on all hosts. LSF provides the lsrcp(1) command to copy files across LSF hosts. You can use UNIX pipes to redirect the standard input and output of remote commands, or write scripts to copy the data files to the execution host.
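As a minimal sketch of the script approach, the function below copies an input file to the execution host with lsrcp and then runs the job there with lsrun -m. The function name run_staged, the RUN dry-run variable, and the choice of /tmp as the staging directory are assumptions made for this illustration. The rcp-style host:path form of the lsrcp target is also assumed; check lsrcp(1) on your system.

```shell
#!/bin/sh
# run_staged HOST INPUT COMMAND [ARGS...]
# Copy INPUT to /tmp on HOST, then run COMMAND there with the copied file
# as its last argument.
# RUN is a hypothetical dry-run switch: with RUN=echo the commands are
# printed instead of executed.
run_staged() {
    host=$1
    input=$2
    shift 2
    # Assumption: /tmp on the execution host is an acceptable staging area.
    remote=/tmp/$(basename "$input")
    ${RUN:-} lsrcp "$input" "$host:$remote" &&
    ${RUN:-} lsrun -m "$host" "$@" "$remote"
}
```

For example, run_staged hostD data.in myjob would copy data.in to hostD and run myjob /tmp/data.in there. Because /tmp is private to each host, the copy and the job must name the same host.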

Running Remote Jobs with lsrun

The lsrun command runs a job on a remote host. The default is to run the job on the host with the least CPU load (the lowest normalized CPU run queue length) and the most available memory. Command line arguments can be used to select other resource requirements or to specify the execution host. For example, to run myjob on the best available host, enter:

% lsrun myjob

LSF automatically selects a host of the same type as the local host, if one is available. By default the host with the lowest CPU and memory load is selected.

If you want to run myjob on a host with specific resources, you can specify the resource requirements using the -R resreq option to lsrun.

% lsrun -R "swap>=100 && cserver" myjob

This command runs myjob on a host that has resource cserver (see 'Getting Cluster Information') and has at least 100 megabytes of virtual memory available.

You can also configure LSF to store the resource requirements of specific jobs, as described in 'Configuring Resource Requirements'. If you configure LSF with the resource requirements of your job, you do not need to specify the -R resreq argument to lsrun on the command line. If you do specify resource requirements on the command line, they override the configured resource requirements.

If you want to run your job on a particular host, use the -m option to lsrun:

% lsrun -m hostD myjob

When you run an interactive job on a remote host, you can use signals as if it were running locally. If your shell supports job control, you can suspend and resume the job and bring the job to background or foreground as if it were a local job.

Some jobs, such as text editors, require special terminal handling. These jobs must be run using a pseudo-terminal so that the special terminal handling can be used over the network. The -P option to lsrun specifies that the job should be run using a pseudo-terminal:

% lsrun -P vi

To avoid typing in the lsrun command every time you want to execute a remote job, you can also use a shell alias or script to run your job.
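A wrapper of this kind might look like the following sketch, which wraps the lsrun -P vi example above in a shell function. The name rvi and the RUN dry-run variable are hypothetical and are used only for illustration.

```shell
#!/bin/sh
# rvi [FILE...]: run vi on the best available host using a pseudo-terminal.
# RUN is a hypothetical dry-run switch: with RUN=echo the command line is
# printed instead of executed.
rvi() {
    ${RUN:-} lsrun -P vi "$@"
}
```

With this function defined in your shell startup file, typing rvi notes.txt has the same effect as typing lsrun -P vi notes.txt.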

For a complete description of the command line arguments, see the lsrun(1) manual page.

Running Parallel Jobs with lsgrun

The lsgrun command allows you to run the same task on many hosts, either one after another or in parallel. For example, to merge the /tmp/out file on hosts hostA, hostD, and hostB into a single file named gout, enter:

% lsgrun -m "hostA hostD hostB" cat /tmp/out >> gout

To remove the /tmp/core file on all three hosts, enter:

% lsgrun -m "hostA hostD hostB" -p rm -r /tmp/core

The -p option tells lsgrun that the task specified should be run in parallel. If the -p argument is not given, tasks are run on each host one after another. See lsgrun(1) for more details.

The lsgrun -f host_file option reads the host_file file to get the list of hosts on which to run the task. For example, the lsgrun.wrap shell script shown in Figure 13 uses the sed command to get a list of all the hosts in the lsf.cluster.test_cluster file and then uses lsgrun to run the command given as arguments to the script on every host in the cluster.

Figure 13. lsgrun.wrap Example Shell Script

#! /bin/sh

tempfile=/tmp/lsgrun.wrap.$$ 
conffile=/usr/local/lsf/conf/lsf.cluster.test_cluster

# Note that the [ ] in the command below should contain a space and a tab 
sed -e '1,/^HOSTNAME/d;/^End.*Host/,$d;s/[ ].*//' < $conffile > $tempfile
lsgrun -f $tempfile "$@"
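To see what the sed command in the script extracts, the sketch below applies the same three editing steps to a cluster file and prints one host name per line. The function name list_hosts is hypothetical, and the POSIX class [[:space:]] is used in place of the literal space and tab so the expression survives copying.

```shell
#!/bin/sh
# list_hosts FILE: print the host names listed in the Host section of FILE.
list_hosts() {
    # 1. Delete everything up to and including the HOSTNAME header line.
    # 2. Delete everything from the "End Host" line to the end of the file.
    # 3. Keep only the first whitespace-delimited field of each remaining line.
    sed -e '1,/^HOSTNAME/d' \
        -e '/^End.*Host/,$d' \
        -e 's/[[:space:]].*//' "$1"
}
```

Given a Host section whose rows begin with hostA, hostD, and hostB, list_hosts prints those three names in order, which is the format lsgrun -f expects in its host file.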

Load Sharing Interactive Sessions

There are different ways to use LSF to start an interactive session on the best available host.

Load Sharing Login

The simplest way to log in to the least loaded host is to use the lslogin command. lslogin automatically chooses the best host and does an rlogin to that host. With no argument, lslogin picks a host that is lightly loaded in CPU, has few login sessions, and is binary compatible with the current host.

If you want to log into a host with specific resources, use the lslogin -R resreq option.

% lslogin -R "sunos order[ls:cpu]"

This command opens a remote login to a host that has the sunos resource, few other users logged in, and a low CPU load. This is equivalent to using lsplace to find the best host and then using rlogin to log in to that host:

% rlogin `lsplace -R "sunos order[ls:cpu]"`

Load Sharing X Sessions

If you are using the X Window System, you can start an xterm that opens a shell session on the best host by entering:

% lsrun sh -c "xterm &"

In this example, no processes are left running on the local host. The lsrun command exits as soon as the xterm starts, and the xterm on the remote host connects directly to the X server on the local host.

If you are using a PC as a desktop machine and are running an X Window server on your PC, then you can start an X session on the least loaded machine. The following steps assume you are using eXceed from Hummingbird Communications:

Note:
The '&' in this command line is important as it frees resources on the server hostA once the xterm is running.

Now, by double clicking on the 'Best' icon you will get an xterm started on the least loaded host in the cluster and displayed on your screen.

An alternative way to start an X session on the best host is to submit it as an LSF Batch job:

% bsub xterm 

This starts an xterm on the least loaded host in the cluster and displays it on your screen.

When you run X applications using lsrun or bsub, the environment variable DISPLAY is handled properly for you. It behaves as if you were running the X application on the local machine.

Job Starter

Some jobs have to be started under particular shells or require certain setup steps to be performed before the actual job is executed. This is often handled by writing wrapper scripts around the job. The LSF Job Starter allows you to specify an executable, which will perform the actual execution of the job, doing any necessary setup beforehand.

A Job Starter can be specified for interactive remote execution. If the environment variable LSF_JOB_STARTER is defined, the RES will invoke the Job Starter using /bin/sh with your commands as arguments as shown:

/bin/sh -c "$LSF_JOB_STARTER command [argument ...]"

where 'command [argument...]' are the command line arguments you specified in lsrun, lsgrun, or ch.

If you define the LSF_JOB_STARTER environment variable as:

% setenv LSF_JOB_STARTER "/bin/csh -c"

Then you run a simple C-shell job:

% lsrun "a.out; echo hi"

The following will be invoked to correctly start the job:

lsrun /bin/sh -c "/bin/csh -c a.out; echo hi"

A Job Starter can also be defined at the queue level (see 'Using A Job Starter' of the LSF Administrator's Guide) using the JOB_STARTER parameter. It functions in a similar manner. This feature is primarily used to customize LSF for particular environments (for example, to support Atria ClearCase).
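As an illustration of the wrapper-script style of Job Starter, the sketch below performs some setup and then runs the command it is given. It relies on the invocation form shown earlier, in which the command and its arguments follow the Job Starter on the command line. The function name job_starter, the PROJECT_ROOT variable, and its value are placeholders invented for this example. It is written as a function so it can be tried locally; in a standalone script the last line would typically be exec "$@".

```shell
#!/bin/sh
# job_starter COMMAND [ARGS...]
# Hypothetical Job Starter: do the setup the job needs, then run the job.
job_starter() {
    # Placeholder setup step: export a variable the job expects.
    PROJECT_ROOT=/proj/demo
    export PROJECT_ROOT
    # Run the actual job with its original arguments.
    "$@"
}
```

Calling job_starter sh -c 'echo "$PROJECT_ROOT"' shows that the job sees the environment prepared by the starter.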

Interactive Batch Job Support

When you run interactive jobs using lsrun, lsgrun, etc., these utilities use LIM's simple placement advice for host selection. It is sometimes desirable from a system management point of view to control all workload through a single centralized scheduler, LSF Batch.

Since all interactive jobs submitted to LSF Batch are subject to LSF Batch policies, your site has better control over its workload. For example, your system administrator may dedicate two servers as interactive servers and disable interactive access to all other servers by defining an interactive queue that only uses the two interactive servers.

Running an interactive job through LSF Batch also allows you to take advantage of the batch scheduling policy and host selection features for resource intensive jobs.

To submit an interactive job, you should first find out which queues accept interactive jobs by running the bqueues -l command. If the output of this command contains:

SCHEDULING POLICIES:  NO_INTERACTIVE

then this is a batch only queue. If the output contains:

SCHEDULING POLICIES:  ONLY_INTERACTIVE

then this is an interactive only queue. If none of the above is defined or "SCHEDULING POLICIES" is not in the output of the bqueues -l command, then both interactive and batch jobs are accepted by this queue.
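The check described above can be sketched as a small filter that reads bqueues -l output on standard input and reports which kinds of jobs the queue accepts. The function name queue_interactive_policy and the three result words are chosen for this example only. The sketch assumes the input describes a single queue.

```shell
#!/bin/sh
# queue_interactive_policy: read "bqueues -l QUEUE" output on stdin and print
# batch-only, interactive-only, or both.
queue_interactive_policy() {
    # Keep only the SCHEDULING POLICIES line, if there is one.
    policies=$(grep 'SCHEDULING POLICIES:' || true)
    case $policies in
        *NO_INTERACTIVE*)   echo batch-only ;;
        *ONLY_INTERACTIVE*) echo interactive-only ;;
        *)                  echo both ;;
    esac
}
```

A typical use would be bqueues -l interactive | queue_interactive_policy, where interactive is the queue name.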

You can use LSF Batch to submit an interactive job with the bsub command. Your job can be submitted so that all input and output are through the terminal that you used to type the command.

An interactive batch job is submitted by specifying the -I option of the bsub command. When an interactive job is submitted, a message is displayed while the job is awaiting scheduling. The bsub command blocks until the job completes, and by default no mail is sent to the user. You can press CTRL-C at any time to terminate the job. For example:

% bsub -I -q interactive -n 4,10 lsmake
<<Waiting for dispatch ...>>

would start lsmake on 4 to 10 processors and display the output on the terminal.

It is possible to use the -I option together with the -i, -o, and -e options (see 'Input and Output') to selectively redirect the streams to files. For example:

% bsub -I -q interactive -e job.err

would save the standard error stream in the 'job.err' file, while standard input and output would come from the terminal.

For jobs requiring pseudo-terminal support, bsub supports -Ip and -Is options. See the bsub(1) man page for more details.

Shell Mode for Remote Execution

Shell mode support is provided for running interactive applications through the RES or through LSF Batch. Shell mode support is required for running interactive shells or applications that redefine CTRL-C and CTRL-Z keys (for example, jove). The -S option to lsrun, ch or lsgrun creates the remote task with shell mode support. The default is not to enable shell mode support. The -Is option to bsub provides the same feature for interactive batch jobs.



doc@platform.com

Copyright © 1994-1997 Platform Computing Corporation.
All rights reserved.