LSF Introduction Guide (batch job system)

LSF is available on HYDRA !!
Interactive processes are limited to 5 CPU minutes.


What is LSF?

LSF - Load Sharing Facility - is a system to manage (large) programs that cannot be run interactively on a machine as they require too much CPU-time, memory or other system resources. For that reason, those large programs have to be run in batch (batch jobs).

LSF takes care of that batch management; based on the job specifications LSF will start execution of jobs when there are enough system resources available for the job to complete. Until that time, a job request will be queued.

How do I use LSF?

You need to make a small file which contains all the job specifications and the instructions to run your program (a so-called batch job file). It is similar to a shell script, except for the extra job specifications. See also the examples of several job files.

Once you have created such a job file, you have to submit it to the LSF system with the bsub command. LSF will take care of the job from there on.
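As a minimal sketch of these two steps, the following shell snippet writes a small job file and submits it. The file name myjob.lsf, the job name, the output file name and the program ./myprogram are hypothetical placeholders. The #BSUB lines are written with a space between #BSUB and the option, the form shown in the bsub man page. The submission is skipped when bsub is not installed, so the snippet can be tried on any machine.

```shell
# Write a small batch job file. All names below are placeholders.
cat > myjob.lsf <<'EOF'
#!/bin/sh
#BSUB -J myjob
#BSUB -o myjob.out
# Everything below is an ordinary shell script.
./myprogram
EOF

# Submit the job file; the "<" feeds the file to bsub on standard input.
if command -v bsub >/dev/null 2>&1; then
    bsub < myjob.lsf
else
    echo "bsub not found; myjob.lsf was written but not submitted"
fi
```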

LSF commands

These are the most important LSF user commands:

bsub
Submit a batch job to the LSF system
bkill
Kill a running job
bjobs
See the status of jobs in the LSF queue
bpeek
Access the output and error files of a job
bhist
History of one or more LSF jobs
bqueues
Information about LSF batch queues

To submit a job: bsub < jobfile. By default, the job output is sent by mail.
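When a job is accepted, bsub prints a line of the form "Job <1234> is submitted to queue <SMP1>."; the number between the first pair of angle brackets is the job ID that bjobs, bpeek, bhist and bkill accept as an argument. The sketch below extracts that ID. The helper name job_id_from and the sample line are made up for illustration.

```shell
# Extract the job ID from the line bsub prints on successful submission.
# Prints nothing when the line does not have the expected form.
job_id_from() {
    printf '%s\n' "$1" | sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Sample line, made up for illustration.
id=$(job_id_from 'Job <1234> is submitted to queue <SMP1>.')
echo "job ID: $id"

# With a real ID, the other commands take it as an argument, e.g.:
#   bjobs "$id"    # status of the job
#   bpeek "$id"    # output produced so far
#   bhist "$id"    # history of the job
#   bkill "$id"    # kill the job
```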

Defining the LSF job parameters

The job parameters define the properties of the job, such as its queue and its resource limits. LSF recognizes them because they are written in the job file in the following way:

#BSUB -option value

as in

#BSUB -c 100

which means a per-process time limit of 100 CPU minutes.

The options are the same as those that can be specified as arguments to the bsub command. Here are a few examples of frequently used options (see also the machine-specific limits):

#BSUB -q SMP1
Job queue; see the list of available queues
#BSUB -c 100
CPU time limit for each process in the batch job (specified in [hour:]minutes)
#BSUB -F 100
File size limit for each process within the batch job (in Kbytes)
#BSUB -M 64000
Memory size limit for the whole job (in Kbytes)
#BSUB -S 128000
Stack segment size limit for each process in the batch job (in Kbytes)
#BSUB -D 64000
Data segment size limit for each process within the batch job (in Kbytes)
#BSUB -o filename
Redirect stdout to the file filename
Add this option or the output of the job will be sent to you by email
#BSUB -e filename
Redirect stderr to the file filename
#BSUB -J jobname
Name of the job
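The [hour:]minutes notation of the -c option means that a time limit may be given either as plain minutes or as hours and minutes separated by a colon. A small sketch of that arithmetic follows; the helper name to_minutes is made up for illustration and is not part of LSF.

```shell
# Convert a value in [hour:]minutes notation into plain minutes.
# awk is used for the arithmetic so that "2:05" is read as 2 hours, 5 minutes.
to_minutes() {
    printf '%s\n' "$1" | awk -F: '{ if (NF == 2) print $1 * 60 + $2; else print $1 + 0 }'
}

to_minutes 100     # plain minutes
to_minutes 1:40    # 1 hour and 40 minutes, the same limit
to_minutes 2:05    # 2 hours and 5 minutes
```

Both 100 and 1:40 therefore describe the same limit of 100 CPU minutes.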

You can find explanations of these and other job parameters in the bsub man page. Note that if you do not specify one of the above limits in your batch job, the corresponding maximum limit of the queue to which the job is submitted will be used.
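To show how several of these parameters fit together, here is a sketch that generates a job file with explicit limits. Since the size limits are given in Kbytes, the snippet converts from megabytes with shell arithmetic. The file name limits.lsf, the job name, the requirement values and ./myprogram are hypothetical; SMP1 is the queue name used in the list above. The #BSUB lines use the form with a space shown in the bsub man page.

```shell
# Hypothetical requirements, chosen only for illustration.
queue=SMP1
cpu_minutes=100
mem_mb=64
stack_mb=128

# The size limits are specified in Kbytes.
mem_kb=$((mem_mb * 1024))
stack_kb=$((stack_mb * 1024))

# Unquoted EOF: the variables above are expanded into the job file.
cat > limits.lsf <<EOF
#!/bin/sh
#BSUB -q $queue
#BSUB -c $cpu_minutes
#BSUB -M $mem_kb
#BSUB -S $stack_kb
#BSUB -J limits_demo
#BSUB -o limits_demo.out
#BSUB -e limits_demo.err
./myprogram
EOF

cat limits.lsf
```

Any limit left out of the file falls back to the queue's own limit, so only the values that matter for the job need to be listed.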

LSF Limits at the Computing Centre

Following the job progress

With the command bjobs, you can examine the progress of all batch jobs. It is most frequently invoked as

bjobs -u all

With the command bpeek, you can inspect the output file of a specific batch job.
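In its default layout, bjobs prints one header line followed by one line per job, with the job ID in the first column and the state (for example PEND or RUN) in the third. The sketch below reduces such output to ID and state. The helper name job_states and the sample listing are made up for illustration; on a machine with LSF the helper would read the output of bjobs instead.

```shell
# Print "JOBID STAT" for every job in bjobs-style output read from stdin.
# The first line (the column header) is skipped.
job_states() {
    awk 'NR > 1 { print $1, $3 }'
}

# Made-up sample in the default bjobs column layout.
sample='JOBID   USER    STAT  QUEUE  FROM_HOST  EXEC_HOST  JOB_NAME  SUBMIT_TIME
1234    alice   RUN   SMP1   hydra      hydra      myjob     Oct 30 10:00
1235    alice   PEND  SMP1   hydra                 myjob2    Oct 30 10:05'

printf '%s\n' "$sample" | job_states

# On a machine with LSF:
#   bjobs -u all | job_states
```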

Examples of LSF jobs


Frequently asked questions

When I submit a job on HYDRA, it is always put in the S queue, even when I specify another queue in the job file.

Make sure not to forget the < sign when using the bsub command: bsub < jobfile
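A likely explanation, stated here as an assumption about the default behaviour of bsub: the #BSUB lines are only read when the job file arrives on standard input. With "bsub jobfile", the file is treated as the command to run, its #BSUB lines are plain shell comments, and the job ends up in the default queue. The sketch below wraps the submission in a hypothetical helper, submit_job, so that the < sign cannot be forgotten.

```shell
# Submit a job file the safe way: always via standard input.
# "submit_job" is a hypothetical helper name, not an LSF command.
submit_job() {
    if [ ! -r "$1" ]; then
        echo "submit_job: cannot read job file '$1'" >&2
        return 1
    fi
    bsub < "$1"
}

# Usage on a machine with LSF:
#   submit_job jobfile
```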


VUB/ULB Computing Centre,
Created on: 17 February 2000, Last update: 30 October 2006
Email: User Support Group.