Submitting Jobs to the Grid

If you have used the job submission commands before, note that we have recently migrated to a new job submission system, the WMS (Workload Management System). It is broadly similar to the old one, but the commands start with glite-wms-job-* rather than edg-job-*. Other than the command names, the most noticeable difference is that you must now delegate (send) a copy of your proxy to the WMS. The examples below use the -a option to do this automatically. For real-world use, however, it is better to delegate a proxy once using the command


glite-wms-job-delegate-proxy -d <delegation-id>

where the delegation-id is a user-chosen identifier for the proxy. This can then be passed to other WMS commands with the -d option.
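As a rough sketch of the delegate-once workflow, the script below delegates a proxy and then reuses the same identifier with -d in place of -a. The delegation id and the run() dry-run wrapper are assumptions made for illustration (they are not part of the WMS tools), and hello.jdl is the example JDL file created further down this page. By default the script only prints the commands; set DRY_RUN=0 to execute them on a machine with the WMS client installed.

```shell
#!/bin/bash
# Hypothetical, user-chosen delegation id (any label you like).
delegation_id="${USER:-griduser}-hello"

# Dry-run wrapper (an assumption of this sketch, not a WMS tool):
# prints each command unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Delegate the proxy once...
run glite-wms-job-delegate-proxy -d "$delegation_id"

# ...then pass the same id to later commands with -d instead of -a.
run glite-wms-job-list-match -d "$delegation_id" hello.jdl
run glite-wms-job-submit -d "$delegation_id" -o /tmp/hello.jid hello.jdl
```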

The best reference for information on submitting jobs to the Grid is the Workload Management section of the gLite User Guide. However, below is a simple example to get you started. You can also find some more information in this introduction or in a set of slides used in a seminar in Manchester in 2008.

For each job you need a "JDL file" (JDL is the Job Description Language). To submit the simplest "hello world" job, create a file called hello.jdl with this content:


Executable = "/bin/echo";
Arguments = "Hello World";
StdOutput = "hw.out";
StdError = "hw.err";
OutputSandbox = {"hw.out", "hw.err"};

As you might expect, this just echoes the string "Hello World" to stdout. The stdout and stderr streams are redirected to files called hw.out and hw.err. At the end of the job those files are returned in the "output sandbox", a list of files which can be retrieved once the job is finished.
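To see what the job should leave in hw.out and hw.err, you can reproduce the same redirection locally. This is only an approximation of what happens on the worker node; the scratch directory is an assumption of the sketch.

```shell
#!/bin/bash
# Scratch directory so the sketch does not clutter the current one.
workdir="$(mktemp -d)"

# Local equivalent of Executable + Arguments with StdOutput/StdError:
# stdout goes to hw.out, stderr goes to hw.err.
/bin/echo "Hello World" > "$workdir/hw.out" 2> "$workdir/hw.err"

# hw.out should hold the greeting; hw.err should be empty.
cat "$workdir/hw.out"
```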

If you haven't already done it, create a Grid proxy from your certificate with


voms-proxy-init --voms atlas

(replacing "atlas" with the name of your VO).

Check that everything is working with


glite-wms-job-list-match -a hello.jdl

which should produce a list of sites to which the job may be sent. Then submit the job:

glite-wms-job-submit -a -o /tmp/hello.jid hello.jdl

where /tmp/hello.jid is a file in which the job ID is stored; if you submit further jobs using the same file, their IDs will be appended. You can check on the status of the job with

glite-wms-job-status -i /tmp/hello.jid

Because of various overheads in the system, even a short job like this will take a few minutes, and potentially much longer if there are no free resources and the job has to be queued. If all goes well, the status should eventually say "Done (Success)". At that point you can retrieve the job output (the stdout and stderr files in this case) with

glite-wms-job-output -i /tmp/hello.jid

The output is stored in a directory which is configured locally, often in /tmp, or you can supply your own with the --dir option. In any case the above command will print the location. You can then examine the files to check that the output is correct.
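The wait-then-retrieve steps above could be scripted along the following lines. This is a sketch, not a tested production tool: the helper names are hypothetical, it simply looks for the string "Done (Success)" in the status output, and it assumes the status command runs without prompting (for example, because the ID file holds a single job). It gives up after a fixed number of attempts so that a failed job does not cause an endless loop.

```shell
#!/bin/bash
# Succeeds (exit 0) if the status text on stdin contains "Done (Success)".
job_succeeded() {
    grep -q 'Done (Success)'
}

# Poll a status command until it reports success or attempts run out.
# $1 = maximum attempts, $2 = seconds between attempts, rest = command.
wait_for_success() {
    local attempts="$1" delay="$2"
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@" 2>/dev/null | job_succeeded; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}

# Usage on a machine with the WMS client installed (not run here):
#   wait_for_success 60 60 glite-wms-job-status -i /tmp/hello.jid &&
#       glite-wms-job-output -i /tmp/hello.jid
```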

The job above just executes a pre-installed command (/bin/echo), but in most cases you will want to ship a script and possibly other files with the job. These are sent in the so-called input sandbox. As a simple example, create a file called hello2.txt containing the string "Hello World". Then create a file hello2.sh:


#!/bin/bash

# Print the contents of the file named by the first argument
cat "$1"

Finally, create a new JDL file hello2.jdl:

Executable = "hello2.sh";
Arguments = "hello2.txt";
InputSandbox = {"hello2.sh", "hello2.txt"};
StdOutput = "hw.out";
StdError = "hw.err";
OutputSandbox = {"hw.out", "hw.err"};

and submit it as described above. If all goes well, this should send the files hello2.txt and hello2.sh with the job, make hello2.sh executable, and run the command "hello2.sh hello2.txt", which will again print the string "Hello World" to stdout.
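Because a Grid job takes at least a few minutes to come back, it can be worth checking the script locally before submitting. The sketch below recreates the two input sandbox files in a scratch directory (an assumption of the example) and runs the script through bash, so it does not depend on the x bit being set locally.

```shell
#!/bin/bash
# Scratch directory standing in for the job's working directory.
workdir="$(mktemp -d)"

# The data file shipped in the input sandbox.
printf 'Hello World\n' > "$workdir/hello2.txt"

# The script shipped in the input sandbox.
cat > "$workdir/hello2.sh" <<'EOF'
#!/bin/bash

cat "$1"
EOF

# Run it from the directory holding both files, as the job would.
(cd "$workdir" && bash hello2.sh hello2.txt)
```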

One thing to note is that permissions on the sandbox files are not preserved. A file named in the Executable field in the JDL will have the x bit set, but any other files will have it cleared.
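If your main script calls a second script shipped in the sandbox, that second script will arrive without the x bit. Two common workarounds are sketched below, using a hypothetical helper.sh: invoke it through its interpreter, or restore the bit from within the main script before calling it.

```shell
#!/bin/bash
# Simulate a helper script arriving from the sandbox with the x bit cleared.
# helper.sh is a hypothetical name used only for this illustration.
workdir="$(mktemp -d)"
printf '#!/bin/bash\necho "helper ran"\n' > "$workdir/helper.sh"
chmod a-x "$workdir/helper.sh"

# Option 1: run it through the interpreter, which ignores the x bit.
helper_output="$(bash "$workdir/helper.sh")"
echo "$helper_output"

# Option 2: restore the x bit, after which it can be executed directly.
chmod +x "$workdir/helper.sh"
```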

You should also be aware that sandboxes are intended for small files, up to a few kb; resource brokers will generally limit the maximum size. Larger files should be accessed via the data management system.
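A quick pre-submission check on the total sandbox size might look like the sketch below. The function names and the LIMIT_BYTES threshold are hypothetical; the real limit depends on the broker you are using, so treat the default here as a placeholder.

```shell
#!/bin/bash
# Print the combined size in bytes of the files given as arguments.
sandbox_bytes() {
    local total=0 size f
    for f in "$@"; do
        size=$(wc -c < "$f") || return 1
        total=$((total + size))
    done
    echo "$total"
}

# Hypothetical threshold; replace it with the limit that applies to you.
LIMIT_BYTES="${LIMIT_BYTES:-4096}"

# Report the sandbox size and fail if it exceeds LIMIT_BYTES.
check_sandbox() {
    local total
    total="$(sandbox_bytes "$@")" || return 1
    if [ "$total" -gt "$LIMIT_BYTES" ]; then
        echo "sandbox is $total bytes, over the $LIMIT_BYTES byte limit" >&2
        return 1
    fi
    echo "sandbox is $total bytes"
}

# Usage: check_sandbox hello2.sh hello2.txt
```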

The jobs in the examples above take a very short time to run. For real jobs, however, you need to take into account that batch queues have time limits. You can handle this by adding a Requirements expression to the JDL, which specifies constraints on the site and queue used to run the job. Such expressions can be quite complex, and you should consult the User Guide for full details. As a simple example, you can ask for a queue whose CPU time limit is more than an hour with a JDL line like:


Requirements = other.GlueCEPolicyMaxCPUTime > 60;

Note that the time limits are in minutes, although the Estimated/WorstResponseTime values are in seconds.
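Since the limit is expressed in minutes, it is easy to get the units wrong when thinking in hours. The small helper below (a hypothetical function, shown only to illustrate the conversion) prints the Requirements line for a given number of hours.

```shell
#!/bin/bash
# Print a JDL Requirements line asking for a CPU time limit above the
# given number of whole hours. GlueCEPolicyMaxCPUTime is in minutes.
cpu_requirement() {
    local hours="$1"
    printf 'Requirements = other.GlueCEPolicyMaxCPUTime > %d;\n' "$((hours * 60))"
}

# One hour, matching the example in the text.
cpu_requirement 1
```

The output can be pasted into a JDL file or appended to one with >>.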


Last modified Fri 18 November 2011