17.5 Managing PBS


This section is intended for the PBS administrator: it discusses several important aspects of managing PBS on a day-to-day basis.

During the installation of PBS Pro, the file '/etc/pbs.conf' was created. This configuration file controls which daemons are to be running on the local system. Each node in a cluster should have its own '/etc/pbs.conf' file.
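As a rough illustration of how this file drives daemon selection, the sketch below reads a pbs.conf-style file and prints the daemons it enables. The PBS_START_SERVER, PBS_START_SCHED, and PBS_START_MOM keys are an assumption based on typical PBS Pro installations and are not named in this section, so check them against your own file. The function name pbs_enabled_daemons is hypothetical.

```shell
# Print the daemons that a pbs.conf-style file enables, one per line.
# Assumption: the file holds KEY=VALUE lines, with PBS_START_* keys set to 1
# for each daemon that should run on this host.
pbs_enabled_daemons() {
    conf="${1:-/etc/pbs.conf}"
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    while IFS='=' read -r key value; do
        case "$key=$value" in
            PBS_START_SERVER=1) echo pbs_server ;;
            PBS_START_SCHED=1)  echo pbs_sched ;;
            PBS_START_MOM=1)    echo pbs_mom ;;
        esac
    done < "$conf"
}
```

Calling pbs_enabled_daemons with no argument inspects '/etc/pbs.conf'; passing a path lets you check a copy of the file before installing it on a node.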

17.5.1 Starting PBS Daemons

The daemon processes (pbs_server, pbs_sched, and pbs_mom) must run with the real and effective uid of root. Typically, the daemons are started automatically by the system upon reboot. The boot-time start/stop script for PBS is '/etc/init.d/pbs'. This script reads the '/etc/pbs.conf' file to determine which daemons should be started.

The startup script can also be run by hand to get status on the PBS daemons, and to start/stop all the PBS daemons on a given host. The command line syntax for the startup script is

           /etc/init.d/pbs [ status | stop | start ] 

Alternatively, you can start the individual PBS daemons manually, as discussed in the following sections. Furthermore, you may wish to change the options specified to various daemons, as discussed below.
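When the startup script is called from other administrative scripts, a thin wrapper can reject anything other than the three documented actions. The following is a minimal sketch; the function name pbs_ctl and the PBS_INIT_SCRIPT override variable are hypothetical and exist only to make the example easy to try out.

```shell
# Run the PBS startup script with a validated action.
# PBS_INIT_SCRIPT is a hypothetical override; by default the script
# named in the text, /etc/init.d/pbs, is used.
pbs_ctl() {
    script="${PBS_INIT_SCRIPT:-/etc/init.d/pbs}"
    case "$1" in
        status|stop|start)
            "$script" "$1"
            ;;
        *)
            echo "usage: pbs_ctl status|stop|start" >&2
            return 2
            ;;
    esac
}
```

For example, 'pbs_ctl status' reports on the daemons of the local host, while an unsupported action such as 'pbs_ctl restart' fails without touching the daemons.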

17.5.2 Monitoring PBS

The node monitoring GUI for PBS is xpbsmon, which graphically displays information about execution hosts in a PBS environment. It views a PBS environment as a list of sites, where each site runs one or more Servers and each Server runs jobs on one or more execution hosts (nodes).

The system administrator needs to define the site's information in a global X resources file, 'PBS_LIB/xpbsmon/xpbsmonrc', which the GUI reads if a personal '.xpbsmonrc' file is missing. A default 'xpbsmonrc' file is created during installation. Under the *sitesInfo resource, it defines a default site name, the list of Servers that run on the site, the set of nodes (or execution hosts) where jobs on a particular Server run, and the list of queries that are sent to each node's pbs_mom. If node queries have been specified, the host where 'xpbsmon' is running must have been given explicit permission by the pbs_mom daemon to post queries to it; this is done by including a $restricted entry in the MOM's config file.
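Adding the $restricted entry can be scripted. The sketch below appends one entry to a MOM config file only if it is not already there. It assumes the entry has the form '$restricted' followed by a host name; the function name allow_mom_queries and the host name in the usage note are hypothetical, and the path of the MOM config file is left to the caller because this section does not give it.

```shell
# Append a "$restricted HOST" entry to a MOM config file, unless the
# same line is already present.
# Arguments: HOST, then the path of the MOM config file.
allow_mom_queries() {
    host="$1"
    config="$2"
    entry="\$restricted $host"
    # -x matches whole lines, -F treats the entry as a fixed string
    if grep -qxF -e "$entry" "$config" 2>/dev/null; then
        return 0   # already present; nothing to do
    fi
    printf '%s\n' "$entry" >> "$config"
}
```

A call such as 'allow_mom_queries monitor.example.com CONFIG_FILE' is safe to repeat, since the second run leaves the file unchanged. pbs_mom generally has to re-read its config file before a new entry takes effect.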

17.5.3 Tracking PBS Jobs

Periodically you (or the user) will want to track the status of a job, or to view all the log file entries for a given job. Several tools allow you to track a job's progress, as Table 17.7 shows. While a job is running, use the 'qstat' command to track its status; after the job has completed, use 'tracejob'.

Table 17.7: Job-tracking commands.

Command     Explanation

qstat       Shows status of jobs, queues, and servers
xpbs        Can alert user when one or more jobs complete
tracejob    Collates and sorts PBS log entries for specified job
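The division of labor between 'qstat' and 'tracejob' can be wrapped in a small helper. This sketch assumes that 'qstat' exits with a nonzero status when the Server no longer knows the job; the function name track_job and the job ID in the usage note are hypothetical.

```shell
# Show a job's status with qstat while the Server still knows about it;
# otherwise fall back to tracejob, which collates the PBS log entries.
# Assumption: qstat returns a nonzero exit status for an unknown job.
track_job() {
    jobid="$1"
    [ -n "$jobid" ] || { echo "usage: track_job JOBID" >&2; return 2; }
    if qstat "$jobid" 2>/dev/null; then
        return 0
    fi
    tracejob "$jobid"
}
```

For example, 'track_job 1234.server' prints the qstat line for a queued or running job and the collected log entries for one that has finished.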

17.5.4 PBS Accounting Logs

The PBS Server daemon maintains an accounting log. The log name defaults to '/usr/spool/PBS/server_priv/accounting/yyyymmdd' where yyyymmdd is the date. The file will be closed and a new one opened every day on the first event (write to the file) after midnight.
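Because the file name is simply the date, scripts can compute the path of any day's accounting log. A minimal sketch, using the default directory given above; the function name accounting_log_path is hypothetical.

```shell
# Print the accounting log path for a given date (default: today).
# $1: date as yyyymmdd; $2: accounting directory, if yours differs
# from the default named in the text.
accounting_log_path() {
    day="${1:-$(date +%Y%m%d)}"
    dir="${2:-/usr/spool/PBS/server_priv/accounting}"
    printf '%s/%s\n' "$dir" "$day"
}
```

With no arguments the function names today's file, which is convenient in commands such as 'tail -f "$(accounting_log_path)"'.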

The accounting log files may be placed elsewhere by specifying the -A option on the pbs_server command line. The option argument is the full (absolute) path name of the file to be used. If a null string is given, for example

         # pbs_server -A "" 

then the accounting log will not be opened, and no accounting records will be recorded.

The accounting file is changed according to the same rules as the log files. With either the default file or a file named with the -A option, the Server will close the accounting log and reopen it upon receipt of a SIGHUP signal. This strategy allows you to rename the old log and start recording anew on an empty file. For example, if the current date is December 1, the Server will be writing in the file '20011201'. The following actions will cause the current accounting file to be renamed 'dec1' and the Server to close the file and start writing a new '20011201'.

         # mv 20011201 dec1
         # kill -HUP (pbs_server's PID)
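The same two steps can be combined so that the signal is sent only if the rename succeeded. This is a sketch rather than a standard PBS tool: the function name rotate_accounting_log is hypothetical, and the caller is expected to supply the PID of pbs_server.

```shell
# Rename the current accounting file, then send SIGHUP so the Server
# closes the old file and reopens a fresh one.
# Arguments: current file, new name, PID of pbs_server.
rotate_accounting_log() {
    current="$1"
    renamed="$2"
    pid="$3"
    [ -f "$current" ] || { echo "no such file: $current" >&2; return 1; }
    # The signal is sent only when the rename has succeeded
    mv -- "$current" "$renamed" && kill -HUP "$pid"
}
```

On a system where pgrep is available, the example from the text becomes 'rotate_accounting_log 20011201 dec1 "$(pgrep -x pbs_server)"', run from the accounting directory.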

17.5.5 PBS Accounting Report

The PBS administrator can use the 'pbs-report' command to generate a wide range of system, user, and job usage reports, including statistical analyses of jobs and cluster monitoring reports. The program extracts data from the PBS accounting logs described in the previous section and performs any calculations needed to produce the requested report. The PBS Administrator Guide includes detailed examples of the reports this command can produce.




Beowulf Cluster Computing With Linux 2003
Year: 2005
Pages: 198
