Manifold User's Guide
Kenneth L. Smith <ksmith@gravity.psu.edu>
Revision: 1.4
6 November 2002
Manifold is a 40-node cluster, built on the Intel®
Xeon architecture, to be used for Numerical Relativity
simulations. This document is designed to help a new user of Manifold
understand the layout of the machine and get up and running with
minimal fuss.
See Pablo Laguna <pablo@astro.psu.edu> or
Bernd Brügmann <bruegman@gravity.psu.edu> to obtain an account on Manifold.
All access into Manifold from the outside will be via ssh. All other
access is disabled. Internally, the server is known as ``head'' and all
nodes are named sequentially, ``node001'', ``node002'', etc. Access between
the master and slave nodes is provided by rsh, although ssh is
also available.
Those with accounts on any of the NR workstations may also access
the contents of the home and bulk directories across NFS (see below).
This section will detail the hardware of the machine (just in case anyone
is interested).
Your home directory will be /home/manifold/user, where user is your
username. This directory is NFS shared to all nodes and to the NR
workstations. It is intended to be backed up, although as of 30 Oct 2002 it
is not. Please use your home directory only for files and directories you
consider essential, i.e. those which you would mind losing in the event of a
catastrophe.
Simulations should NEVER be run from within your home directory.
In your home directory you will find a symbolic link ``bulk''. This is a link
to /bulk/manifold/user. This filesystem corresponds to our
terabyte (1 TB) RAID array, and is also NFS shared to all nodes and NR
workstations. This partition will never be backed up, but the redundancy
inherent to RAID Level 5 means that data stored here should survive the
failure of a single disk.
All simulations should be started somewhere under your bulk directory.
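As an illustration of the policy above, the following shell sketch creates a
fresh run directory under the bulk area before a simulation is started. The
function name make_run_dir, the ``runs'' subdirectory, and the run name
bbh-test are hypothetical choices made for this example, not conventions of
Manifold; only the ``bulk'' link in your home directory comes from this guide.

```shell
# Sketch only: make_run_dir and the "runs" layout are hypothetical,
# not part of the Manifold setup.
# Usage: make_run_dir BASE NAME
#   BASE  directory under which runs are kept (e.g. "$HOME/bulk")
#   NAME  name of this run
make_run_dir() {
    run_dir="$1/runs/$2"
    mkdir -p "$run_dir" || return 1
    # Print the path so the caller can cd into it.
    printf '%s\n' "$run_dir"
}

# Example (commented out so that nothing is created by accident):
#   cd "$(make_run_dir "$HOME/bulk" bbh-test)"
```

Keeping each run in its own directory under bulk leaves the home directory
free of simulation output.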
All NR-related applications and packages are installed in /usr/nrlocal. Here
you will find the Lahey/Fujitsu Fortran compiler, the Intel compilers
for Linux, Maple, OpenOffice 1.0, and so on.
Cluster-specific applications, packages, and libraries are found in
/usr/beowulf. Most important here is the MPICH
(http://www-unix.mcs.anl.gov/mpi/mpich/) set of executables and libraries
necessary to run parallel code on Manifold. The use of MPICH to compile and
run MPI code will be explained further below.
MPI is a widely-accepted library standard for message-passing. Two of the most
common implementations of MPI are MPICH
(http://www-unix.mcs.anl.gov/mpi/mpich) and LAM
(http://www.lam-mpi.org). Because of its support for Myrinet, Manifold
uses MPICH.
In /usr/beowulf, you will find a subdirectory for each combination of MPICH
variant and C++/Fortran compiler available. The convention is that the version
of MPICH compiled for a regular Ethernet interface is named
mpich.compiler [1] and the version compiled for the
Myrinet interface is named mpich-gm.compiler [1].
Most users will want to use the Myrinet version; the Ethernet version is
there for debugging.
The available options for compiler are:
- gcc
    GNU Compiler Collection v2.96 (gcc, g++, g77)
- lahey
    Lahey/Fujitsu Fortran Express v6.0 F77/F90/F95 compiler (lf95) with gcc/g++ for C/C++
- intel
    Intel Fortran and C/C++ compiler v6.0 (ifc, icc)
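The directory naming convention can be written down as a small shell helper.
This is only a sketch: mpich_dir is a hypothetical function, not a script
installed on Manifold, and the interface keywords ``ethernet'' and ``myrinet''
are labels chosen for this example.

```shell
# Sketch only: mpich_dir is a hypothetical helper, not an installed script.
# Usage: mpich_dir INTERFACE COMPILER
#   INTERFACE  "ethernet" or "myrinet"
#   COMPILER   "gcc", "lahey", or "intel"
mpich_dir() {
    case "$1" in
        ethernet) variant=mpich ;;
        myrinet)  variant=mpich-gm ;;
        *) echo "unknown interface: $1" >&2; return 1 ;;
    esac
    case "$2" in
        gcc|lahey|intel) ;;
        *) echo "unknown compiler: $2" >&2; return 1 ;;
    esac
    printf '/usr/beowulf/%s.%s\n' "$variant" "$2"
}
```

For instance, mpich_dir myrinet intel prints /usr/beowulf/mpich-gm.intel.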
Ex. Suppose you wish to compile with the MPICH libraries
for Myrinet, using the Intel compiler. Then you may have something like the
following, but this is purely schematic [2]:
MPICH_DIR = /usr/beowulf/mpich-gm.intel
MPICH_LIB_DIR = ${MPICH_DIR}/lib
MPICH_INC_DIR = ${MPICH_DIR}/include
GM_LIB_DIR = /usr/gm/lib
LIB_DIR = ${LIB_DIR} ${MPICH_LIB_DIR} ${GM_LIB_DIR}
LIBS = ${LIBS} -lmpich -lgm
For the Cactus users in the crowd, there are a few ``config'' files available.
These can be found on the web at
http://www.astro.psu.edu/nr/computers/manifold/share/cactus-cfg or on
Manifold in /usr/beowulf/share/cactus-cfg [3].
You can run code in parallel by using a variant of the ``mpirun'' command.
As part of the login scripts, /usr/beowulf/bin is added to your path. In
/usr/beowulf/bin, you'll find several shell scripts useful for performing
common tasks on the cluster such as issuing a command on all nodes, or
seeing which nodes are up. You'll also find there executables named
empirun.compiler and mmpirun.compiler. The
`e' or `m' indicates the Ethernet or Myrinet interface, respectively. As
before, compiler stands for your choice of compiler. You should use the same
interface and compiler choice to run your code as you used to compile it.
Ex. Continuing the earlier example, suppose you compiled your code
with the MPICH libraries for Myrinet, using the Intel compiler. To run your
code on X processes, use [4]:
{/usr/beowulf/bin/}mmpirun.intel -np X mycode
Options for the mpirun command may be found at
http://www.astro.psu.edu/nr/computers/manifold/man/mpich-www/www1/mpirun.html
or by running any (e|m)mpirun.compiler with the -h
argument.
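To make the rule for choosing a launcher concrete, here is a minimal shell
sketch that assembles the launch command from the interface and compiler used
at compile time. The helper mpirun_cmd and its interface keywords are
hypothetical; it only prints the command, so it can be tried without access to
the cluster.

```shell
# Sketch only: mpirun_cmd is a hypothetical helper. It prints the launch
# command rather than running it.
# Usage: mpirun_cmd INTERFACE COMPILER NPROCS PROGRAM
mpirun_cmd() {
    case "$1" in
        ethernet) prefix=e ;;   # empirun.compiler
        myrinet)  prefix=m ;;   # mmpirun.compiler
        *) echo "unknown interface: $1" >&2; return 1 ;;
    esac
    printf '%smpirun.%s -np %s %s\n' "$prefix" "$2" "$3" "$4"
}
```

For code built against mpich-gm.intel, mpirun_cmd myrinet intel 8 ./mycode
prints the matching command, mmpirun.intel -np 8 ./mycode.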
The full online documentation provided by MPICH is available for reference on
the Numerical Relativity website at
http://www.astro.psu.edu/nr/computers/manifold/man/mpich-www.
The Portable Batch System (http://www.openpbs.org) is a ``flexible batch
queuing and workload management system''. It will control the finer aspects
of negotiating the users' requests for resources and attempt to ensure that
the resources are used to the maximum extent possible. While compilation and
interactive jobs may be run on certain designated nodes (currently just ``head''),
the remaining nodes will only be available via the queuing system.
At the moment, there is only a single execution queue, ``default''. This
will change fairly soon, once a queue policy has been decided. For now,
when submitting your jobs, you may either specify the queue
explicitly as ``default'' or omit the queue request.
The most common way a user will submit a job is by using a batch script. A
PBS batch script is just a regular shell script with PBS commands embedded
as comments.
See the example file
http://www.astro.psu.edu/nr/computers/manifold/share/example.pbs
(which is also available on Manifold in /usr/beowulf/share/). The
options which one can specify in a PBS script are the same as those one
can specify to the submission program, qsub, on the command line.
Please refer to `man qsub` or the online version provided at
http://www.astro.psu.edu/nr/computers/manifold/man/pbs/qsub.1.html
for further information and options.
As a quick reference, here are a few of the more commonly used options:
- -l nodes=x:ppn=y
    Specify the number of nodes x and the number of processors per node y
    that you want for your job.
- -l walltime=hh:mm:ss
    Specify that you expect your job to last for hh:mm:ss. It's best that you
    make this estimate as realistic as possible for efficient scheduling.
- -q queuename
    Specify to which queue you wish to submit your job.
- -j oe
    Specify that you wish to (j)oin stdout and stderr into one file.
- -M user@domain.name
    Specify to what address email notifications will be sent.
- -m be
    Specify that the user above will be sent an email at the (b)eginning and
    (e)nd of the job.
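The options listed above might be combined in a batch script along the
following lines. This is a sketch, not a copy of example.pbs: the file name
myjob.pbs, the resource numbers, and the program name ./mycode are
placeholders, and ppn=2 assumes two processors per node, which this guide does
not state. The block writes the script to a file so that it can be inspected
before submission.

```shell
# Sketch only: the resource numbers, the file name myjob.pbs, and the
# program name ./mycode are placeholders. ppn=2 assumes two processors
# per node, which this guide does not state.
cat > myjob.pbs <<'EOF'
#!/bin/sh
#PBS -l nodes=4:ppn=2
#PBS -l walltime=02:00:00
#PBS -q default
#PBS -j oe
#PBS -M user@domain.name
#PBS -m be

# PBS sets PBS_O_WORKDIR to the directory from which qsub was invoked.
cd "$PBS_O_WORKDIR" || exit 1

# 4 nodes x 2 processors per node = 8 processes.
mmpirun.intel -np 8 ./mycode
EOF
```

Whether the launcher also needs to be told which nodes PBS assigned (for
example through the file named in $PBS_NODEFILE) depends on how the wrappers
are set up locally; the example file in /usr/beowulf/share/ should be treated
as the authoritative template.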
Once you have tailored a PBS batch script to your specific application, you
may submit it to the queue via the command:
qsub batch-script
You will immediately receive feedback to stdout with the name of your job:
jobid.manifold.astro.psu.edu
where jobid is a unique integer identifier attached to your
job for the duration of its execution. Unless you specified the `-k'
option in your batch script, the output from stdout and stderr will not be
available until your job completes.
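Since the job name has the form jobid.manifold.astro.psu.edu, the integer
identifier can be recovered with ordinary shell parameter expansion. The
helper job_number below is hypothetical and is shown only as a sketch of how a
script might keep track of a submitted job.

```shell
# Sketch only: job_number is a hypothetical helper.
# Strip everything from the first dot onward, leaving the integer jobid.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Example (requires PBS, so it is left commented out):
#   job_name=$(qsub batch-script)
#   qstat "$(job_number "$job_name")"
```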
Much like jobs sent to a print queue, there will be some jobs you want to
kill, others you just want to monitor, and some you want to modify in situ.
PBS offers you the ability to perform these tasks
with the collection of `q' commands:
- qalter
    Alter a job's attributes.
- qdel
    Delete a job.
- qhold
    Place a hold on a job to keep it from being scheduled for running.
- qmove
    Move a job to a different queue or server.
- qmsg
    Append a message to the output of an executing job.
- qrerun
    Terminate an executing job and return it to a queue.
- qrls
    Remove a hold from a job.
- qselect
    Obtain a list of jobs that meet certain criteria.
- qsig
    Send a signal to an executing job.
- qstat
    Show status of PBS batch jobs.
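These commands can be combined in the usual shell fashion. As a sketch, the
hypothetical helper below uses qselect to list the jobs owned by one user and
passes each identifier to qdel; the function name delete_jobs_of is an
invention for this example.

```shell
# Sketch only: delete_jobs_of is a hypothetical helper built from the
# standard qselect and qdel commands.
# Usage: delete_jobs_of USERNAME
delete_jobs_of() {
    # qselect -u prints the identifiers of jobs owned by USERNAME,
    # one per line; each identifier is then handed to qdel.
    qselect -u "$1" | while read -r job_id; do
        qdel "$job_id"
    done
}
```

Running qselect -u with your username on its own first shows which jobs would
be affected.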
For the more graphically inclined out there, essentially all of the above
tasks from job submission to job alteration can be performed with two
graphical tools (written in Tcl/Tk). Of the two, xpbsmon is the
more useful for general use, as it shows which nodes are available and which
jobs are currently running. You can think of it as a graphical alternative to
qstat.
- xpbs
    GUI front end to PBS commands
- xpbsmon
    GUI for displaying and monitoring the nodes/execution hosts under PBS
The man pages for the user-level commands of PBS have also been provided
on the Numerical Relativity website at
http://www.astro.psu.edu/nr/computers/manifold/man/pbs.
Footnotes
[1] ...mpich.compiler: Actually, /usr/beowulf/mpich[-gm].compiler is a
    symbolic link to /usr/beowulf/mpich[-gm].X.YY.Z.compiler, where
    X.YY.Z is the current version of MPICH (currently 1.2.4 for MPICH and
    1.2.4..8a for MPICH-GM). Always use the directory name without the
    version information, as it is guaranteed to point at the current
    version.
[2] ...schematic: The gm libraries will be necessary to use the
    Myrinet interface.
[3] .../usr/beowulf/share/cactus-cfg: By ``config'' files, we mean those
    which one uses to set compiler, library, and path variables before
    configuration via the syntax
    (g)make <config-name>-config options=<config-file>.
[4] ...use: As /usr/beowulf/bin has been added to your path,
    specifying it explicitly is superfluous.