LION-XI User's Guide
Important Notice: This document is under construction and may be incomplete.
Please send feedback on missing, inaccurate, or unclear information to
beatnic@aset.psu.edu.
User Information
This guide assumes that you already have an account on LION-XI. If you do not, please see the Obtaining an
Account section of the LION-XI systems page.
- The machine name is lionxi.rcc.psu.edu.
- Connections to lionxi.rcc.psu.edu are only through the SSH (Secure Shell) protocol.
If you need an SSH client for Windows, please see the
Putty User Guide
for a basic SSH terminal client and the
WinSCP User
Guide for an SSH file transfer client.
- There is a per-user hard disk quota of 8 GB.
- The default shell is bash, though tcsh and others are available. See
the /etc/shells file on lionxi.rcc.psu.edu for the current list.
Disk Policies
- Each user's home directory space is backed up nightly.
- 3.4 TB of space is available in /scratch for shared use. /scratch is for
temporary storage only and is not backed up. Note that files older than
30 days in /scratch will be deleted.
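Because /scratch is purged, it can help to see which of your files are approaching the 30-day mark. The following is a minimal sketch, not an official tool. It assumes a per-user directory such as /scratch/$USER (a hypothetical layout; this guide does not specify one) and that file age is judged by modification time, which this guide also does not state.

```shell
# List regular files under a directory whose modification time is more
# than N days old. find's "-mtime +N" counts whole 24-hour periods.
list_old_files() {
    dir=$1
    days=$2
    find "$dir" -type f -mtime +"$days" -print
}

# Example (hypothetical path): files not modified for more than 25 days
#   list_old_files "/scratch/$USER" 25

# Rough check of how much of the 8 GB home quota is in use
#   du -sh "$HOME"
```

Choosing a threshold a few days below 30 leaves time to copy anything important back to your home directory or your own computer.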
Logging onto LION-XI
You must use an SSH (Secure Shell) client to log onto LION-XI. The following deals with the standard
Unix client. If you are using another client, please refer to its own documentation.
Basic usage of ssh:
ssh -l account lionxi.rcc.psu.edu
- account: the name of your account on LION-XI
Example:
% ssh -l johndoe lionxi.rcc.psu.edu
johndoe's password: John's Password
Last login: Wed Jan 12 2000 11:30:00 -0400
No mail.
John Doe will now be logged onto LION-XI.
For more information on ssh, please consult either its man page or
http://cac.psu.edu/internet/ssh/.
Copying files to and from LION-XI
LION-XI is not part of the AFS or DFS cells. All user code and data
must be explicitly moved to and from the LION-XI Cluster. You can use secure copy
(scp) to copy files from your computer to LION-XI and from LION-XI to your computer.
To copy files from your computer to LION-XI with scp, run the following
while logged onto your computer:
scp filename(s) account@lionxi.rcc.psu.edu:.
- filename(s): file or files to copy to LION-XI (can include wildcards)
- account: the name of your account on LION-XI
Example:
% scp test?.c johndoe@lionxi.rcc.psu.edu:.
johndoe@lionxi.rcc.psu.edu's password: John's Password
Transfering test1.c -> lionxi.rcc.psu.edu:./test1.c (1k)
7 bytes transferred in 0.00 seconds [1.20 kB/sec].
Transfering test2.c -> lionxi.rcc.psu.edu:./test2.c (1k)
7 bytes transferred in 0.00 seconds [1.36 kB/sec].
The files test1.c and test2.c now exist in John's home
directory (/home/johndoe) on lionxi.rcc.psu.edu.
To copy files from LION-XI to your computer with scp, run the following
while logged onto your computer:
scp account@lionxi.rcc.psu.edu:filename(s) .
- account: the name of your account on LION-XI
- filename(s): file or files to copy from LION-XI (can include wildcards)
Example:
% scp johndoe@lionxi.rcc.psu.edu:test.dat .
johndoe@lionxi.rcc.psu.edu's password: John's Password
Transfering lionxi.rcc.psu.edu:test.dat -> ./test.dat (1k)
9 bytes transferred in 0.00 seconds [1.18 kB/sec].
The file test.dat now exists in the current directory of John's
computer.
For more information on scp, please consult its man page.
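Since every file has to be moved explicitly, a whole directory is often easier to transfer as a single archive. The following is a minimal sketch, assuming tar and gzip are available on both machines. The function name bundle_dir, the directory name project, and the account johndoe are illustrative only.

```shell
# Pack a directory into a single compressed archive placed next to it.
# Usage: bundle_dir DIRECTORY   (creates DIRECTORY.tar.gz)
bundle_dir() {
    src=${1%/}                      # strip a trailing slash, if any
    tar -czf "$src.tar.gz" -C "$(dirname "$src")" "$(basename "$src")"
}

# Typical sequence, run on your own computer (not executed here):
#   bundle_dir project
#   scp project.tar.gz johndoe@lionxi.rcc.psu.edu:.
# and then, after logging onto LION-XI:
#   tar -xzf project.tar.gz
```

Transferring one archive means a single password prompt and a single transfer instead of one per file.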
You can also use a program called sftp to copy files to and from
LION-XI if you have ssh2.
Basic usage of sftp:
while logged onto your computer
sftp lionxi.rcc.psu.edu account
- account: the name of your account on LION-XI
Example:
% sftp lionxi.rcc.psu.edu johndoe
local path : /home/john
johndoe's password: John's Password
remote path : /home/johndoe
sftp>put test.c
Transferring /home/john/test.c -> /home/johndoe/test.c (1k)
sftp>get test.dat
Transferring /home/johndoe/test.dat -> /home/john/test.dat (1k)
sftp>quit
The file test.c has been copied from John's home directory on his
machine to his home directory on LION-XI, and the file test.dat has
been copied from LION-XI to his machine.
For more information on sftp, please consult its man page.
How to Compile your Code on LION-XI
There are several compilers available on LION-XI,
including the Intel Compilers (C/C++/F77/F90/F95), Portland Group
Compilers (C/C++/F77/F90), and Pathscale compilers (C/C++/F77/F90/F95).
Information on these can be found in the
Compilers and
Programming Tools web pages.
Compiling MPI Applications
LION-XI has a high-speed InfiniBand network for running MPI
applications. Instructions for compiling and running MPI applications
can be found in the MPICH software
page.
Running Jobs on LION-XI
All jobs on LION-XI are run through a batch queueing system called PBS.
There are two steps in submitting a job to run through PBS:
- Create a job script specifying what resources you need and what
commands you would like to be executed. This file is similar to a shell
script.
- Submit the job script to PBS. PBS will run your job when the
requested resources become available.
Creating a job script
A job script is divided into two sections: the PBS directives and the
body. Lines starting with #PBS are PBS directives. All other lines
starting with # are comments. Lines not starting with # are part of the
body and will function the same way they would if they were in a shell
script. #PBS directives specify the resources the program(s) you intend
to run will need. These resources should not exceed the queue limits.
The body of the script actually specifies how to run the program(s). It
is recommended that you use the sample script below as a
starting point for your own script.
Queues and queue limits
The default, public queue has a 24-hour walltime limit. All partner
queues have 96-hour walltime limits by default (this can be changed upon
request by a partner group's PI). Both types of queues have an 8-node / 32-processor limit.
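Before writing a job script, you can check a planned request against these limits. The following is a minimal sketch that encodes only the numbers quoted above (8 nodes, 32 processors, 24 or 96 hours). The function name check_request is hypothetical and is not part of PBS.

```shell
# Check a resource request against the limits quoted above.
# Usage: check_request NODES PPN HOURS [MAX_HOURS]
# MAX_HOURS defaults to 24 (public queue); partner queues default to 96.
check_request() {
    nodes=$1; ppn=$2; hours=$3; max_hours=${4:-24}
    procs=$((nodes * ppn))
    if [ "$nodes" -gt 8 ]; then
        echo "too many nodes: $nodes > 8"; return 1
    fi
    if [ "$procs" -gt 32 ]; then
        echo "too many processors: $procs > 32"; return 1
    fi
    if [ "$hours" -gt "$max_hours" ]; then
        echo "walltime too long: $hours h > $max_hours h"; return 1
    fi
    echo "ok: $nodes nodes, $procs processors, $hours h"
}

check_request 4 8 12            # fits the public queue
check_request 2 4 48 96         # fits a partner queue with the default limit
check_request 8 8 12 || true    # 64 processors exceeds the 32-processor limit
```

If a partner group's PI has arranged a different walltime limit, pass that value as the fourth argument.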
PBS script directives
There are three script directives that you should be concerned with.
- #PBS -l nodes=x:ppn=y. The '-l nodes' directive is how
you request your nodes and processors. 'x' is the number of nodes
that you would like to use and 'y' is the number of processors that
you wish to use on each node. No more than 8 processors can be
requested on any node. For cluster efficiency, we strongly encourage
requesting 8 processors per node if your code can take advantage of them.
- #PBS -l walltime=hh:mm:ss. The '-l walltime' directive is
how you request the walltime for your job. See the queue descriptions
for the limits. The more accurately you estimate your runtime, the
sooner your job is likely to start, since short jobs are favored over long jobs.
- #PBS -j oe. The '-j oe' option tells PBS to put STDOUT and
STDERR into the same file.
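The three directives can be combined into a small template. The following is a minimal sketch that prints a job script for a given request. The function name make_job_script is hypothetical, and the generated script is only a starting point.

```shell
# Print a PBS job script (directives plus a one-command body) to
# standard output.
# Usage: make_job_script NODES PPN WALLTIME COMMAND
make_job_script() {
    cat <<EOF
#PBS -l nodes=$1:ppn=$2
#PBS -l walltime=$3
#PBS -j oe
cd "\$PBS_O_WORKDIR"
$4
EOF
}

# Example: 1 node, 8 processors per node, 2 hours, running ./hello
make_job_script 1 8 02:00:00 ./hello
```

Redirect the output to a file (for example, make_job_script 1 8 02:00:00 ./hello > myjob) to obtain a script that can be submitted to PBS.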
A sample PBS script
Example:
# This is a sample PBS script. It will request 1 processor on 1 node
# for 4 hours.
#
# Request 1 processor on 1 node
#
#PBS -l nodes=1:ppn=1
#
# Request 4 hours of walltime
#
#PBS -l walltime=4:00:00
#
# Request that regular output (STDOUT) and error output (STDERR) go to the same file
#
#PBS -j oe
#
# The following is the body of the script. By default,
# PBS scripts execute in your home directory, not the
# directory from which they were submitted. The following
# line places you in the directory from which the job
# was submitted.
#
cd $PBS_O_WORKDIR
#
# Now we want to run the program "hello". "hello" is in
# the directory that this script is being submitted from,
# $PBS_O_WORKDIR.
#
echo " "
echo " "
echo "Job started on `hostname` at `date`"
./hello
echo " "
echo "Job Ended at `date`"
echo " "
Note that the above example script is for a non-MPI job. Information
on how to write PBS scripts for MPI jobs can be found in the software
pages for each MPI (see the links in the "Compiling MPI Applications"
section above).
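Because the shell treats #PBS lines as ordinary comments, the body of a short job script can be sanity-checked by running it directly before it is submitted. The following is a sketch, assuming the script relies only on PBS_O_WORKDIR, which is set by hand here. The file names myjob and input.txt are illustrative, and the job body is a harmless cat command rather than a real program.

```shell
# Create a throwaway directory holding an input file and a tiny job script.
workdir=$(mktemp -d)
echo "hello from the job" > "$workdir/input.txt"

cat > "$workdir/myjob" <<'EOF'
#PBS -l nodes=1:ppn=1
#PBS -l walltime=0:05:00
#PBS -j oe
cd "$PBS_O_WORKDIR"
cat input.txt
EOF

# Run the script directly; under PBS, PBS_O_WORKDIR is set automatically.
result=$(PBS_O_WORKDIR=$workdir sh "$workdir/myjob")
echo "$result"
rm -rf "$workdir"
```

This only checks that the commands and paths in the body are correct. It does not test the resource request itself.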
Submitting a job script to PBS:
Use qsub to submit a PBS job.
Basic syntax of qsub:
qsub script
- script: the job script
Example:
% qsub myjob
95.lionxi.rcc.psu.edu
The job script myjob has just been submitted to PBS and has been
assigned the Job ID 95.lionxi.rcc.psu.edu. This Job ID can later be used to
control your job.
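Since qsub prints the Job ID, a script can capture it for later use. The following is a minimal sketch using the ID from the example above in place of a real qsub call, so that it runs anywhere. The variable names are illustrative.

```shell
# In real use the ID comes from qsub itself:
#   jobid=$(qsub myjob)
# Here the value from the example above is used instead.
jobid="95.lionxi.rcc.psu.edu"

jobnum=${jobid%%.*}      # sequence number: everything before the first dot
server=${jobid#*.}       # server name: everything after the first dot

echo "job number: $jobnum"
echo "server:     $server"
```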
Checking Job Status
qstat [flags]
For example, qstat -s displays the status and elapsed time of each job.
Example: qstat -s
lionxi.rcc.psu.edu:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
28.lionxi.rcc.  nucci    lionxc   submit.cmd  16743  16  --     -- 01:10 R 00:00
   Job started on Fri Aug 20 at 09:37
29.lionxi.rcc.  nucci    lionxc   submit.cmd  16784  16  --     -- 01:10 R 00:00
   Job started on Fri Aug 20 at 09:38
30.lionxi.rcc.  nucci    lionxc   submit.cmd     --  32  --     -- 01:10 Q    --
   Not Running: Not enough of the right type of nodes are available
Where:
- Job ID: ID of the user's job in the queue
- Username: job owner
- Queue: queue on the cluster to which the job was submitted
- Jobname: name of the script submitted to PBS, or STDIN if the user is in an
interactive session
- NDS: number of nodes allocated
- Req'd Memory: memory requested by the user
- Req'd Time: maximum wall clock run time
- S (State): R - Running; Q - Waiting in queue; E - Exiting after having run
- Elap Time: job's current elapsed time
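The column layout described above can be used to filter the listing. The following is a minimal sketch that prints the IDs of jobs in a given state. It assumes the state is the tenth whitespace-separated field, which holds for the sample listing but would break if a job name contained spaces. The function name jobs_in_state is hypothetical.

```shell
# Print the IDs of jobs in a given state (R, Q, ...) from "qstat -s"
# output read on standard input. Only lines that begin with a job number
# are considered; the status comment lines are skipped.
jobs_in_state() {
    awk -v state="$1" '$1 ~ /^[0-9]+\./ && $10 == state { print $1 }'
}

# Sample rows taken from the listing above.
# In real use: qstat -s | jobs_in_state Q
sample='28.lionxi.rcc. nucci lionxc submit.cmd 16743 16 -- -- 01:10 R 00:00
   Job started on Fri Aug 20 at 09:37
29.lionxi.rcc. nucci lionxc submit.cmd 16784 16 -- -- 01:10 R 00:00
   Job started on Fri Aug 20 at 09:38
30.lionxi.rcc. nucci lionxc submit.cmd -- 32 -- -- 01:10 Q --
   Not Running: Not enough of the right type of nodes are available'

printf '%s\n' "$sample" | jobs_in_state Q
```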
Deleting a Job from the Queue
qdel job_id
where job_id can be obtained from the queue listing using qstat.
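A small wrapper can guard against passing something other than a Job ID to qdel. The following is a minimal sketch. The function name cancel_job is hypothetical, and the check only confirms that the argument starts with a number and contains no unexpected characters.

```shell
# Delete a job after checking that the argument looks like a PBS Job ID,
# i.e. a number optionally followed by ".server".
cancel_job() {
    case $1 in
        ''|*[!0-9.a-zA-Z-]*) echo "not a job ID: '$1'" >&2; return 1 ;;
        [0-9]*) qdel "$1" ;;
        *) echo "not a job ID: '$1'" >&2; return 1 ;;
    esac
}

# Example (requires PBS, so it is left commented out):
#   cancel_job 95.lionxi.rcc.psu.edu
```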
Output and Error files
Unless overridden by PBS options that control file output (such as
-j oe, which merges the two into one file), standard error and standard
output are written to two files with names of the form:
Jobname.eJob_ID
Jobname.oJob_ID
where the Jobname and Job_ID are from your job as they are
listed in the queue.
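The file names can be worked out from the job name and the Job ID. The following is a minimal sketch, assuming (as is typical for PBS, though this guide does not spell it out) that only the numeric part of the Job ID appears in the file name. The job name and ID are the ones used in the qsub example.

```shell
jobname="myjob"                    # name of the submitted script
jobid="95.lionxi.rcc.psu.edu"      # ID printed by qsub

num=${jobid%%.*}                   # assumed: only the number is used
outfile="$jobname.o$num"
errfile="$jobname.e$num"

echo "standard output: $outfile"
echo "standard error:  $errfile"
```

If the files on disk turn out to carry the full Job ID instead, use $jobid in place of $num.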
For more PBS Information
For more information on creating and submitting job scripts, consult
qsub's man page.
Please send questions or suggestions about this web page to beatnic@aset.psu.edu