Linux Cluster


Connecting to Hydra

Users may connect to hydra.cem.msu.edu (the head node) with secure shell, or ssh. Files can be transferred to/from hydra with sftp or scp. Only the head node will accept connections from the outside world. The compute nodes are on a separate, private network and will only accept connections from the head node.
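
For example, from a Mac or Linux terminal (the username jdoe and the file name results.log are placeholders; substitute your own), a typical session might look like:

ssh jdoe@hydra.cem.msu.edu
scp jdoe@hydra.cem.msu.edu:results.log .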

Mac OS X users can use the built-in command line versions of ssh and sftp/scp. A graphical client called Fugu is also available on the web.

Windows ssh/sftp clients include Secure Shell Client and PuTTY.

(Secure Shell Client is available on poohbah in the security share; see Map Network/Shared Drive for instructions on how to connect to a share.)

Node Naming Convention

  • Head node
    The head node is named hydra.cem.msu.edu to the outside world. The compute nodes know it as hydra.local on the internal network.
  • Regular (2-disk) compute nodes
    The 2-disk compute nodes are named compute-2d-<rack#>-<slot#>, where the rack and slot numbers start at 0 and go up. There is only one rack (number 0). There are 11 regular compute nodes, so they are named compute-2d-0-0 through compute-2d-0-10. These nodes also have a shorter nickname: c2d0-0 through c2d0-10.
  • Big (4-disk) compute node
    The 4-disk compute nodes are named compute-4d-<rack#>-<slot#>, where the rack and slot numbers start at 0 and go up. There is currently only one node of this type in the cluster, so it is named compute-4d-0-0 or c4d0-0.

Hardware

Hydra is a 13-node cluster from Western Scientific consisting of one head node, 11 regular compute nodes, and one "big" compute node with extra memory and disk.

Head Node
  • two dual-core AMD Opteron 265 processors, 1.8GHz, 2MB cache
  • 4GB RAM
  • two 80GB 7200 RPM SATA disks (OS), RAID 1
  • two 400GB 7200 RPM SATA disks (apps and home directories), RAID 1
  • ARECA SATA RAID controller
Regular Compute Nodes
  • two dual-core AMD Opteron 265 processors, 1.8GHz, 2MB cache
  • 4GB RAM
  • two 250GB 7200 RPM SATA disks
Big Compute Node
  • two dual-core AMD Opteron 265 processors, 1.8GHz, 2MB cache
  • 16GB RAM
  • four 250GB 7200 RPM SATA disks
  • ARECA SATA RAID controller

Software

Application software installed on hydra includes GAMESS, Gaussian 03, Molpro, and AMBER. Instructions for running these programs are given in the Queue System section below.

File Systems

Head Node
  • /export
    The /export file system contains application software and user home directories. It is hardware RAID 1 (mirror) for redundancy, and it is about 384GB in size. Quotas are enabled to limit users to a fixed amount of space. Because it is NFS mounted to the compute nodes and reads/writes go over the network, it should not be used for computational scratch space.
Regular Compute Nodes
  • /scratch
    /scratch is a 2-disk software RAID 0 (striping), about 434GB in size. It is faster than a single disk, and it is local storage on each node. This is where compute jobs should put scratch files. The queue system will automatically create a temporary directory in /scratch for each job and set the $TMP environment variable to the name of the directory. When the job completes, the queue system will automatically remove the temporary directory.
     
    Whenever possible, the existing job submit scripts on hydra (gmssub, g03sub, etc.) will take the necessary steps to use the temporary directories in /scratch. If you write your own submit scripts, you should do this yourself; see the example script at the end of this section.
Big Compute Node
  • /scratch
    On the "big" compute node, /scratch is a 4-disk hardware RAID 0 of about 893GB. Otherwise, it functions just like /scratch on the regular nodes.

Queue System


Overview

This document describes the Sun Grid Engine 6.0 queue system on the MSU Chemistry Department Linux cluster. It provides only a brief introduction to help users get started using SGE. For more details, consult the appropriate man pages. The sge_intro man page (type "man sge_intro") gives a brief description of all of the SGE commands.


Queue Structure

There is a single cluster queue that will accept submitted jobs and route them to available compute nodes. Because there is only one queue, there is no need to specify a queue when submitting jobs. There are no CPU, memory, disk or time limits. However, the queue system will only schedule jobs to run on processors that are idle. If there are not enough free processors to run a submitted job, the job will wait in the queue until enough free processors become available.


Submitting a Job

Jobs are submitted to SGE using the qsub command. Qsub accepts a shell script that contains the commands to be executed when the job runs. You can also modify the characteristics of the job by embedding qsub switches in the script or by placing them on the command line. There are many switches available; see the qsub man page for details.

A script file can be as simple as a single line of text containing the command to run. Here is an example script file:

a.out <file.input >file.output

This job could be submitted to the default queue with the following command (assume the script file is named "scriptfile"):

qsub scriptfile

It is important to note that SGE will start a new login session for your script. One implication of this is that your script will have its working directory set to your home directory. If your script needs to be in a different directory, you will need to add the appropriate "cd" command to your script. You could also use the "-cwd" qsub option to have it start your script from the current working directory instead of your home directory.
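
For illustration, here is a slightly fuller script with switches embedded in it (lines beginning with "#$" are read by qsub as options; the job name myjob is a placeholder):

#!/bin/sh
#$ -cwd        # start the job in the directory where qsub was run
#$ -N myjob    # give the job a name (placeholder)
#$ -j y        # merge the error output into the regular output file
a.out <file.input >file.output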

Jobs can also be submitted interactively with qsub. For example, instead of putting "a.out ..." in a file and then submitting that file, the job could be submitted interactively as follows:

qsub <ENTER>
a.out <file.input >file.output <ENTER>
<CONTROL-D>

Using the Big Node

The compute node named "compute-4d-0-0" has more memory and scratch disk space than the other nodes. To use this special node, add "-l bignode" to the qsub command. For example:

qsub -l bignode scriptfile

Submitting Parallel Jobs

SGE uses parallel environments to control the execution of parallel jobs. A parallel environment, or PE, is a collection of settings that is configured by the system administrator. These settings define parameters such as how to allocate nodes and the processors within those nodes. Several PEs are defined on hydra, but most users will need only two: mpich and g03.

Programs that use MPI, such as AMBER, should use the mpich PE. This will allow the queue system to allocate any available processors across all compute nodes. To use the mpich PE, add "-pe mpich n" to the qsub command, where n is the number of processors you wish to request. For example, to submit a job that will use 8 processors, type:

qsub -pe mpich 8 scriptfile

Note that these 8 processors could be allocated as 4 processors each on 2 nodes, or 2 processors each on 4 nodes, or in any other combination that sums to 8.
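
A hand-written MPI submit script might look roughly like the following sketch. It assumes the common SGE convention that the mpich PE writes a machine file to $TMPDIR/machines and sets $NSLOTS to the number of processors actually granted; my_mpi_program is a placeholder program name:

#!/bin/sh
#$ -cwd
#$ -pe mpich 8
# $NSLOTS holds the number of processors the queue system granted
mpirun -np $NSLOTS -machinefile $TMPDIR/machines ./my_mpi_program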

For shared memory programs like Gaussian 03, use the g03 PE. This will cause all of the allocated processors to be on the same node. Since each node has only 4 processor cores, you should not request more than 4 processors with this PE; if you do, the request can never be satisfied and your job will sit in the queue forever. To submit a shared memory 4 processor job, type:

qsub -pe g03 4 scriptfile
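
If a shared memory job also needs the extra memory on the big node, the "-l bignode" and "-pe g03" options can be combined on the same command line, for example:

qsub -l bignode -pe g03 4 scriptfile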

Submitting GAMESS Jobs

Use the following command to submit a GAMESS job:

gmssub [-b basisfile] [-m email] [-n ncpus] [qsub_args] file_name

where "file_name" is the name of your input file without the ".inp" extension. The optional qsub_args will be passed to SGE. If an email address is given, the output file will be sent to that address upon completion of the job. GAMESS can run in parallel in the cluster, and you must specify the number of processors to use on your job. If you do not wish to run your job in parallel, specify 1 processor.


Submitting Gaussian 03 Jobs

Use the following command to submit a Gaussian 03 job:

g03sub [-m email] [qsub_args] file_name

where "file_name" is the name of your input file. The optional qsub_args will be passed to SGE. If an email address is given, the output file will be sent to that address upon completion of the job.

Note for parallel use: the g03sub command will look inside your input file for the %NProc= line, and it will automatically add the correct qsub options for a parallel job. You do NOT have to use the "-pe g03" option with g03sub.
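
For example (the input file name myjob.com and the email address are placeholders), to submit the job and have the output emailed to you:

g03sub -m user@msu.edu myjob.com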


Submitting Molpro Jobs

Use the following command to submit a Molpro job:

m12sub [-n ncpus] file_name

where "file_name" is the name of your input file. To run the job in parallel, add "-n ncpus" to the command, where ncpus is the number of CPUs to use. For example, to use 4 CPUs, type the following:

m12sub -n 4 file_name

Interactive Jobs

SGE allows interactive programs to be run in the queue system. To start an interactive job, type:

qlogin

You should see some messages similar to:

waiting for interactive job to be scheduled ...
Your interactive job 1564 has been successfully scheduled.

You will then be logged in to a compute node. At this point, your shell and every command you run will execute under the control of the queue system. When you log out of this shell, your queue session will end.


Getting the Status of a Job

Some useful commands for looking at the queues:

  • qstat
    lists all running jobs on the cluster
  • qstat -f
    shows, for each node, the running jobs and how many processors are in use
  • qstat -g c
    gives a one-line summary of processors used/available for the entire cluster


Deleting a Job

The qdel command is used to delete a job from a queue. First get the job ID number by using the qstat command, then type qdel followed by the job ID.

For example, to delete job number 28, type:

qdel 28

If you delete a running job, the queue system should kill all of the processes related to that job. However, the queue system cannot monitor certain kinds of parallel jobs. To completely kill such a parallel job, find out which nodes the job is running on ("qstat -f") BEFORE deleting it, delete the job with qdel, and then log in to each of those nodes and use the "ps" and "kill" commands to find and kill any of your remaining processes.
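
As a sketch of that sequence (the job number 28, node name c2d0-3, and username are placeholders):

qstat -f               # note which node(s) job 28 is running on
qdel 28
ssh c2d0-3             # log in to each node the job was using
ps -u your_username    # look for your leftover processes
kill <pid>             # kill them by process ID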