- Access and accounts
- TANGO system software and logon
- Packages available
- Submission scripts
- Running jobs
- Data storage and backups
- Contacts and help
1. Access and accounts
Through eRSA, time on the cluster is available to researchers from any of the South Australian universities. Researchers at these universities who wish to use any of eRSA’s facilities should complete the membership form.
Anyone else who is interested in using eResearch SA’s facilities should consult the Conditions of Use to determine how best to gain access to the machine.
2. TANGO system software and logon
The TANGO head node runs on a CentOS 7.3 operating system and uses the Slurm Workload Manager.
To connect to TANGO, you need to use the Unix/Linux command line. Try this cheat sheet to get you started.
Windows
Use a program called PuTTY:
- Download the PuTTY program - choose the MSI "Windows installer", or just putty.exe
- Once you have PuTTY installed, follow this guide on connecting to TANGO
Linux and Mac
ssh tango.ersa.edu.au
or, if your local username differs from your eRSA username:
ssh USERNAME@tango.ersa.edu.au
Note: USERNAME is your eRSA HPC username.
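If you connect often, you can add a shortcut to your local ~/.ssh/config so that `ssh tango` does the right thing (a sketch; the Host alias is your choice, and USERNAME is your eRSA HPC username):

```
# ~/.ssh/config -- illustrative entry
Host tango
    HostName tango.ersa.edu.au
    User USERNAME
```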
3. Packages available
Modules (libraries and application software)
As with previous eRSA HPCs, TANGO uses modules to configure the user environment to provide access to software packages. This provides much easier access to the packages on the system. Researchers who have used Tizard will find the process much the same.
TANGO uses "lmod" to load and unload software. Please refer to the user guide here.
In the command line, to see what modules are available to be loaded (i.e. which applications are available on the cluster), type:
module avail
You can also see which modules you currently have loaded by typing:
module list
Similarly, you can unload modules using "module unload"; for example:
module unload gaussian
will unload the Gaussian module, removing all references to the Gaussian executable and its associated runtime libraries.
If you do not see a module listed for the application that you wish to run, please contact the eRSA Service Desk.
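Putting these together, a typical module session on the login node looks like the following transcript (the Gaussian module name is taken from the example above; other module names will vary):

```
$ module avail              # see everything that can be loaded
$ module load gaussian      # add an application to your environment
$ module list               # confirm which modules are loaded
$ module unload gaussian    # remove it again when finished
```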
Compilers and parallel programming libraries
The following compilers are available on TANGO, and easily accessible once you have loaded the correct module (refer to earlier section for information on modules):
- Intel Compiler Suite
- GNU Compiler (GCC, GFortran)
- Java, Python, Perl, Ruby
- OpenMPI - library for MPI message passing for use in parallel programming over Infiniband and Ethernet
Please see the guide to Compiling Programs on TANGO for further details.
4. Submission scripts
The Slurm Workload Manager is used for queuing batch jobs on TANGO. A batch job is sent to the system (submitted) with the sbatch command, and comments at the start of the submission script that match a special pattern ( #SBATCH ) are read as Slurm options.
There are two aspects to a batch jobscript:
- A set of SBATCH directives describing the resources required and other information about the job; and
- The script itself, comprised of commands to set up and perform the computations without additional user interaction.
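The special-comment convention can be seen with a toy jobscript (the filename and directives here are illustrative only): lines beginning with #SBATCH are ordinary comments to the shell, but Slurm parses them as options.

```shell
# Create a minimal jobscript (illustrative content only)
cat > demo.sub <<'EOF'
#!/bin/bash
#SBATCH --job-name=demo
#SBATCH --ntasks=1
echo "compute step"
EOF
# These are the lines Slurm would read as options:
grep '^#SBATCH' demo.sub
```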
You will find two Slurm submission script templates in your home directory in a folder called .templates:
ls -lar .templates/
-rw-r--r-- 1 Owner-Acct Group-Name 526 Aug 14 10:57 tango.sub
-rw-r--r-- 1 Owner-Acct Group-Name 1011 Sep 14 13:05 tango-scratch.sub
For running batch jobs using scratch space with the tango-scratch.sub jobscript, refer to the Scratch on TANGO user guide.
cp ~/.templates/tango.sub myscript.sub
#!/bin/bash
### Job Name
#SBATCH --job-name=MyJobName
### Set email type for job
### Accepted options: NONE, BEGIN, END, FAIL, ALL
#SBATCH --mail-type=ALL
### email address for user
#SBATCH --mail-user=MyEmailAddress
### Queue name that job is submitted to
#SBATCH --partition=tango
### Request nodes
#SBATCH --ntasks=X
#SBATCH --mem=Xgb
#SBATCH --time=HH:MM:SS

echo Running on host `hostname`
echo Time is `date`

# Load module(s) if required
module load application_module

# Run the executable
MyProgram+Arguments
Edit the highlighted jobscript entries as required for your specific job:
- All lines beginning with #SBATCH are interpreted as SLURM commands directly to the queuing system;
- MyJobName should be a concise but identifiable alphanumeric name for the job (starting with a letter, NOT a number);
- ntasks=X requests the number of CPUs required for the job;
- mem=Xgb states that the program will use at most X GB of memory;
- time=HH:MM:SS states the maximum "hours:minutes:seconds" of walltime (elapsed real time) that your job will require. Please contact the Service Desk if you need more than 200 hours for your job;
- module load is required if the needed module(s) (e.g. application or compiler) are not loaded automatically in this shell's environment. Edit the module name(s) at application_module; and
- MyProgram+Arguments is the name of the program you want to run, together with all of its command line arguments. It may also include redirection of input and output streams.
Output and error messages will be joined into a file slurm-XXXXX.out which is placed in the directory from which the job was submitted (XXXXX will be the numerical Job ID which is allocated when you submit the job with sbatch).
5. Running jobs
Jobs on TANGO may be run in either batch mode or interactive mode.
Batch jobs are run on TANGO by submitting a jobscript to Slurm.
Jobs are submitted to the queue by issuing the command:
sbatch myscript
where myscript contains relevant Slurm commands and shell script commands.
Interactive jobs are typically used to step through code whilst debugging. In such cases, using only a small subset of your data reduces resource requirements and provides feedback more quickly.
There are two methods of running interactive sessions:
- If you're happy with the default resource allocation of 1 CPU with 4 GB RAM and a wall time of 1 hr, then you can open a bash shell on a compute node using srun:
- If you need more resources you must request a resource allocation using salloc and then use srun to launch a python environment (or any other executable e.g. bash)
[auser@tango-head-01 ~]$ srun --pty bash
[auser@tango-14 ~]$ module load R/3.4.0
[auser@tango-14 ~]$ R
> (R environment loaded)
[auser@tango-head-01 ~]$ salloc --nodes=2 --core-spec=8 --mem-per-cpu=32000
salloc: Granted job allocation 1396
[auser@tango-head-01 ~]$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
 1397     tango tcsh auser  R  1:26     2 tango-[03-04]
[auser@tango-head-01 ~]$ module load python/6.3.0/2.7.13
[auser@tango-head-01 ~]$ srun --jobid=1397 --pty python
Python 2.7.13 (default, May 1 2017, 12:50:43)
[GCC 6.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> hostname = socket.gethostname()
>>> print hostname
tango-03
See srun --help and salloc --help for further details.
Checking a job’s status in the queue
Once a job has been submitted to the queue, it will print out a numerical Job ID. This number is helpful to make checks on the job’s status using the squeue command. Here is some sample output:
squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
 1234     tango TestSub auser  R  2:42     1 tango-03
Deleting a queued job
To delete a queued or running job, type:
scancel JobID
where JobID is the numerical Job ID reported by sbatch and shown by squeue.
Note: You will only be able to delete your own jobs.
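For example, to find and then cancel one of your own jobs (the Job ID shown is illustrative):

```
$ squeue -u $USER        # list your jobs and their numeric IDs
$ scancel 1234           # cancel the job with ID 1234
```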
How much memory and virtual memory will I need?
- Please refer to your software documentation on how much memory it will require. The amount of memory you need may depend on how many CPU cores your software uses.
- You may like to run a smaller test job and check the vmem usage in the output file that is generated.
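If you want a rough idea of a process's memory footprint before committing to a large request, Linux exposes peak figures in /proc (a sketch only; on TANGO the vmem figure in the job output file is the authoritative number):

```shell
# Peak virtual memory (VmPeak) and peak resident memory (VmHWM)
# of the current shell, reported in kB:
grep -E 'VmPeak|VmHWM' /proc/$$/status
```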
6. Data storage and backups
Temporary storage during computation
For working space during execution, it is recommended that you use the /scratch directory, which is shared across nodes.
More information on accessing the /scratch directory can be found in the Scratch on TANGO user guide.
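As an illustrative fragment only (the exact scratch path conventions are documented in the Scratch on TANGO user guide; the directory layout below is an assumption), a jobscript might stage data through scratch like this:

```
# Inside a jobscript -- illustrative paths only
SCRATCHDIR=/scratch/$USER/$SLURM_JOB_ID   # assumed layout; check the guide
mkdir -p "$SCRATCHDIR"
cp input.dat "$SCRATCHDIR"/ && cd "$SCRATCHDIR"
MyProgram input.dat > output.dat
cp output.dat "$SLURM_SUBMIT_DIR"/        # copy results back before the job ends
```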
Long term storage
Please see the storage FAQ for details.
7. Contacts and help
For more information on eRSA’s facilities, systems support, assistance with parallel programming and performance optimisation and to report any problems, contact the eRSA Service Desk.
When reporting problems, please give as much information as you can to help us in diagnosis, for example:
- When the problem occurred
- What commands or programs you were trying to execute at the time
- A copy of any error messages
- A pointer to the program you were trying to run or compile
- What compiler or Makefile you were using