
FAQs.

Contents

  1. ACCESSING NSCC PETASCALE SUPERCOMPUTER
  2. ACCOUNTING
  3. APPLICATIONS AND LIBRARIES
  4. JOB SUBMISSION AND SCHEDULING
  5. FILES AND FILESYSTEMS
  6. SOFTWARE
  7. FILE TRANSFER
  8. BASIC LINUX
  9. RESOURCE REQUEST, APPROVAL and ALLOCATION

ACCESSING NSCC PETASCALE SUPERCOMPUTER

The NSCC Petascale supercomputer is built on an x86-based architecture, with initial benchmarks of 1.01 PFLOPS of compute throughput, 13 petabytes of tiered storage, and burst I/O of up to 500 GBytes/sec. The supercomputer has 1,288 nodes and about 30,000 cores, including 128 GPU nodes and a few high-memory nodes with 1TB, 2TB and 6TB of RAM. Login nodes will be available at NUS, NTU and A*STAR, extended with a DTN-Infiniband long-range link.  
Please see the table below for details.

Server | CPU Model | Number of Cores | Number of Sockets | Effective cores/server | Available RAM | GPUs
Standard Compute Node | E5-2690 v3 @ 2.60GHz | 12 | 2 | 24 | 128 GB | No GPU
GPU compute node | E5-2690 v3 @ 2.60GHz | 12 | 2 | 24 | 128 GB | One Tesla K40t
Large memory node | E7-4830 v3 @ 2.10GHz | 12 | 2 | 24 | 1TB | No GPU
Large memory node | E7-4830 v3 @ 2.10GHz | 12 | 4 | 48 | 1TB | No GPU
Large memory node | E7-4830 v3 @ 2.10GHz | 12 | 4 | 48 | 2TB | No GPU
Large memory node | E7-4830 v3 @ 2.10GHz | 12 | 4 | 48 | 6TB | No GPU

The NSCC Petascale supercomputer is built on an open-source Linux platform and currently does not have any Microsoft Windows based cluster. If you have a specific requirement, please register your request at https://helpdesk.nscc.sg so that we can work on it.
The NSCC enrollment system is federated with SingAREN so that users from NUS, NTU and A*STAR can enroll seamlessly in the NSCC system. Users from these organizations should follow the instructions in the “Enrollment user guide” on the “User Guide” page.
NUSEXT domain users are considered external users for NUS. Please contact the NSCC helpdesk ([email protected]) for the appropriate procedure to obtain authentication credentials.  
The NSCC Supercomputer enrollment system is integrated with SingAREN. Users from A*STAR, NUS (except the NUSEXT domain) and NTU can enroll themselves by navigating to “User Services -> Enrollment”. For more information on enrollment or password reset, please follow the User Enrollment guide or the Password Reset guide.

Non-SingAREN or commercial users (including NUSEXT domain users) must send an email to [email protected] with the following details:

User ID (desired; preferably the same login ID that you use in your organization, limited to 8 characters) :
First Name :
Last name :
email :
Organization name :
 
Note: Login ID creation is subject to availability and approvals.

Please refer to the next question “How do I access NSCC?”

To use NSCC Petascale Supercomputer resources, you need to log in to the cluster. Use the table below to determine how to access the NSCC Petascale Supercomputer.

Entity | Operating System | Host/FQDN | Tool/Function | File transfer
NUS | Windows | nus.nscc.sg | putty, MobaXterm, SSH Secure client | FileZilla, Winscp, SSH Secure client
NUS | Linux/Unix/MAC | nus.nscc.sg | Terminal/SSH | FileZilla (OSX), FileZilla (Linux), SCP, rsync
NUS | All | https://nusweb.nscc.sg | PBS Compute Manager/Display Manager | PBS Compute Manager
NTU | Windows | ntu.nscc.sg | putty, MobaXterm, SSH Secure client | FileZilla, Winscp, SSH Secure client
NTU | Linux/Unix/MAC | ntu.nscc.sg | Terminal/SSH | FileZilla (OSX), FileZilla (Linux), SCP, rsync
NTU | All | https://ntuweb.nscc.sg | PBS Compute Manager/Display Manager | PBS Compute Manager
ASTAR | Windows | astar.nscc.sg | putty, MobaXterm, SSH Secure client | FileZilla, Winscp, SSH Secure client
ASTAR | Linux/Unix/MAC | astar.nscc.sg | Terminal/SSH | FileZilla (OSX), FileZilla (Linux), SCP, rsync
ASTAR | All | https://astar.nscc.sg | PBS Compute Manager/Display Manager | PBS Compute Manager
SUTD | Windows | sutd.nscc.sg | putty, MobaXterm, SSH Secure client | FileZilla, Winscp, SSH Secure client
SUTD | Linux/Unix/MAC | sutd.nscc.sg | Terminal/SSH | FileZilla (OSX), FileZilla (Linux), SCP, rsync
SUTD | All | https://sutd.nscc.sg | PBS Compute Manager/Display Manager | PBS Compute Manager
Direct users (VPN Guide, VPN URL) | Windows | aspire.nscc.sg | putty, MobaXterm, SSH Secure client | FileZilla, Winscp, SSH Secure client
Direct users (VPN Guide, VPN URL) | Linux/Unix/MAC | aspire.nscc.sg | Terminal/SSH | FileZilla, SCP, rsync
Direct users (VPN Guide, VPN URL) | All | https://aspireweb.nscc.sg | PBS Compute Manager, PBS Display manager | PBS Compute Manager

Follow the instructions below to connect to the NSCC Supercomputer through the portal.

  • Open a web browser and go to the URL provided above.
  • Use the credentials of your organization to log in to the portal and access the NSCC Supercomputer.

Follow the instructions below to connect using SSH/SCP on a Unix/Mac PC:

  • Open the terminal
  • Type “ssh [email protected]”
  • Enter the password (characters are invisible while typing the password)
  • Once logged in successfully, you should see a “$” prompt, which allows you to type commands

If you want to use the X11 interface, replace the above ssh command with ssh -X.

To use X11 with ssh -Y on OS X 10.8 or higher, make sure you have installed XQuartz.

Connecting from Windows:

  • Open putty or MobaXterm
  • In case of putty, type the login host name and click the Open button (putty may log you in automatically without asking for a password. If putty asks for a username/password, type the login ID and password from your university.)
  • In case of MobaXterm, type the command “ssh [email protected]” at the prompt
  • Upon successful login, you will see the “$” prompt and can use the NSCC Supercomputer

Internet access from the remote login nodes is disabled for security reasons; however, users can still download files using the methods below.

A*Star Users – login to astar-exanet.nscc.sg and download files from internet

NUS users – ssh to login.nscc.sg and download files from internet

NTU users – ssh to login.nscc.sg and download files from internet

Direct Users – ssh to aspire.nscc.sg and download files from internet

The key generated in the enrollment portal needs to be converted using openssl with the command:
“openssl rsa -in old.key -out new.key”

The new.key file can then be loaded by puttygen.
Note: this needs to be done only when using puttygen.exe and when facing the error

To connect to the NSCC login nodes listed in the “How do I access NSCC?” FAQ, first connect to your organization's VPN and then use your preferred method to connect.
During the alpha phase there are no restrictions. However, should you face any restrictions, please log a ticket through the service desk portal https://servicedesk.nscc.sg/ and our technical specialist will investigate the reason.
Please follow the step-by-step instructions in the Password Reset guide.
The password complexity requirements are:
Minimum 8 characters
Mixture of upper and lower case
Contains numbers (0-9)
Includes at least one special character: [email protected]#$%^&*+=?><
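
The rules above can be checked locally before choosing a new password. The following is a minimal sketch, not the portal's actual validation logic: the function name check_password is hypothetical, and because the special-character list above is only partly legible, the set used here (@#$%^&*+=?><) is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: prints "ok" if the candidate password meets the listed
# rules, otherwise prints the first rule it fails.
check_password() {
    pw=$1
    if [ "${#pw}" -lt 8 ]; then
        echo "too short"
    elif ! printf '%s' "$pw" | grep -q '[[:upper:]]'; then
        echo "needs an upper-case letter"
    elif ! printf '%s' "$pw" | grep -q '[[:lower:]]'; then
        echo "needs a lower-case letter"
    elif ! printf '%s' "$pw" | grep -q '[0-9]'; then
        echo "needs a number"
    # Assumed special-character set; adjust it to match the portal's real list.
    elif ! printf '%s' "$pw" | grep -q '[@#$%^&*+=?><]'; then
        echo "needs a special character"
    else
        echo "ok"
    fi
}

check_password 'Simulate2Day#'
```

The function reports only the first failed rule, which keeps the output short when it is used interactively.
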
There are a few URLs where most of the information about NSCC is available.

http://nscc.sg – Corporate Website

http://beta.nscc.sg – Technical information about NSCC Supercomputer beta phase

http://workshop.nscc.sg – All NSCC related information can be found here

https://help.nscc.sg – Information pertaining to usage of the NSCC Supercomputer and other technical details.

ACCOUNTING

To be determined.
The terms core and CPU are used interchangeably in many cases. For example, a server with 2 sockets and 12 cores per socket provides 24 computation cores. This means that an MPI application, which can use one core per process, can run 24 processes on the server.
As an example:

If you anticipate running 1,000 instances of Gromacs, Lammps, OpenFOAM, Vasp, WRF, etc., each instance using 256 cores and lasting 24 hours, then you will need 1,000 x 256 x 24 = 6.144 million core hours (or rounded up to 6.2 million).

If you plan to run bwa on 1 million genome sequences, each bwa instance running with 12 threads for 2 hours, then you will need 1,000,000 x 12 x 2 = 24 million core hours.
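
The same arithmetic can be scripted when preparing a resource request. This is a minimal sketch; the function name core_hours is hypothetical, and it simply multiplies the number of instances, the cores per instance and the hours per instance.

```shell
#!/bin/sh
# core_hours INSTANCES CORES_PER_INSTANCE HOURS_PER_INSTANCE
# Prints the total core hours needed for a batch of identical runs.
core_hours() {
    echo $(( $1 * $2 * $3 ))
}

core_hours 1000 256 24       # 1,000 runs x 256 cores x 24 hours
core_hours 1000000 12 2      # 1,000,000 bwa runs x 12 threads x 2 hours
```

The two calls print 6144000 and 24000000, matching the 6.144 million and 24 million core hours in the examples above.
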

PBS Compute Manager is a graphical user interface that lets users run, monitor, and manage jobs on the NSCC Supercomputer.
PBS Display Manager is a graphical user interface that lets users run graphics-intensive applications on the NSCC Supercomputer.

APPLICATIONS AND LIBRARIES

MPI stands for Message Passing Interface, one of the most widely used technologies for parallelizing applications. There are various implementations of MPI, such as OpenMPI, MVAPICH and IntelMPI.
The MPI compiler interface is a wrapper around the C/Fortran compiler, which means that any code that works in C/Fortran can be used with MPI, with the addition of parallelization techniques.
The NSCC Petascale supercomputer uses IntelMPI for better application performance.
A serial application can use only one process at a time to perform calculations, while a parallel application can scale to multiple processes and make use of cores/CPUs in one or multiple servers using high-speed interconnects. For this reason, parallel applications generally produce results faster.
For example, if a serial molecular dynamics application takes 24 hours to complete a simulation, the same application parallelized across 24 processes may complete in approximately one hour.
It is strongly suggested to use parallel codes whenever possible.
OpenMP is one of the techniques for running code in parallel using multiple threads. OpenMP codes are restricted to running on one physical server. OpenMP code has no way to communicate through network infrastructure such as Infiniband, so inter-server communication is not possible.
OpenMP is simple to use and does not need any wrappers. Standard compilers such as GCC/Gfortran or Intel C/Fortran support OpenMP; however, the program must contain OpenMP directives.
Checkpointing is a technique used to save the output periodically during the run of an application. Its advantage is that if the program terminates abruptly for any external reason, it can be restarted from the last checkpoint, saving many computational hours.
You are advised to use the checkpoint and restart technique when writing your application so that, should anything go wrong, you can restart the job from the last checkpoint.
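
The idea can be illustrated with a small shell sketch. It is not an NSCC-provided tool: the function name run_with_checkpoint, the CHECKPOINT_FILE variable and the default file name checkpoint.txt are all hypothetical, and the "work" is only a step counter standing in for a real computation.

```shell
#!/bin/sh
# Minimal checkpoint/restart sketch. The "work" is just counting steps.
CHECKPOINT_FILE=${CHECKPOINT_FILE:-checkpoint.txt}

run_with_checkpoint() {
    total_steps=$1
    start=0
    # Resume from the last saved step if a checkpoint exists.
    if [ -f "$CHECKPOINT_FILE" ]; then
        start=$(cat "$CHECKPOINT_FILE")
    fi
    step=$start
    while [ "$step" -lt "$total_steps" ]; do
        step=$(( step + 1 ))
        # ... one unit of real work would go here ...
        echo "$step" > "$CHECKPOINT_FILE"   # save progress after every step
    done
    echo "finished at step $step (resumed from step $start)"
}
```

If a run of run_with_checkpoint 100 is interrupted, calling it again continues from the step stored in the checkpoint file instead of starting from zero. A real application would save its full state rather than a single counter, and would usually write checkpoints less often to limit I/O.
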
The NSCC Petascale supercomputer infrastructure is based on the x86_64 architecture and built with Intel processors. Traditionally, Intel compilers have been observed to provide better performance on Intel processors. The latest version of the Intel compiler, Intel Parallel Studio XE_2016, is installed on the system. The available compilers are icc for C/C++ codes, ifort for Fortran codes, mpicc for MPI C/C++ codes, and mpiifort for MPI Fortran codes.
Applications, libraries, and compilers change frequently and are managed with Environment Modules on the NSCC Supercomputer. To list the available modules, run “module avail” at the prompt on the NSCC Supercomputer login nodes.
Yes, you can request additional libraries or applications through the Service Desk portal. However, the installation of libraries is subject to various conditions such as compatibility, the time required to make the library available, dependencies and our software policies. Please contact the Service Desk for further clarification. Our technical specialist will respond to your request.
TBD
Commercially licensed software has restrictions on its terms of use. To use commercial software, you need to procure the respective license. The NSCC Petascale supercomputer infrastructure can host your license and restrict its usage by other users. Please contact our service desk through https://servicedesk.nscc.sg for further clarification.

JOB SUBMISSION AND SCHEDULING

There are many reasons why jobs may be prevented from starting. The first thing to do is to run “qstat -s ”; this prints the comments from the job scheduler about your job.

  • If you see a “Q” in the column “S”, it means the scheduler has not yet considered your job. Be patient.
  • If you see “Storage resources unavailable”, it means that you have exceeded one of your storage quotas.
  • If you see “Waiting for software licenses”, it indicates that all the licenses for a software package you have requested are currently in use.
  • If you see “Not Running: Insufficient amount of resource ncpus”, it indicates that all the CPUs are busy. Please be patient; PBSPro scheduling is based on the resources available and requested. See “Resource allocation policy” for more details.

There could be several reasons for this. A few common reasons are listed below:

  1. The operating system you are using is different from the one running on the NSCC Supercomputer
    Solution:

    • If you are running on your local Linux machine, a simple recompilation on the NSCC login nodes will solve the issue
    • If you are running on a Windows PC and want to run on the NSCC Supercomputer, you need to either obtain a copy of the software for Linux or port the code from Windows to Linux
  2. Different compiler/library stack: if the operating system is the same as NSCC's but the job still does not run, the compiler/libraries are not compatible, and you need to recompile on the NSCC Supercomputer.
  3. Input file/job script file created on a Windows machine: please see the FAQ “Why am I getting ^M: bad interpreter”

If the job still fails even though the above conditions are satisfied, please seek guidance from the NSCC Support helpdesk by creating a ticket from this portal.

This could be due to one of several things:

  1. If you get a message in the .e file along the lines of
    /tbd/pbs/mom_priv/jobs/3917736.r-man2.SC: Command not found.
    or
    /bin/csh^M: bad interpreter: No such file or directory
    and you created your batch job script on a Windows box, then you need to remove some extraneous invisible characters from the script. If your batch job script is called runjob.sh, run
    dos2unix runjob.sh
    to convert the file from DOS to Unix format.
    There are several simple editors on the NSCC Supercomputer, such as vi, nano and gedit, so you can create batch scripts directly.
  2. If you submit a script as an argument to qsub, check that there is a newline character at the end of the last executable line. The easiest way to do this is to simply cat the script: if the last line of the script has your shell prompt attached, edit the file to put a blank line at the end.
  3. Often when using a workstation, people run their job in the background, say ./runjob &, which works fine interactively. However, when translated to a queue batch script the result is often
    #!/bin/sh
    #PBS -q normal
    #PBS -l walltime=00:10:00,mem=400MB
    ./runjob &

    This script will exit almost immediately as it is trying to run runjob in the background. Since the script exits immediately, the queue system assumes that the job is finished and kills off all user processes. Consequently, your code, which runs fine interactively, gets killed almost immediately on the queue.
    There are two solutions. Try the batch job script

    #!/bin/sh
    #PBS -q normal
    #PBS -l walltime=00:10:00,mem=400MB
    ./runjob

    NOTE: the missing &
    However, if your runjob is itself a complicated script which starts up all sorts of programs in the background, try

    #!/bin/sh
    #PBS -q normal
    #PBS -l walltime=00:10:00,mem=400MB
    ./runjob
    wait

This tells the shell (/bin/sh) to wait until all background jobs are finished before exiting. This prevents your background jobs from being killed and allows your program to complete.
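
The first two problems in the list above (Windows line endings and a missing final newline) can be detected before submitting. The sketch below is a hypothetical helper, not an NSCC tool: the function name check_script and its three output words are assumptions made for illustration.

```shell
#!/bin/sh
# check_script FILE
# Prints "crlf" if the file contains carriage returns (Windows line endings),
# "no-final-newline" if the last line is not newline-terminated, else "ok".
check_script() {
    file=$1
    cr=$(printf '\r')
    if grep -q "$cr" "$file"; then
        echo "crlf"                 # fix with: dos2unix FILE
    elif [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        echo "no-final-newline"     # fix by adding a newline at the end
    else
        echo "ok"
    fi
}
```

For example, check_script runjob.sh prints crlf for a script saved on Windows, in which case running dos2unix on it as described above resolves the problem.
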

This error indicates that you have forgotten to source the appropriate system .rc in your personal .rc file.

  • If you are using an sh-derived shell for your jobs, edit the .bashrc file to ensure it contains the line “. /etc/bashrc”.
  • If you are using a csh-derived shell for your jobs, edit the .cshrc file to ensure it contains the line “source /etc/csh.cshrc”.

There is a chance that the .bashrc file was deleted accidentally. You can copy the file back from the skeleton file /etc/skel/.bashrc to your home directory, for example with “cp /etc/skel/.bash* ~/”.
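
To confirm whether a personal rc file already sources the system one, a check along the following lines can help. It is a minimal sketch: the function name sources_system_rc is hypothetical, and it only looks for a line beginning with “.” or “source” followed by the given path.

```shell
#!/bin/sh
# sources_system_rc RC_FILE SYSTEM_RC
# Prints "yes" if RC_FILE contains a line that sources SYSTEM_RC, else "no".
sources_system_rc() {
    rc_file=$1      # e.g. ~/.bashrc
    system_rc=$2    # e.g. /etc/bashrc
    if grep -Eq "^[[:space:]]*(\.|source)[[:space:]]+${system_rc}" "$rc_file" 2>/dev/null; then
        echo "yes"
    else
        echo "no"
    fi
}
```

For a bash user this would be called as sources_system_rc ~/.bashrc /etc/bashrc, and for a csh user as sources_system_rc ~/.cshrc /etc/csh.cshrc. A missing rc file is reported as no.
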

If your batch job is named, for example, runjob.sh and your output is not redirected in the batch script then your job output will appear in runjob.sh.o**** where the final digits are the job number. The final entries in the .o file give you the details on wall time and virtual memory used by the job.
If your batch script is called runjob.sh, then the error output will be in runjob.sh.e****. There is a limit to the length of this filename, so if you have a particularly long batch script name, it may be truncated in the resulting error and output file names.
In a PBS job script, the memory you specify using the -lmem= option is the total memory across all nodes. However, this value is internally converted into the per-node equivalent, and this is how it is monitored.
For example, since the NSCC Supercomputer has 24 cores per node, if you request -l select=2:ncpus=24, mem=10GB, the actual limit will be 10GB on each of the two nodes. If you exceed this on either of the nodes, your job will be killed.
Please note that if a job runs for less than a few minutes, the memory use reported in your .o file once the job completes may be inaccurate. We strongly discourage running short jobs, e.g. of less than 1 hour. This is because there is significant overhead in setting up and tearing down a job, and you may end up wasting large amounts of your grant. Instead, if you have many short jobs to run, consider merging them together into a few longer jobs.
Users may see the following message when running interactive process on login nodes:
RSS exceeded.user=abc123, pid=12345, cmd=exe, rss=4028904, rlim=2097152 Killed
Each interactive process you run on the login nodes has a time limit (30 minutes) and a memory use limit (2GB) imposed on it. If you want to run a longer or more memory-intensive interactive job, please submit an interactive job (qsub -I).
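
The numbers in the message can be made easier to read with a small parser. This is a sketch under one assumption: that rss and rlim are reported in kilobytes, which is inferred from rlim=2097152 matching the 2GB limit and is not stated by the message itself. The function name parse_rss_kill is hypothetical.

```shell
#!/bin/sh
# parse_rss_kill MESSAGE
# Extracts rss and rlim from an "RSS exceeded" message and prints them in MB,
# assuming both values are in kilobytes.
parse_rss_kill() {
    msg=$1
    rss_kb=$(printf '%s\n' "$msg" | sed -n 's/.*rss=\([0-9]*\).*/\1/p')
    rlim_kb=$(printf '%s\n' "$msg" | sed -n 's/.*rlim=\([0-9]*\).*/\1/p')
    echo "used $(( rss_kb / 1024 )) MB, limit $(( rlim_kb / 1024 )) MB"
}

parse_rss_kill 'RSS exceeded.user=abc123, pid=12345, cmd=exe, rss=4028904, rlim=2097152 Killed'
```

Under that assumption, the sample message corresponds to a process that used about 3934 MB against a 2048 MB limit.
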
Please see the below table for details. Users only need to specify the ‘External Queue’ for job submission. Jobs will be routed to the internal queue depending on the job resource requirements.

External Queue Name | Internal Queue Name | Walltime | Other Limits | Remarks
largemem | - | 24 hours | To be decided | For jobs requiring more than 4GB per core
normal | dev | 1 hour | 2 standard nodes per user | High priority queue for testing and development works
normal | small | 24 hours | Up to 24 cores per job | For jobs that do not require more than one node
normal | medium | 24 hours | Up to the limit as per prevailing policies | For standard job runs requiring more than one node
normal | long | 120 hours | 1 node per user | Low priority queue for jobs which cannot be checkpointed
gpu | gpunormal | 24 hours | Up to the limit as per prevailing policies | For “normal” jobs which require GPU
gpu | gpulong | 240 hours | Up to the limit as per prevailing policies | Low priority GPU jobs which cannot be checkpointed
iworkq | - | 8 hours | 1 node per user | For visualisation
To be determined.
For more information, please contact Service desk through the portal or write a mail to [email protected]
Please contact service desk through portal or email [email protected]
For automatic mailing, use the following options:
#PBS -M [email protected]
#PBS -m abe
a Send mail when job or subjob is aborted by batch system
b Send mail when job or subjob begins execution
e Send mail when job or subjob ends execution
n Do not send mail
Please note that sending emails using the mail command on the NSCC Supercomputer is disabled.
E : Job is exiting after having run
F : Job is finished. Job has completed execution, job failed during execution, or job was deleted.
H : Job is held. A job is put into a held state by the server or by a user or administrator. A job stays in a held state until it is released by a user or administrator.
M : Job was moved to another server
Q : Job is queued, eligible to run or be routed
R : Job is running
S : Job is suspended by server. A job is put into the suspended state when a higher priority job needs the resources.
T : Job is in transition (being moved to a new location)
U : Job is suspended due to workstation becoming busy
W : Job is waiting for its requested execution time to be reached or job specified a stagein request which failed for some reason.
X : Subjobs only; subjob is finished (expired).
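
When reading job listings in scripts, the state letters above can be translated with a simple lookup. The following is a minimal sketch; the function name describe_state is hypothetical and the descriptions are shortened from the list above.

```shell
#!/bin/sh
# describe_state LETTER
# Prints a short description of a PBS job state letter.
describe_state() {
    case $1 in
        E) echo "exiting after having run" ;;
        F) echo "finished" ;;
        H) echo "held" ;;
        M) echo "moved to another server" ;;
        Q) echo "queued, eligible to run or be routed" ;;
        R) echo "running" ;;
        S) echo "suspended by server" ;;
        T) echo "in transition" ;;
        U) echo "suspended due to workstation becoming busy" ;;
        W) echo "waiting for its requested execution time" ;;
        X) echo "subjob finished (expired)" ;;
        *) echo "unknown state: $1" ;;
    esac
}

describe_state Q
```

Any letter not in the list falls through to the last branch, so unexpected input is reported rather than silently ignored.
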
Please refer to the PBSPro manual for more details:
http://resources.altair.com/pbs/documentation/support/PBSProUserGuide12.1.pdf

We recommend using loops like this:
#!/bin/bash
#PBS …
# Submit job-script ten times, passing a different index to each job
for i in {1..10}; do
    qsub -v PBS_ARRAY_INDEX=$i job-script
done
You can also run 24 single-CPU jobs in parallel, if they all use similar resources and will finish around the same time, by using the following in the job script:
#!/bin/bash
#PBS -l ncpus=24
#PBS …
# Start 24 copies of the program in the background
for i in {1..24}; do
    ./run_my_program args … &
done

# Block until every background task has finished
wait
Please note the ‘&’ at the end of the command line, and the ‘wait’ for all background tasks to finish.

FILES AND FILESYSTEMS

By default, access to files is restricted to the individual user and the project/group. If you need to access files belonging to another user, please contact the NSCC service desk for more information on granting permission.
The NSCC Supercomputer uses tiered storage, which means that files that have not been used for a long time are automatically moved to a slower disk subsystem. Retrieving such files from the slower storage may take a while, hence the delay in opening your files.

SOFTWARE

FILE TRANSFER

You can use any file transfer protocol that supports copying over SSH (e.g. scp). Please note that the speed of the copy may depend on the network speed of your local environment.
The easiest way to transfer files from your organization's HPC is to use rsync. For example, to transfer the /scratch/johnsmit/project1 directory to the NSCC /project/johnsmit/ directory, you can run the following command on your cluster:
rsync -arvz /scratch/johnsmit/project1 login.nscc.sg:/project/johnsmit/
*Please replace login with the respective login node listed in the access methods table
Please use one of the tools listed in the access methods table and connect to the respective login node to start the transfer. If you intend to use the rsync utility, you can use the command syntax below from the NSCC Supercomputer:
rsync -arvz /<projectdir>/<project name>/<username>/<copy from> myorganizationcluster:/<my destination directory>
Please use the FileZilla software if you prefer a GUI for file transfer.

 

BASIC LINUX

 

RESOURCE REQUEST, APPROVAL and ALLOCATION

Upon account creation, each NSCC account is allocated 50GB of storage and 100,000 core hours. If you require additional compute and storage resources for your project, please fill in the online form at https://user.nscc.sg/project

Note that incomplete submissions will not be processed.

Anyone with an NSCC login account can submit a project request, and only the submitter will be able to edit their own request. 

The criteria for resource request approval include:

  1. Scientific merits
  2. Quality and completeness of application
  3. Track record of applicant (compliance with previous deliverables/projects outcomes)
  4. Alignment with national/NSCC agenda
  5. Resources requested vs stakeholder’s fair share
  6. Expected measurable outputs (e.g. number of patents, publications and postgraduate students etc.)

Projects are accorded priority in the following descending order:

  1. Individual submission
  2. Group submission
  3. Industry submission
  4. Government Agency submission

It will be one year, from 1 July to 30 June.

There will be a call for resource request submissions in January each year. Approval is expected to be announced in the last week of May via email. Priority will be accorded to requests submitted during the yearly call. If you missed the yearly call, your submission will be reviewed on an ad-hoc basis.

  1. 1 – 28 Feb – Call for resource request submission
  2. 1 – 31 Mar – First round of verification by NSCC Project Admin
  3. 1 – 30 Apr – Second round of verification by TRAC
  4. 1 – 31 May – Approval by PRAC
  5. First 3 weeks of Jun – Resource provisioning by NSCC Tech Team
  6. Last week of Jun – Send Resource Request Approval notification email

Please perform the following:

  1. Login to the project portal (https://user.nscc.sg/project) and update the members section to include the userIDs to be granted permission.
  2. Email [email protected] about the request for immediate processing.