About ASPIRE2A
ASPIRE2A’s core computing capabilities deliver the performance and flexibility needed to support a multifaceted array of HPC applications. The computational components are balanced with high-speed storage subsystems and a low-latency, high-speed interconnect, delivering the highest levels of performance across a broad spectrum of applications.
ASPIRE2A is an AMD-based HPE Cray EX supercomputer with 25 PB of GPFS storage, 10 PB of Lustre storage, and an HPE Slingshot interconnect. The following diagram shows the high-level specification of ASPIRE2A.
[Diagram: 10 Petaflops system | over 25 PB storage | HPE Slingshot fabric | Lustre 10 PB scratch/buffer filesystem]
The NSCC HPC system comprises the following:
ASPIRE2A
- HPE Cray EX nodes with dual AMD EPYC Milan 7713 CPUs, 128 cores and 512 GB of memory per node, providing a total compute capacity of up to 10 PFlops (see the OpenMP sketch after this list).
- GPU compute capability with 4x NVIDIA A100-40G SXM GPUs per node.
- Please refer to this guide for more information: https://help.nscc.sg/wp-content/uploads/2024/06/ASPIRE2A-General-Quickstart-Guide-1.pdf
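To check that a job really has a full compute node to itself, a tiny OpenMP probe like the one below can be compiled and run on the node. This is a minimal sketch, assuming a GCC-style toolchain; the compile line in the comment is illustrative, not the documented ASPIRE2A workflow.

```c
/* omp_probe.c - report the thread count an OpenMP program sees on a node.
 * Illustrative compile line (GCC toolchain assumed):
 *   gcc -fopenmp omp_probe.c -o omp_probe
 */
#include <omp.h>
#include <stdio.h>

int main(void)
{
    #pragma omp parallel
    {
        /* On a fully allocated ASPIRE2A compute node this should print 128. */
        #pragma omp single
        printf("OpenMP sees %d threads\n", omp_get_num_threads());
    }
    return 0;
}
```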
AI System
- Total of 18 AI GPU nodes.
- 12 nodes with 4x NVIDIA A100 40 GB GPUs and 12 TB of NVMe local storage.
- 6 nodes with 8x NVIDIA A100 40 GB GPUs and 14 TB of NVMe local storage.
- Access to the AI system is via the ASPIRE2A “ai” queue.
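Once a job lands on an AI node, the allocated GPUs can be confirmed from C through the CUDA runtime API. The sketch below assumes a local CUDA toolkit is available; the include and link paths in the comment are placeholders for the site’s actual installation.

```c
/* gpu_probe.c - enumerate the GPUs visible to a job on an AI node.
 * Illustrative compile line (CUDA toolkit paths assumed):
 *   gcc gpu_probe.c -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcudart -o gpu_probe
 */
#include <stdio.h>
#include <cuda_runtime_api.h>

int main(void)
{
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess) {
        fprintf(stderr, "no usable CUDA driver/runtime\n");
        return 1;
    }
    /* A fully allocated AI node should list 4 or 8 A100s, ~40 GB each;
     * jobs granted fewer GPUs will see correspondingly fewer devices. */
    for (int i = 0; i < n; i++) {
        struct cudaDeviceProp p;
        cudaGetDeviceProperties(&p, i);
        printf("GPU %d: %s, %.1f GB\n", i, p.name, p.totalGlobalMem / 1e9);
    }
    return 0;
}
```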
High Frequency
- 16 HPE ProLiant DL385 high-frequency nodes.
- Dual AMD EPYC 75F3 CPUs (2 x 32 cores = 64 cores per node).
- 100 Gbps high-speed network.
- 512 GB DDR4 ECC RAM (user-accessible RAM = 500 GB; see the probe sketch after this list).
- Red Hat Enterprise Linux 8.
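A quick way to see how much of this hardware the OS actually exposes (for example, the 500 GB of user-accessible RAM) is a standard POSIX `sysconf` probe; this is a sketch only, with no site-specific assumptions beyond a Linux node.

```c
/* node_probe.c - report online cores and physical memory as seen by the OS. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long cores = sysconf(_SC_NPROCESSORS_ONLN);
    long pages = sysconf(_SC_PHYS_PAGES);
    long psize = sysconf(_SC_PAGE_SIZE);
    /* On a high-frequency node this should report 64 cores and ~500 GB. */
    printf("cores: %ld, memory: %.0f GB\n", cores, (double)pages * psize / 1e9);
    return 0;
}
```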
High Speed Network Interconnect (HPE Slingshot)
- All nodes are connected with the HPE Slingshot interconnect (dragonfly topology).
- HPE Slingshot is a modern, high-performance interconnect for HPC and AI clusters, delivering high bandwidth and low latency for HPC, ML, and analytics applications.
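The practical effect of a low-latency fabric is easiest to see with a two-rank ping-pong. The minimal MPI sketch below can be launched with one rank on each of two nodes to exercise the Slingshot links; the compiler wrapper and launcher names in the comment are assumptions, as site launch commands vary.

```c
/* pingpong.c - crude point-to-point latency probe; run with exactly two ranks.
 * Illustrative build/run (wrapper and launcher names assumed):
 *   mpicc pingpong.c -o pingpong && mpirun -n 2 ./pingpong
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char byte = 0;
    const int iters = 10000;
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            /* Rank 0 sends one byte and waits for the echo. */
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            /* Rank 1 echoes each byte straight back. */
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    if (rank == 0)
        printf("half round-trip latency: %.2f us\n",
               (MPI_Wtime() - t0) / iters / 2 * 1e6);
    MPI_Finalize();
    return 0;
}
```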
Other Features
- Remote extended network connections to the A*STAR, NUS, NTU, SUTD and NEA sites.
- Parallel file systems (Lustre & PFSS); see the MPI-IO sketch after this list.
- Liquid-cooled high-density Cray EX cabinets.
- Air-cooled racks (specialized AI, large-memory, storage and login nodes).
- Altair Workload Manager
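Because Lustre and PFSS are parallel file systems, multi-rank jobs typically get the best out of them with collective I/O rather than one file per rank. The following minimal MPI-IO sketch has every rank write its slice of one shared file; the file name `shared.dat` is a placeholder and should point at a Lustre or PFSS path.

```c
/* pio_write.c - each rank writes its slice of a shared file via MPI-IO. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Fixed-width records keep every rank's offset non-overlapping. */
    char buf[64];
    int len = snprintf(buf, sizeof buf, "rank %4d\n", rank);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    /* Collective write: all ranks land in the shared file in one operation. */
    MPI_File_write_at_all(fh, (MPI_Offset)rank * len, buf, len,
                          MPI_CHAR, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```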
Technical Specs
| Component | Specification |
|---|---|
| Compute node architecture | HPE Cray EX425; HPE Apollo 6500 Gen10 Plus; HPE ProLiant DL385 Gen10 Plus v2 |
| Processor | Dual AMD EPYC 7713 CPUs, 128 cores per server |
| Total cores | 800 compute nodes with 105,984 cores (dual-CPU AMD EPYC 7713) |
| Total memory | 476 TB |
| GPUs | 352 NVIDIA A100 |
| Interconnect | HPE Slingshot high-speed interconnect |
| Compute performance | 10 PFlops |
| Storage architecture | HPE Parallel File System Storage (PFSS); HPE Cray ClusterStor (Lustre) |
| Storage capacity | 25 PB PFSS for Project & Home; 10 PB Lustre for Scratch |
| Tier 0 storage performance | 500 GB/s |