SERC

Fermi cluster

Introduction :-

GPUs are highly parallel many-core processors whose architecture allows many threads to execute concurrently. Parallelized programs exploit this architecture to achieve higher throughput. Using a GPU to solve general-purpose problems is known as GPGPU (general-purpose computing on graphics processing units): the GPU, which normally handles graphics computation, performs computations traditionally handled by the CPU. GPGPU treats the GPU as a modified stream processor, applying its massive floating-point capability to non-graphics data. The latest addition to GPU computing at SERC is the Tesla C2070 card, based on Nvidia's Fermi architecture. It has ECC (Error-Correcting Code) memory, a type of memory that includes special circuitry to detect and correct errors in data as it passes in and out of memory.

The Fermi cluster at SERC is composed of four GPU nodes. Each node has one Intel Xeon W3550 processor operating at 3.06 GHz, 16 GB of RAM, one Nvidia C2070 (Fermi) GPGPU card and 1 TB of local disk space.

This cluster is a parallel batch computing system. It is managed by the Torque workload manager, which load-balances the jobs. Job submission to the Torque batch scheduler is similar to that of PBSPro. The cluster is configured to admit only GPU-based jobs, so user job scripts must specify the number of GPUs the job intends to use. Torque treats each GPU card as a single GPU for allocation, and a job can use only one card at any given time. As of now, the cluster does not permit multi-node jobs; hence all jobs submitted to this cluster must specify GPU=1 in their job scripts.
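The submission requirements above can be sketched as a minimal Torque job script. This is an illustration under stated assumptions, not the cluster's official template: the file name gpu_job.pbs, the job name gpu_test, the walltime and the executable my_cuda_app are all hypothetical, and the GPU request uses the `gpus=1` node property of stock Torque, which may differ from the exact keyword configured on this cluster.

```shell
# Write a minimal single-node, single-GPU Torque job script.
# All names below (gpu_job.pbs, gpu_test, my_cuda_app) are hypothetical.
cat > gpu_job.pbs <<'EOF'
#!/bin/bash
#PBS -N gpu_test
# One node, one core, one GPU (assumed stock Torque syntax for the GPU request).
#PBS -l nodes=1:ppn=1:gpus=1
#PBS -l walltime=00:30:00
# Merge stdout and stderr into a single output file.
#PBS -j oe

# Torque starts the job in the home directory; move to the submission directory.
cd "$PBS_O_WORKDIR"
./my_cuda_app
EOF

# Submit only where Torque's qsub is available (i.e. on the login node).
if command -v qsub >/dev/null 2>&1; then
    qsub gpu_job.pbs
fi
```

Once submitted, `qstat` lists the job and its state in the queue.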

Vendor :-

1. OEM - Fujitsu
2. Authorised Seller - Wipro Ltd, Bangalore, India.

Hardware Overview :-

Each node of the cluster consists of

Intel Xeon W3550 processor operating at 3.06 GHz clock speed

16 GB DDR3 Main Memory

500 GB of Disk Space with 500 GB localscratch

Nvidia Tesla C2070 card

Gigabit Ethernet Connectivity

System Software/Libraries :-

CentOS 5.5 - Linux x86_64 Platform

GNU Compiler Collection

Application Software/Libraries :-

Intel C++ Compiler Professional Edition for Linux

Intel Fortran Professional Edition for Linux

CUDA Compilation Tools release-4.0 v0.2.1221

CULA

ViennaCL 1.1.2

MPICH2

Workload Manager :-

Torque Batch System

Location of Fermi Cluster :-

CPU Room, SERC.

DNS name of the machine :-

fermi1.serc.iisc.ernet.in

Accessing the system :-

The Fermi cluster has one login node, fermi1, through which users access the cluster and submit jobs. The machine is accessible for login using ssh from inside the SERC network. Login requires basic HPC access, which is obtained as follows:

Collect and fill in the HPC application form, available at Room 103, SERC, or download the HPC application form here.

The HPC application form must be duly signed by your Advisor/Research Supervisor.

Helpdesk :-

For any queries, email helpdesk@serc.iisc.ernet.in or contact the System Administrator, #109, SERC.