SERC

Tesla Cluster

Introduction :-
The Tesla cluster in SERC is composed of three compute nodes. Each compute node is an SMP node with 16 AMD Opteron cores housed in four quad-core CPUs, and each is connected to an NVIDIA Tesla S1070 GPGPU node. Each Tesla node is composed of 4 GPUs, with each GPU made up of 240 processor cores.

The cluster is managed by the PBSPro workload manager, which distinguishes between CPU-only compute jobs and GPU-based jobs. A compute job can use a maximum of 16 CPUs on this cluster, since multi-node jobs are disabled. For GPU-based jobs, each GPU needs a CPU-bound thread to drive the computation on it; hence the CPU resources of each compute node are divided into two PBS virtual nodes (vnodes), namely the cpu-node and the gpu-node. The jobs scheduled on these virtual nodes are identified by the PBS job script variables described under the workload manager section. Users need to set the appropriate PBS variables to indicate whether their jobs are GPU-based or CPU-only. Based on these variables, the PBSPro workload manager automatically routes each job into an execution queue that schedules it on the appropriate vnodes.

Each GPU is configured to be used in exclusive mode by a job, and a job can use from one to a maximum of 4 GPUs at a time. Compute jobs can use MPI or OpenMP based codes, and GPU jobs are built using the NVIDIA CUDA libraries.
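The per-job limits described above can be sketched as a small check. This is only an illustration in Python: the function and constant names are hypothetical and are not PBSPro variables, and the rule that a job needs at least one CPU per GPU is an interpretation of the statement that each GPU needs a CPU-bound thread. The sketch does not model how the CPUs are split between the cpu-node and gpu-node vnodes, since that split is not stated here.

```python
# Illustrative only: these names are hypothetical and are not PBSPro variables.
MAX_CPUS_PER_JOB = 16   # one SMP node; multi-node jobs are disabled
MAX_GPUS_PER_JOB = 4    # one Tesla S1070 unit per compute node


def check_request(ncpus, ngpus=0):
    """Return (job_kind, reason) for a requested CPU/GPU count.

    job_kind is "cpu", "gpu", or None when the request breaks a stated limit.
    """
    if ncpus < 1 or ngpus < 0:
        return None, "need at least one CPU and a non-negative GPU count"
    if ncpus > MAX_CPUS_PER_JOB:
        return None, "more than 16 CPUs would need a multi-node job"
    if ngpus > MAX_GPUS_PER_JOB:
        return None, "a job can use at most 4 GPUs"
    if ngpus > ncpus:
        # Assumption: one CPU-bound driver thread per GPU means one CPU per GPU.
        return None, "each GPU needs its own CPU-bound driver thread"
    return ("gpu" if ngpus else "cpu"), "ok"


if __name__ == "__main__":
    # A full-node CPU job, a 4-GPU job, and two requests that break a limit.
    for request in [(16, 0), (4, 4), (32, 0), (2, 4)]:
        print(request, check_request(*request))
```

A request for 16 CPUs and no GPUs is classified as a CPU job, while a request for 32 CPUs is rejected because it could only be met by spanning nodes.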
Vendor :-
1. OEM - SuperMicro
Authorised Seller - Netweb Technologies, Bangalore, India.
2. OEM - Nvidia Tesla
Authorised Seller - M/S. INT Infosolution GmbH, Hamburg, Germany.

Hardware Overview :-

Each node of the cluster consists of:

Four AMD Quad-Core Opteron 8378 processors with a 2.4 GHz clock speed
64 GB main memory
500 GB of disk space, with 250 GB local scratch
Nvidia Tesla S1070 1U server with 4 GPUs operating at 1.296 GHz
Gigabit Ethernet connectivity
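As a rough worked example, the per-node figures above can be combined with the node count (three) and the GPU core count (240 per GPU) given in the introduction to estimate the cluster's aggregate capacity. The constant and function names below are illustrative only. Since multi-node jobs are disabled, these totals describe the cluster as a whole and not what a single job can use.

```python
# Figures come from the cluster description; the names are illustrative only.
NODES = 3                    # compute nodes in the cluster
CPUS_PER_NODE = 4            # AMD Opteron 8378 processors per node
CORES_PER_CPU = 4            # quad-core
MEMORY_GB_PER_NODE = 64
GPUS_PER_NODE = 4            # one Tesla S1070 unit per node
CORES_PER_GPU = 240


def cluster_totals():
    """Aggregate the per-node figures over the whole cluster."""
    cores_per_node = CPUS_PER_NODE * CORES_PER_CPU
    gpus = NODES * GPUS_PER_NODE
    return {
        "cpu_cores": NODES * cores_per_node,
        "memory_gb": NODES * MEMORY_GB_PER_NODE,
        "memory_gb_per_cpu_core": MEMORY_GB_PER_NODE / cores_per_node,
        "gpus": gpus,
        "gpu_cores": gpus * CORES_PER_GPU,
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value}")
```

Under these figures the cluster has 48 CPU cores, 192 GB of main memory (4 GB per CPU core), and 12 GPUs with 2880 GPU cores in total.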
System Software/Libraries :-
Fedora 10 (Cambridge) Operating System
- Linux x86_64 Platform
GNU Compiler Collection
Application Software/Libraries :-
Nvidia Software Development Tools
Intel Software Suites
- Intel C++ Compiler Professional Edition for Linux
- Intel Fortran Professional Edition for Linux
MPICH2
Portland Group Inc Compilers
Workload Management :-
Portable Batch System Professional (version 10.0)
CUDA Programming Tips for MPI Programmer
On Using Multiple CPU Threads to Manage Multiple GPUs under CUDA
Recent Activities on CUDA Programming

CUDA Workshop

Location of Tesla Cluster

CPU Room, Ground Floor, SERC.

Hostname of the machine

tesla1

Accessing the system

The Tesla cluster has one login node, tesla1, through which users can access the cluster and submit jobs. The machine is accessible for login using ssh from inside the SERC network. Access is granted after applying for HPC access, as follows:

Collect and fill in the HPC application form available at Room 103, SERC, or download the HPC application form here.

The HPC application form must be duly signed by your Advisor/Research Supervisor.

Helpdesk
For any queries, email helpdesk@serc.iisc.ernet.in or contact the SysAdmin at #109, SERC.