Multi-Architecture Comparison System

Isambard MACS hosts many nodes of different architectures:

Most MACS nodes run Red Hat Enterprise Linux 7 with the Cray software stack. Power 9 nodes run Red Hat Enterprise Linux (Little-Endian) 7 with IBM compilers & IBM Power AI.

All nodes are connected via 100 Gigabit EDR InfiniBand. The login nodes are connected to the Internet via a 10 Gigabit link to the Janet network.

IBM Power 9

Warning

Service unavailable as of 7 Oct 2020: both nodes have crashed and require in-person intervention to be rebooted.

Isambard’s Power nodes comprise two IBM Power System AC922 servers, which are representative of a type of node used in large-scale HPC. Each node has the IBM XL C/C++ (xlc) & IBM XL Fortran (xlf) compilers installed. To make full use of each node’s two Nvidia V100 “Volta” GPUs, we have installed the IBM PowerAI stack for Machine Learning research.

Note

These nodes are available interactively via SSH (i.e. without submitting to the scheduler). Access is available only from login-01 or login-02, since these nodes are connected only to the InfiniBand network.

ssh power-001
ssh power-002
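Because the nodes are reached directly over SSH, it can help to check which of them responds before starting a session. The following is a minimal sketch, assuming it is run on login-01 or login-02 with key-based SSH access already set up. The function name check_power_nodes and the SSH_CMD override are hypothetical and are not part of the Isambard environment.

```shell
# Hypothetical helper: report which Power 9 nodes currently accept SSH connections.
# Intended to be run on login-01 or login-02.
# SSH_CMD is an assumed override hook so the function can be exercised without ssh.
check_power_nodes() {
    for node in power-001 power-002; do
        # BatchMode=yes prevents password prompts; ConnectTimeout bounds the wait in seconds.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$node" true 2>/dev/null; then
            echo "$node: reachable"
        else
            echo "$node: unreachable"
        fi
    done
}
```

Calling check_power_nodes prints one line per node. A node reported as unreachable may be down, or may simply not accept non-interactive logins for your account.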

Hardware

There are two nodes, power-001 & power-002, each with two Power 9 CPU sockets. Each socket is attached to an Nvidia V100 GPU via NVLink at approx. 150 GB/s, with coherent access to the 280 GB of system memory. There is an X-Bus link between the two GPUs.

External documentation: IBM Power System AC922 Technical Overview - IBM Redbooks

PowerAI frameworks are installed under /opt/DL. Add a framework to your environment by running source /opt/DL/<framework>/bin/<framework>-activate, unless the Source column states otherwise:

Framework         Source
bazel
caffe
caffe-bvlc
caffe-ibm
cudnn
ddl               source /opt/DL/ddl/bin/ddl-pack-activate
ddl-tensorflow
hdf5
mldl-spectrum
nccl2
openblas
protobuf
pytorch
snap-ml-local
snap-ml-mpi
tensorboard
tensorflow        depends on anaconda
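The activation rule for the framework list above can be wrapped in a small shell helper. This is a sketch only: the function names powerai_activate_path and powerai_activate are hypothetical and not provided by PowerAI, and the helper does not set up the anaconda dependency noted for tensorflow.

```shell
# Hypothetical helper (not part of PowerAI): print the activation script path
# for a framework installed under /opt/DL.
powerai_activate_path() {
    case "$1" in
        # ddl is the only listed framework with a different script name.
        ddl) echo "/opt/DL/ddl/bin/ddl-pack-activate" ;;
        *)   echo "/opt/DL/$1/bin/$1-activate" ;;
    esac
}

# Hypothetical wrapper: source the script into the current shell if it is
# readable, otherwise print an error and return a non-zero status.
powerai_activate() {
    script="$(powerai_activate_path "$1")"
    if [ -r "$script" ]; then
        . "$script"
    else
        echo "activation script not found: $script" >&2
        return 1
    fi
}
```

On a Power node, powerai_activate pytorch would then be equivalent to running source /opt/DL/pytorch/bin/pytorch-activate by hand. The functions must be defined in the current shell (for example, pasted into it or sourced from a file) so that the activation affects your session.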