ATOMIN Cluster Software Info
COMPILERS AND OTHER SOFTWARE
The standard Linux GNU compilers (version 6.3) are, of course, available.
A lot of other software, such as Matlab, Mathematica, GSL and some useful
libraries, can also be found. If something is not installed, please ask the
administrator. Matlab is kept at version R2016b, as the newer releases cannot
work headless at all!
The same software is installed on all machines, including MKL (the Intel Math
Kernel Library, which contains, among other things, highly optimized LAPACK
and FFTW3 routines) and the Intel MPI implementation. Both are license-free
libraries provided by Intel and reside in the /opt/intel/mkl and /opt/intel/mpi
directories, respectively.
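For illustration only (a sketch - the exact library list depends on the chosen
interface and threading layer, and the paths assume the usual MKL layout), a
program calling LAPACK or FFTW routines from MKL could be linked against the
single dynamic library mkl_rt:
gcc -O2 program.c -o program -I/opt/intel/mkl/include -L/opt/intel/mkl/lib/intel64 -lmkl_rt -lpthread -lm -ldl
At run time the same lib directory may need to be added to LD_LIBRARY_PATH.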
OPENMP
All GNU compilers support multi-threading via the OpenMP extension, which
enables parallel (multithreaded) jobs within a single node - up to 96 threads
on the large nodes (64 on the small ones). Some trial-and-error testing is
needed to check whether the most efficient set-up is obtained with slightly
fewer threads than the number of available cores (the operating system itself
sometimes needs quite a lot of computing power, especially with active
glusterfs connections).
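As a minimal sketch (the file name and the thread count below are arbitrary
examples, not tested recommendations), an OpenMP program can look as follows
and is compiled with the GNU compiler's -fopenmp flag:

/* omp_hello.c - each thread reports its id */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* the number of threads is taken from the OMP_NUM_THREADS
       environment variable (or defaults to the number of cores) */
    #pragma omp parallel
    {
        printf("thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}

Compiled and run, for example leaving a few cores for the system:
gcc -fopenmp -O2 omp_hello.c -o omp_hello
OMP_NUM_THREADS=90 ./omp_hello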
INTEL MPI
Openmp is contained within single node - if even more computing power is needed
then one can perform parallel jobs via MPI (Message Passing Interface) which
should also perform well since our cluster is connected via both Ethernet and
Infiniband that delivers our fast interconnect. Intel MPI has been tested and
found out to behave quite well. It can be found in /opt/intel/mpi directory.
For jobs demanding more than a single machine, Intel MPI requires a file called
mpd.hosts specifying the nodes on which the program should run. It should
contain a list of nodes, each with the appropriate number of processes, in a
form similar to:
complex01:96
complex02:96
complex03:96
complex04:96
complex05:96
complex06:96
complex07:64
complex08:64
This file needs to be created in the working directory of the program, i.e.,
the one from which mpirun is invoked. There is also an option to the mpirun
command which allows a file other than the default one to be used. In the
multi-node case one also needs to define the communication channel; for Intel
MPI and our cluster this is "-r ssh". Another parameter useful for
communication optimization, "-genv I_MPI_DEVICE rdssm", selects mixed (hybrid)
communication between cores (shared memory + Infiniband).
The full command for running such an MPI job (still with Intel MPI) is:
/opt/intel/mpi/intel64/bin/mpirun -r ssh -genv I_MPI_DEVICE rdssm -np total_core_number ./program_name
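For completeness, a minimal MPI program that could be started this way is
sketched below (the file name mpi_hello.c is arbitrary); assuming the standard
Intel MPI layout it can be compiled with the mpicc wrapper from the same bin
directory:

/* mpi_hello.c - each process reports its rank */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);               /* start the MPI runtime     */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* number of this process    */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes */
    printf("process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

/opt/intel/mpi/intel64/bin/mpicc -O2 mpi_hello.c -o program_name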
PBS ADVICE: since the queue system assigns the nodes dynamically, the mpd.hosts
file cannot be created in advance. Instead, the system creates a file
containing all assigned nodes and passes its name in the environment variable
PBS_NODEFILE. This file needs to be used instead of the default mpd.hosts or,
alternatively, one may copy it to mpd.hosts within the batch script before
running mpirun.
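For example, the relevant fragment of a batch script could simply be (a sketch
only; total_core_number still has to match the resources requested from the
queue system):
cp $PBS_NODEFILE mpd.hosts
/opt/intel/mpi/intel64/bin/mpirun -r ssh -genv I_MPI_DEVICE rdssm -np total_core_number ./program_name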
OTHER MPI SOFTWARE
All other MPI-related software (libraries etc.) may be listed (and also
managed) via:
mpi-selector --list
The description of all that software is far too long to include here; for
details contact an administrator and be prepared for a LONG read.
QUEUE SYSTEM (SLURM)
At the moment the following queues are defined:
Q.name   Node no.   Cores per node   RAM available per node   Walltime
bigone   1-6        96               256 GB                   Infinite
small    1-2        64               128 GB                   Infinite
The "small" queue is served by complex07 and 08 only, whereas complex01 to
complex06 serve the bigone.
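As a sketch only (assuming the queue names above correspond to SLURM partitions;
the script name and resource numbers are merely examples), a job using both
small nodes could be submitted with something like:
sbatch -p small -N 2 -n 128 job_script.sh
where job_script.sh is the batch script that eventually calls mpirun as
described above.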