Changes and impacts related to Jean Zay H100 extension

This page gives you an overview of the operations in progress to bring the Jean Zay H100 extension into service. The information given here will evolve over time and we invite you to consult it regularly.

The extension houses new front ends and compute nodes. The compute nodes have:

  • 2 Intel Xeon Platinum 8468 processors (48 cores at 2.10 GHz), i.e. 96 cores per node
  • 4 Nvidia H100 SXM5 GPUs with 80 GB of memory each
  • 512 GB of memory

This extension includes new, larger and faster disk spaces using a Lustre file system. Since mid-August, these have replaced the old disk spaces which used an IBM Spectrum Scale file system. The data recorded on the old file system (Spectrum Scale) was copied/moved to the new one (Lustre) by IDRIS for the HOME, WORK, ALL_CCFRWORK, STORE and ALL_CCFRSTORE disk spaces. On the other hand, copying the temporary spaces SCRATCH and ALL_CCFRSCRATCH was the responsibility of each user.

Note that the storage volume (in bytes) of the WORK spaces has been increased on this occasion.

Important changes since October 1, 2024

QoS name changes for the A100 partition

In order to manage the resource sharing of the machine more precisely, specific QoS have been defined for the A100 partition. If you used to explicitly specify “qos_gpu-t3” or “qos_gpu-dev” in your Slurm jobs targeting the A100 partition, you now have to use “qos_gpu_a100-t3” or “qos_gpu_a100-dev” instead. Note that the “qos_gpu_a100-t3” QoS is used by default and may be omitted.
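
For instance, a job which previously targeted the A100 development QoS could be adapted as follows (a minimal sketch, assuming the usual “-C a100” constraint to target the A100 partition):

#SBATCH -C a100                      # to target A100 nodes
##SBATCH --qos=qos_gpu-dev           # old QoS name, no longer valid for the A100 partition
#SBATCH --qos=qos_gpu_a100-dev       # new QoS name (qos_gpu_a100-t3 is the default and may be omitted)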

The CPU and V100 partitions are not affected by these changes.

The on-line documentation has been updated: http://www.idris.fr/eng/jean-zay/gpu/jean-zay-gpu-exec_partition_slurm-eng.html#available_qos

Use of QoS through JupyterHub

If you wish to specify a QoS when using Slurm on JupyterHub, you now have to do it manually in the “Extra #SBATCH directives” field.
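
For example, to select the A100 development QoS from JupyterHub, you could add the following line in that field (an illustration only; adapt the QoS name to the partition you are targeting):

#SBATCH --qos=qos_gpu_a100-dev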

JupyterHub IP address change

The IP address of our JupyterHub instance has been modified. It is now 130.84.132.56. This change might impact you if your institution applies IP address filtering to outgoing connections. If you run into difficulties when connecting to JupyterHub, we invite you to contact your local administrator to mention this change.

As a reminder, the set of IP addresses used for the IDRIS machines and services is the following: 130.84.132.0/23. We recommend authorizing the complete range rather than specific IP addresses so as not to be affected by potential future internal changes to our infrastructure.

Opening of the H100 partition

Users who were already granted H100 computing hours may now use them. An example submission script is as follows:

#!/bin/bash
#SBATCH --job-name=my_job            # job name
#SBATCH -A xyz@h100                  # account to use, with xyz the 3 letter code of your project
#SBATCH -C h100                      # to target H100 nodes
# Example reservation of 3x24 = 72 CPUs (for 3 tasks) and 3 GPUs (1 GPU per task) on one node:
#SBATCH --nodes=1                    # number of nodes
#SBATCH --ntasks-per-node=3          # number of MPI tasks per node (= number of GPU requested per node here)
#SBATCH --gres=gpu:3                 # number of GPU requested per node (max. 4 for H100 nodes)
# Since here only one GPU per task is requested (i.e., 1/4 of the available GPUs)
# the best way to proceed is to book 1/4 of the node's CPU for each task:
#SBATCH --cpus-per-task=24           # number of CPU per task (1/4 of the CPUs here)
# /!\ Caution, "multithread" in Slurm vocabulary refers to hyperthreading.
#SBATCH --hint=nomultithread         # hyperthreading deactivated
 
# to use the modules compatible with this H100 partition.
module purge
module load arch/h100
...

Note that the default modules are not compatible with the H100 partition. In order to use the software environment dedicated to this partition, you need to load the “arch/h100” module: http://www.idris.fr/eng/jean-zay/cpu/jean-zay-cpu-doc_module-eng.html#modules_compatible_with_gpu_p6_partition. This is needed in your submission scripts, but also in your shell when compiling code.
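
For example, before compiling code for the H100 partition, you can load the dedicated environment in your shell as follows (a minimal sketch; the modules to load afterwards depend on your code):

$ module purge
$ module load arch/h100
$ module avail              # lists the modules available in the H100 software environment
$ module load ...           # then load the compilers/libraries your code needs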

If you do not have H100 computing hours yet, your project manager may ask for supplementary hours (“au fil de l'eau”) on the eDARI portal if necessary.

Modification of access to the STORE

Modified access to the STORE went into effect on July 22nd, 2024. Read/write access to the STORE is therefore no longer possible from the compute nodes, but it is still possible from the front-end nodes and the nodes of the “prepost”, “visu”, “compil” and “archive” partitions. We invite you to modify your jobs if you access the STORE space directly from the compute nodes. To guide you, examples have been added at the end of our documentation about multi-step jobs.
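
As a simplified illustration (a hedged sketch, not the official IDRIS example; “my_dataset” and “my_compute_job.slurm” are placeholders), the data can be staged out of the STORE in a first job running on the “prepost” partition, with the compute job chained after it via a Slurm dependency:

$ cat copy_from_store.slurm
#!/bin/bash
#SBATCH --job-name=copy_from_store
#SBATCH --partition=prepost          # the prepost nodes keep access to the STORE
#SBATCH --ntasks=1
#SBATCH --hint=nomultithread
#SBATCH --time=02:00:00
cp -r $STORE/my_dataset $SCRATCH/    # stage the input data onto the SCRATCH

$ JOBID=$(sbatch --parsable copy_from_store.slurm)
$ sbatch --dependency=afterok:$JOBID my_compute_job.slurm   # the compute job starts once the copy succeeds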

This change was made because the volume of the STORE cache (on rotating disks) will be reduced in favor of the volume of the WORK space. We will no longer be able to guarantee the redundancy of STORE data on both magnetic tapes and rotating disks, as was the case previously. The presence of the data on rotating disks allows relatively fast reading/writing. With the reduction in the volume of the cache, in some cases the data might only be stored on magnetic tape (with two copies on different tapes to guarantee data redundancy); if there were direct access to the STORE, this would significantly degrade access times to the data and consequently the performance of your calculations.

As a reminder, the STORE is a space dedicated to the long-term storage of archived data.

HOME, WORK and STORE copies

IDRIS was responsible for copying the data stored in HOME, WORK, ALL_CCFRWORK, STORE and ALL_CCFRSTORE.

WARNING: IDRIS simply made unmodified copies, which could cause your executions to malfunction, particularly if you use symbolic links (like those we recommend using for personal Python environments, for example). Indeed, these links are no longer valid because the new directory paths are different from the old ones.

Special case for the HOME

The migration of the HOME spaces was completed by July 30th, 2024. We invite you to check your scripts in order to correct any hard-coded paths. Any path of the form /gpfs7kw/linkhome/… should become /linkhome/… or, if possible, be replaced by the $HOME environment variable. If you are using symbolic links, as we recommend for personal Python environments, please recreate them so that they point to the new directories.
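
For example, hard-coded paths can be located and updated as follows (a hedged illustration; “my_scripts” is a placeholder directory, and using $HOME is preferable to rewriting absolute paths):

$ grep -rl '/gpfs7kw/linkhome' ~/my_scripts/
$ sed -i 's|/gpfs7kw/linkhome|/linkhome|g' ~/my_scripts/*.slurm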

Special case for the WORK

The migration of the WORK spaces was finished on August 13th, 2024. The QoS “qos_cpu-t4” and “qos_gpu-t4”, which permit running jobs of more than 20 hours, are functional again.

The absolute paths of the WORK spaces changed during the migration (see the new value of the $WORK variable) but, to simplify the transition, symbolic links were put in place so that the former absolute paths remain functional, at least to begin with. Now that the migration is complete, we invite you to modify any absolute paths starting with /gpfswork/… or /gpfsdswork/projects/… which could appear in your scripts (replacing them with the $WORK environment variable if possible) or in your symbolic links, so that the old directories are no longer used.
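
For example, the following command can help locate symbolic links that still point to the old WORK (a hedged illustration; adjust the directories and search depth to your setup):

$ find $HOME $WORK -maxdepth 2 \( -lname '/gpfswork/*' -o -lname '/gpfsdswork/*' \) 2>/dev/null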

Special case for the STORE

Concerning the STORE, the migration was finalized on July 25th, 2024 and now the usual $STORE variable references the new Lustre file system. A $OLDSTORE variable was created to reference the space on the old Spectrum Scale file system.

# references the STORE on the Lustre filesystem
$ echo $STORE
/lustre/fsstor/projects/rech/...
 
# references the old STORE (read-only) on the Spectrum Scale filesystem
$ echo $OLDSTORE
/gpfsstore/rech/...

WARNING: Read-only access to the old STORE space via the $OLDSTORE environment variable will be removed at the end of November 2024. The $OLDSTORE variable will then no longer be defined. Note that read and write access to the STORE is not possible from the compute nodes but only from the front-end nodes and the nodes of the “prepost”, “visu”, “compil” and “archive” partitions. We invite you to modify your jobs if you access the OLDSTORE space directly from the compute nodes. To guide you, our documentation has been updated with examples of multi-step jobs using the STORE.
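
For example, any data still needed from the old space can be copied before the removal, from a front-end node or a node of the “prepost” or “archive” partitions (a hedged illustration; “important_results” is a placeholder):

$ cp -r $OLDSTORE/important_results $STORE/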

SCRATCH and ALL_CCFRSCRATCH

Since Tuesday, September 3rd, the SCRATCH environment variable (and its variants like ALL_CCFRSCRATCH) references the new SCRATCH on the Lustre filesystem. The old SCRATCH space on the Spectrum Scale filesystem is no longer accessible.

The NEWSCRATCH environment variable (and its variants) will eventually be removed, so we invite you to replace it with SCRATCH as soon as possible.

# references the new SCRATCH on the Lustre filesystem
$ echo $SCRATCH
/lustre/fsn1/projects/rech/...

Important: If, in your HOME directory, you have directories such as $HOME/.local and $HOME/.conda pointing towards the old WORK via symbolic links of the form /gpfswork/…, it is necessary to change these links so that they point to the new WORK with the form /lustre/fswork/….

$ cd $HOME
# Here .local is a link towards the old WORK
$ ls -al .local
 ...  .local -> /gpfswork/... 
# Erase the link
$ unlink .local
# Recreate the link with the $WORK variable, which refers to the new WORK
$ ln -s $WORK/.local $HOME
# Link towards the new WORK
$ ls -al .local
 ...  .local -> /lustre/fswork/projects/rech/...