
How do I transfer data between the three national centres (CCFR network)?
Introduction
The CCFR (Centres de Calcul Français) network is a dedicated very high-speed network that interconnects the three French national computing centres: CINES, IDRIS and TGCC. It is made available to users to facilitate data transfers between the national centres. The machines currently connected to this network are Joliot-Curie at TGCC, Jean Zay at IDRIS, and Adastra at CINES.
To use this network, you must have a login (different for each centre) in at least two of the three centres, and each of these logins must be authorized to access the CCFR network in the centre concerned.
Comments:
- For your IDRIS login, access to the CCFR network can be requested:
  - when you request the creation of an account from the eDARI portal,
  - or by sending an e-mail from your address known to IDRIS with the subject « CCFR: IDRIS login / your name ». This information will be transmitted to the two other centres so that they can make your access operational.
- Moreover, not all of the Jean Zay nodes are connected to this network. To use it from IDRIS, you can use the front-end nodes jean-zay.idris.fr and jean-zay-pp.idris.fr.
For more information, please contact the User Support Team.
Data transfers via CCFR network
Data transfers between the machines of the centres are the main service of the CCFR network. A wrapper command, ccfr_cp, is provided through a modulefile to simplify its use:
$ module load ccfr
The ccfr_cp command automatically retrieves the connection information of the specified machine (domain name, port number) and detects the available authentication methods. By default, the command uses basic authentication, following the usual methods in force on the target machine.
The ccfr_cp command is based on the rsync tool and is configured to use the SSH protocol for transfers. The copy is recursive and preserves symbolic links, access rights and file modification dates.
The details of the command and the list of the machines accessible on the CCFR network are displayed by passing the -h option to the ccfr_cp command.
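In a transfer script, it can be useful to check that the wrappers are actually available after `module load ccfr` before starting anything. The sketch below is only an illustration: `require_commands` is a hypothetical helper name, not something provided by the ccfr module.

```shell
# Hypothetical helper (not part of the ccfr module):
# return success only if every named command is found on PATH.
require_commands() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf 'missing command: %s\n' "$cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible use at the top of a script, after `module load ccfr`:
#   require_commands ccfr_cp ccfr_sync || exit 1
```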
For transfers from jean-zay to the CINES and TGCC machines, you can use commands similar to these:
$ module load ccfr
$ ccfr_cp /path/to/datas/on/jean-zay login_cines@adastra:/path/to/directory/on/adastra:
$ ccfr_cp /path/to/datas/on/jean-zay login_tgcc@irene:/path/to/directory/on/irene:
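When several directories have to be sent to the same destination, the ccfr_cp calls can be wrapped in a small loop. The following is a minimal sketch under stated assumptions: `copy_dirs` and the `CCFR_CP` override variable are hypothetical names introduced here, and the loop simply calls ccfr_cp with a source and a destination as in the commands above.

```shell
# Assumption: CCFR_CP is a hypothetical override, handy for previewing
# (CCFR_CP=echo prints the arguments instead of transferring anything).
CCFR_CP=${CCFR_CP:-ccfr_cp}

# copy_dirs DEST SRC... : copy each SRC to DEST, stopping at the first failure.
copy_dirs() {
    dest=$1
    shift
    for src in "$@"; do
        "$CCFR_CP" "$src" "$dest" || {
            printf 'transfer failed for %s\n' "$src" >&2
            return 1
        }
    done
}

# Possible use (paths and login are placeholders):
#   copy_dirs login_cines@adastra:/path/to/directory/on/adastra /path/a /path/b
```

With password authentication, each call may prompt for the password again.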
For transfers from Adastra, the procedure is similar, except that you must use the machine adastra-ccfr.cines.fr (accessible from adastra.cines.fr), as shown in the CINES documentation.
For transfers from Joliot-Curie, the procedure is also similar and can be carried out directly from the front-end nodes irene-fr.ccc.cea.fr. After connecting to the machine, the machine.info command will give you all the useful information.
The ccfr_sync command, a variant of ccfr_cp, performs a strong synchronisation between the source and the destination: in addition to what ccfr_cp does, it deletes the destination files that are no longer present in the source. The -h option is also available for this command.
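Because a strong synchronisation deletes files, it helps to be clear about which files are affected: those that exist in the destination but not in the source. The sketch below illustrates this rule on two local directories only. It does not call ccfr_sync and is not how ccfr_sync is implemented; `list_extra_files` is a hypothetical helper name.

```shell
# Hypothetical helper: list the files found under DEST but absent from SRC,
# i.e. the files a strong synchronisation from SRC to DEST would delete.
# Both arguments are local directories in this illustration.
list_extra_files() {
    src=$1
    dest=$2
    ( cd "$dest" && find . -type f | LC_ALL=C sort ) | while IFS= read -r f; do
        # Print the path (without the leading "./") if SRC has no such entry.
        [ -e "$src/$f" ] || printf '%s\n' "${f#./}"
    done
}

# Possible use:
#   list_extra_files /path/to/source /path/to/local/copy
```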
Remark: These commands use basic password authentication, in compliance with the terms and conditions in force at the remote centre (CINES or TGCC). You will therefore most likely have to enter a password for each transfer. To avoid this, you can use IDRIS transfer-only certificates (valid for 7 days); the instructions for their use are given on the IDRIS website. With such certificates, you must initiate the transfers from the remote machine, adastra-ccfr.cines.fr (accessible from adastra.cines.fr) for CINES or irene-fr.ccc.cea.fr for TGCC, after copying the transfer-only certificate to that machine, and you must build the rsync transfer commands yourself (so do not use the ccfr_cp and ccfr_sync wrappers). The following examples can serve as a starting point:
# Simple copy from jean-zay to the remote machine (initiated on the remote machine)
# using the transfer-only certificate stored in ~/.ssh/id_ecc_rsync on the remote machine
$ rsync --human-readable --recursive --links --perms --times --omit-dir-times -v \
    -e 'ssh -i ~/.ssh/id_ecc_rsync' \
    [email protected]:/path/on/jean-zay /path/on/adastra/or/irene

# Strong synchronization (--delete option) from jean-zay to the remote machine (initiated on the remote machine)
# using the transfer-only certificate stored in ~/.ssh/id_ecc_rsync on the remote machine
$ rsync --human-readable --recursive --links --perms --times --omit-dir-times -v --delete \
    -e 'ssh -i ~/.ssh/id_ecc_rsync' \
    [email protected]:/path/on/jean-zay /path/on/adastra/or/irene
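If you run such rsync commands often, the options can be gathered in a small shell function. The sketch below reuses the options from the examples above and adds rsync's standard `--dry-run` option, which lists what would be transferred or deleted without changing anything. The function name `pull_from_jean_zay` and the `RSYNC` and `KEY` variables are hypothetical names chosen for this illustration.

```shell
# Assumptions: RSYNC and KEY are hypothetical override variables.
RSYNC=${RSYNC:-rsync}
KEY=${KEY:-$HOME/.ssh/id_ecc_rsync}

# pull_from_jean_zay [--delete] [--dry-run] SRC DEST
# SRC is the remote source on jean-zay (login@host:/path), DEST a local path.
pull_from_jean_zay() {
    extra=""
    while [ $# -gt 2 ]; do
        case $1 in
            --delete|--dry-run) extra="$extra $1" ;;
            *) printf 'unknown option: %s\n' "$1" >&2; return 2 ;;
        esac
        shift
    done
    if [ $# -ne 2 ]; then
        printf 'usage: pull_from_jean_zay [--delete] [--dry-run] SRC DEST\n' >&2
        return 2
    fi
    # $extra is intentionally unquoted: it only holds the fixed flags above.
    "$RSYNC" --human-readable --recursive --links --perms --times \
        --omit-dir-times -v $extra \
        -e "ssh -i $KEY" "$1" "$2"
}
```

Running the function once with `--delete --dry-run` before the real synchronization is a cautious way to check which destination files would be removed.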
Warning: On adastra-ccfr.cines.fr, the id_ecc_rsync certificate must be visible from your directory /home/login_cines/.ssh so that the ssh command can use it (no environment variable is defined for this disk space). You must therefore take care to extract the certificate archive into this directory with a command such as:
[email protected]:~$ unzip ~/transfert_certif.zip -d /home/login_cines/.ssh
Archive:  /lus/home/.../transfert_certif.zip
  inflating: /home/login_cines/.ssh/id_ecc_rsync
  inflating: /home/login_cines/.ssh/id_ecc_rsync.pub
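After extracting the archive, it may be worth checking the permissions of the private key, since OpenSSH refuses to use a private key file that other users can read. This step is not described in the instructions above; the sketch below is a suggestion, and `secure_key` is a hypothetical helper name.

```shell
# Hypothetical helper: make a private key and its directory owner-only.
secure_key() {
    key=$1
    if [ ! -f "$key" ]; then
        printf 'key not found: %s\n' "$key" >&2
        return 1
    fi
    # 700 on the directory and 600 on the key are the usual ssh settings.
    chmod 700 "$(dirname "$key")" && chmod 600 "$key"
}

# Possible use on adastra-ccfr.cines.fr (login_cines is a placeholder):
#   secure_key /home/login_cines/.ssh/id_ecc_rsync
```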
Data transfers via parallel-sftp
To speed up file transfers between the three national centres, you can also use the parallel-sftp command. It is a tool developed by CEA that runs the standard sftp command over several connections in parallel.
You can find more information on the CEA website.
parallel-sftp is installed on the login nodes of the IDRIS jean-zay cluster. It is used in the same way as sftp, with the addition of the -n option, which lets you choose the number of ssh connections used for the parallel transfers.
For example, to make a parallel transfer with 5 ssh connections:
$ parallel-sftp -n 5 <remote_login>@<remote_host>
For example, if a single sftp transfer is limited to 1 Gbps, this parallel transfer can use at most 5 Gbps.
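The same reasoning gives a rough upper bound for any number of connections, and a lower bound on the transfer time. The sketch below only does this arithmetic, using whole numbers and assuming that every connection reaches its per-connection limit, which real transfers rarely do. The function names are hypothetical.

```shell
# max_aggregate_gbps CONNECTIONS PER_CONNECTION_GBPS
# Upper bound on the total rate: connections x per-connection limit.
max_aggregate_gbps() {
    echo $(( $1 * $2 ))
}

# min_transfer_seconds SIZE_GB CONNECTIONS PER_CONNECTION_GBPS
# Lower bound on the duration, with 1 GB = 8 Gb, rounded up to whole seconds.
min_transfer_seconds() {
    bits=$(( $1 * 8 ))
    rate=$(( $2 * $3 ))
    echo $(( (bits + rate - 1) / rate ))
}

# max_aggregate_gbps 5 1        # prints 5
# min_transfer_seconds 500 5 1  # prints 800
```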
Warning: for transfers between national centres, you must use the nodes connected to the CCFR network. From IDRIS, this means that the <remote_host> parameter above must be adastra-ccfr.cines.fr or irene-fr-ccfr.ccc.cea.fr. On the IDRIS side, you can run the parallel-sftp command directly from jean-zay.idris.fr.
From IDRIS, you can follow the example below, written for adastra (for irene, simply use irene-fr-ccfr.ccc.cea.fr instead):
# 1/ Connect to jean-zay:
$ ssh jean-zay_login@jean-zay.idris.fr

# 2/ Initiate the connection which allows transfers with the adastra machine,
#    using potentially 5 threads in this example
$ parallel-sftp -n 5 adastra_login@adastra-ccfr.cines.fr
sftp>
# You are then in an sftp environment in which you can execute
# put and get commands to make the transfers (see man sftp).
# WARNING: environment variables such as WORK, SCRATCH, STORE, ...
#          are not defined and therefore cannot be used to
#          define the paths to the data to be transferred.
#          You must indicate the full paths!

# 3/ Make transfers
# To transfer a directory from jean-zay to adastra, you can do:
sftp> put -r /path/to/jean-zay/src/directory /path/to/adastra/dest/directory
# To transfer a directory from adastra to jean-zay, you can do
# (get takes the remote path first, then the local path):
sftp> get -r /path/to/adastra/src/directory /path/to/jean-zay/dest/directory
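Since variables such as WORK are not defined inside the sftp session, one way to avoid typing full paths by hand is to let the local shell expand them beforehand and write the resulting put commands to a file. The sketch below does only that. `write_put_batch` is a hypothetical helper name, and while the standard sftp command can read such a file with its -b option, whether parallel-sftp accepts -b as well is an assumption to check with its -h option or documentation.

```shell
# Hypothetical helper: write one "put -r" line per local directory.
# write_put_batch BATCHFILE REMOTE_DIR LOCAL_DIR...
# Paths containing spaces are not handled in this sketch.
write_put_batch() {
    batch=$1
    remote=$2
    shift 2
    : > "$batch"
    for local_dir in "$@"; do
        case $local_dir in
            /*) printf 'put -r %s %s\n' "$local_dir" "$remote" >> "$batch" ;;
            *)  printf 'not an absolute path: %s\n' "$local_dir" >&2
                return 1 ;;
        esac
    done
}

# Possible use on jean-zay ("$WORK" is expanded here, by the local shell):
#   write_put_batch transfers.sftp /path/to/adastra/dest/directory "$WORK/run1" "$WORK/run2"
#   cat transfers.sftp   # full paths, ready to paste at the sftp> prompt
```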