Singularity environments

Singularity is a containerization environment: it virtualizes an operating system through an image built for that purpose (operating-system-level virtualization, also known as containerization).
This makes it possible to install modules independently of the base system the cluster runs on, and to deploy a ready-to-use environment for certain workloads such as AI.
On this machine, the tool is used to provide TensorFlow images.

TensorFlow

[homer@vision]$ module av tensorflow/
-------------------------------------- /zfs/softs/modulefiles --------------------------------------
tensorflow/sing_20.10-tf1-py3  tensorflow/sing_20.10-tf2-py3

References

Usage

Interactive mode

  1. Start an interactive job with qsub -I
  2. Load the module: module add tensorflow/sing_20.10-tf2-py3 (or tensorflow/sing_20.10-tf1-py3)
  3. Once the module is loaded, several variables are defined for working with the environment:
  • $SING_IMAGE: shows the path to the image in use
  • $SING_INSPECT: shows information about the container
  • $SING_EXEC: runs a command through the container. E.g.: $SING_EXEC ./tensorflow/tf2-simple.py ; $SING_EXEC ls -lhrt
  • $SING_SHELL: loads the environment and its shell.
  4. Work in the environment
  5. ...
Starting an interactive job with qsub -I and loading the module
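Before working in the environment, it can be useful to confirm that the module actually defined the four variables listed above. The following is only a minimal sketch: check_sing_vars is a hypothetical helper written for this page, not something provided by the module.

```shell
# Hypothetical helper (not part of the module): report which SING_* variables are set.
check_sing_vars() {
    missing=0
    for name in SING_IMAGE SING_INSPECT SING_EXEC SING_SHELL; do
        # Indirect lookup of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set (is the module loaded?)\n' "$name" >&2
            missing=1
        fi
    done
    # Non-zero status if at least one variable is missing
    return "$missing"
}
```

Calling check_sing_vars right after module add prints each variable with its value, and returns a non-zero status if any of them is missing.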
[homer@vision FIDLE]$ qsub -I -q gpuq -l select=1:ncpus=1:ompthreads=24:mem=24GB -l walltime=00:30:00
qsub: waiting for job 13181.vision to start
qsub: job 13181.vision ready

cd /zfs/home/gueguenm/pbs.13181.vision.x8z

[homer@vision ~]$ cd /zfs/home/gueguenm/pbs.13181.vision.x8z
[homer@vision pbs.13181.vision.x8z]$ cd $PBS_O_WORKDIR
[homer@vision]$ module add tensorflow/sing_20.10-tf2-py3
Loading tensorflow/sing_20.10-tf2-py3
  Loading requirement: singularity/3.6.4
[homer@vision]$ singularity --help

Linux container platform optimized for High Performance Computing (HPC) and
Enterprise Performance Computing (EPC)

Usage:
  singularity [global options...]

Description:
  Singularity containers provide an application virtualization layer enabling
  mobility of compute via both application and environment portability. With
  Singularity one is capable of building a root file system that runs on any
  other Linux system where Singularity is installed.
Viewing the container information
[homer@vision fidle]$ $SING_INSPECT
Cuda: = 11.1.0
DALI: = 0.26
DLProf: = 0.16.0
Horovod: = 0.20.0
JupyterLab: = 1.2.14
NCCL: = 2.7.8 (optimized for NVLink™ )
Nsight_Compute: = 2020.2.0.18
Nsight_Systems: = 2020.3.4.32
OpenMPI: = 3.1.6
Python: = 3.6
TensorBoard: = 1.15.0+nv
TensorRT: = 7.2.1
Ubuntu: = 18.04
cuBLAS: = 11.2.1
cuDNN: = 8.0.4
org.label-schema.build-date: Friday_6_November_2020_15:47:56_CET
org.label-schema.schema-version: 1.0
org.label-schema.usage: /.singularity.d/runscript.help
org.label-schema.usage.singularity.deffile.bootstrap: docker
org.label-schema.usage.singularity.deffile.from: nvcr.io/nvidia/tensorflow:20.10-tf1-py3
org.label-schema.usage.singularity.runscript.help: /.singularity.d/runscript.help
org.label-schema.usage.singularity.version: 3.6.4
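When only one of these values is needed (for instance the Python or CUDA version), the output can be filtered. The sketch below assumes the "Name: = value" and "name: value" layouts shown in the listing above; container_label is a hypothetical helper name, not a command provided by the module.

```shell
# Hypothetical helper: read $SING_INSPECT output on stdin and print the value of one label.
container_label() {
    # Split each line on ": ", match the label name exactly,
    # and drop the leading "= " that some entries carry.
    awk -v key="$1" -F': ' '$1 == key { sub(/^= /, "", $2); print $2 }'
}
```

For example, $SING_INSPECT | container_label Python would print 3.6 for the listing above.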
Starting the associated shell and running python/ipython
[homer@vision fidle]$ $SING_SHELL
Singularity> python
Python 3.6.9 (default, Oct  8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
Singularity> ipython
Python 3.6.9 (default, Oct  8 2020, 12:12:24)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.16.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: ls
AE/            MNIST/          VAE/
BHPD/          MNIST_PyTorch/  environment.yml
BHPD_PyTorch/  Misc/           fidle/
GTSRB/         README.ipynb    fidle_environment_linux.txt
IMDB/          README.md       fidle_environment_linux_gpu_cuda101.txt
IRIS/          SYNOP/          fidle_environment_windows10.txt
LinearReg/     Untitled.ipynb  fidle_environment_windows10_gpu_cuda101.txt
Example: running Python neural-network code
[homer@vision FIDLE]$ $SING_SHELL
Singularity> pwd
/zfs/home/ho/FIDLE
Singularity> export FIDLE_DATASETS_DIR=/zfs/home/ho/FIDLE/datasets
Singularity> jupyter-nbconvert --to python fidle/MNIST/02-CNN-MNIST.ipynb
[NbConvertApp] Converting notebook fidle/MNIST/02-CNN-MNIST.ipynb to python
[NbConvertApp] Writing 5044 bytes to fidle/MNIST/02-CNN-MNIST.py
Singularity> python 02-CNN-MNIST.py
2021-04-15 16:42:41.234381: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
<IPython.core.display.HTML object>
<IPython.core.display.Markdown object>
Version              : 2.0.23
Notebook id          : MNIST1
Run time             : Thursday 15 April 2021, 16:42:45
TensorFlow version   : 1.15.4
Keras version        : 2.2.4-tf
Datasets dir         : /zfs/home/gueguenm/FIDLE/datasets
Run dir              : ./run
Update keras cache   : False
x_train :  (60000, 28, 28, 1)
y_train :  (60000,)
x_test  :  (10000, 28, 28, 1)
y_test  :  (10000,)
Before normalization : Min=0, max=255
After normalization  : Min=0.0, max=1.0
Model: "sequential" 
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 8)         80
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 8)         0
_________________________________________________________________
dropout (Dropout)            (None, 13, 13, 8)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 16)        1168
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 16)          0
_________________________________________________________________
dropout_1 (Dropout)          (None, 5, 5, 16)          0
_________________________________________________________________
flatten (Flatten)            (None, 400)               0
_________________________________________________________________
dense (Dense)                (None, 100)               40100
_________________________________________________________________
dropout_2 (Dropout)          (None, 100)               0
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1010
=================================================================
Total params: 42,358
Trainable params: 42,358
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
WARNING:tensorflow:OMP_NUM_THREADS is no longer used by the default Keras config. 
...
Epoch 16/16
60000/60000 [==============================] - 0s 8us/sample - loss: 0.1122 - acc: 0.9661 - val_loss: 0.0404 - val_acc: 0.9862
Test loss     : 0.0404
Test accuracy : 0.9862
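The same conversion and run can probably be done without opening a shell in the container, by going through $SING_EXEC. This is a sketch only: it assumes $SING_EXEC accepts arbitrary commands (as its description in the variable list suggests), and run_notebook_in_container is a hypothetical helper name.

```shell
# Hypothetical helper: convert a notebook to a Python script and run it in the container.
run_notebook_in_container() {
    notebook=$1
    # nbconvert writes the .py file next to the notebook
    script=${notebook%.ipynb}.py
    # $SING_EXEC is left unquoted on purpose: it holds a command plus its options
    $SING_EXEC jupyter-nbconvert --to python "$notebook" || return 1
    $SING_EXEC python "$script"
}
```

Usage would be, for example, run_notebook_in_container fidle/MNIST/02-CNN-MNIST.ipynb, with FIDLE_DATASETS_DIR exported beforehand as in the session above.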

Batch mode: the runTensorflow script

The runTensorflow script loads the environment and creates the container directly through PBS Pro (see runTensorflow -h for help).

runTensorflow -V sing_20.10-tf2-py3 -g 2 -i ./tensorflow/tf2-simple.py

-bash-4.2$ qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
5817.vision       S2M              gueguenm          02:46:30 R calcul_big
5825.vision       tensorflow_1h    axians                   0 R gpuq
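The internals of runTensorflow are not shown on this page. Purely as an illustration, a hand-written PBS job doing something similar might look like the output of the sketch below. The helper name make_tf_job, the job name and the resource values are assumptions (the queue and resources are borrowed from the interactive example above), and the module command may need extra initialization in a batch job on some systems.

```shell
# Hypothetical helper: print a PBS job script that runs a Python script in the container.
make_tf_job() {
    script=$1
    module_name=${2:-tensorflow/sing_20.10-tf2-py3}
    # Escaped variables (\$...) are expanded when the job runs, not when it is generated
    cat <<EOF
#!/bin/bash
#PBS -N tensorflow_job
#PBS -q gpuq
#PBS -l select=1:ncpus=1:mem=24GB
#PBS -l walltime=00:30:00

cd "\$PBS_O_WORKDIR"
module add $module_name
\$SING_EXEC $script
EOF
}
```

For example, make_tf_job ./tensorflow/tf2-simple.py > tf_job.pbs writes the job file, which could then be submitted with qsub tf_job.pbs.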

Interactive mode with Jupyter notebooks

See https://forge.univ-poitiers.fr/projects/calculateur-vision/wiki/Jupyter_(python_notebook).
The TensorFlow containers can be loaded directly.

runJupyter -V tensorflow_sing_20.10-tf2-py3 -q gpuq

The notebook server used is the one from the Singularity container, but it still reads .conda/envs and therefore adds the various Anaconda kernels to the notebook interface.
Of course, inside the Singularity container only one kernel works: the leftmost one (Python3). The others will not work because the Anaconda3 module is not loaded.