Diku cluster
DIKU RootPainter Setup
This is a guide for getting started with the RootPainter software on the DIKU servers. It has been tested with Python 3.7. RootPainter is described in this paper.
- You will need to be able to SSH into slurm. If you don't already have an entry for slurm in your ssh config file, add the following to ~/.ssh/config, replacing KUID with your own KU ID.
Host slurm
Hostname a00552
User KUID
ProxyCommand ssh -q -W %h:%p KUID@ssh-diku-image.science.ku.dk
There is more information about working with slurm in the Slurm Wiki.
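If you would rather script this step, the entry above can be generated from a KU ID and appended only when it is missing. This is a minimal sketch: the helper name `print_slurm_entry` and the ID `abc123` are placeholders for illustration and are not part of RootPainter or the DIKU setup.

```shell
# Print the ssh config entry from this guide for a given KU ID.
# print_slurm_entry is a hypothetical helper name used for illustration.
print_slurm_entry() {
    kuid="$1"
    printf 'Host slurm\n'
    printf '    Hostname a00552\n'
    printf '    User %s\n' "$kuid"
    printf '    ProxyCommand ssh -q -W %%h:%%p %s@ssh-diku-image.science.ku.dk\n' "$kuid"
}

# Preview the entry for the placeholder ID abc123.
print_slurm_entry abc123

# To append it only if no "Host slurm" line exists yet, uncomment:
# grep -qx 'Host slurm' ~/.ssh/config 2>/dev/null || print_slurm_entry abc123 >> ~/.ssh/config
```

Previewing the output before appending makes it easy to spot a mistyped KU ID, which would otherwise only show up as a failed login.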
- SSH into the server to set up the server component of RootPainter.
ssh slurm
- Clone the RootPainter code from the repository and then cd into the trainer directory (the server component).
git clone --branch 0.2.4 https://github.com/Abe404/root_painter.git
cd root_painter/trainer
- To avoid altering global packages, I suggest using a virtual environment. Create a virtual environment and activate it.
python -m venv env
source ./env/bin/activate
- Install the dependencies in the virtual environment (this takes over 3 minutes).
pip install torch==1.3.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
- Run RootPainter to create the sync directory.
python main.py
You will be prompted to input a location for the sync directory. This is the folder where files are shared between the client and server. I will use ~/root_painter_sync. RootPainter will then create some folders inside ~/root_painter_sync.
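To confirm that this step worked, you can check for the folders this guide lists at the end (datasets, instructions and projects). The following is a small sketch under that assumption; `check_sync_dir` is a hypothetical helper name, and the path assumes you chose ~/root_painter_sync as above.

```shell
# check_sync_dir is a hypothetical helper: it reports whether the three
# folders named in this guide exist inside the given sync directory.
check_sync_dir() {
    sync_dir="$1"
    for name in datasets instructions projects; do
        if [ ! -d "$sync_dir/$name" ]; then
            echo "missing: $sync_dir/$name"
            return 1
        fi
    done
    echo "sync directory looks complete: $sync_dir"
}

# Prints a "missing" line if the directory has not been set up yet.
check_sync_dir "$HOME/root_painter_sync" || true
```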
- Create a slurm job. Create a file named job.sh and insert the following. Modify the details based on your preferences.
#!/bin/bash
# normal cpu stuff: allocate cpus, memory
#SBATCH --ntasks=1 --cpus-per-task=12 --mem=20000M
# we run on the gpu partition and we allocate 1 titanrtx gpu
#SBATCH -p gpu --gres=gpu:titanrtx:1
#We expect that our program should not run longer than 3 hours
#Note that a program will be killed once it exceeds this time!
#SBATCH --time=3:00:00
#your script, in this case: write the hostname and the ids of the chosen gpus.
hostname
echo $CUDA_VISIBLE_DEVICES
python main.py
- Run the slurm job.
sbatch job.sh
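As a variant of the submission step, you may want to keep the job ID so that you can check on the job afterwards. The sketch below assumes the standard Slurm behaviour: sbatch prints "Submitted batch job <id>", and since job.sh sets no --output option, the job's output goes to slurm-<id>.out in the directory you submitted from. `parse_job_id` is a hypothetical helper name. Use this instead of the plain command above, not in addition to it, or the job will be submitted twice.

```shell
# Extract the numeric job ID from sbatch's confirmation line,
# which has the form "Submitted batch job 12345".
# parse_job_id is a hypothetical helper name.
parse_job_id() {
    awk '/^Submitted batch job/ {print $4}'
}

# Only runs where Slurm is installed and job.sh is present (i.e. on the cluster).
if command -v sbatch >/dev/null 2>&1 && [ -f job.sh ]; then
    job_id="$(sbatch job.sh | parse_job_id)"
    if [ -n "$job_id" ]; then
        squeue -j "$job_id"    # state PD means pending, R means running
        echo "Output will be written to slurm-${job_id}.out"
    fi
fi
```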
- To mount the sync directory from your local machine you will need to install sshfs locally (SSH Filesystem client).
Debian / Ubuntu:
sudo apt-get install sshfs
OSX:
brew cask install osxfuse
Windows: See sshfs-win
- Create the directory and mount the drive locally using sshfs. Replace KUID with your own KU ID.
mkdir ~/Desktop/root_painter_sync
sshfs -o allow_other,default_permissions KUID@slurm:/home/KUID/root_painter_sync ~/Desktop/root_painter_sync
You should now be able to see the folders created by RootPainter (datasets, instructions and projects) inside ~/Desktop/root_painter_sync on your local machine. See the lung tutorial for an example of how to use RootPainter to train a model.
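When you are done, the mount can be released again. This is not covered by the original steps, so treat the following as a hedged sketch: on Linux a FUSE mount is normally released with `fusermount -u`, and on macOS with `umount`. `unmount_command` is a hypothetical helper that only prints the command for your platform so that you can review it before running it.

```shell
# unmount_command is a hypothetical helper: it prints the unmount command
# for a given OS name (as reported by `uname -s`) and mount point.
unmount_command() {
    os="$1"
    mount_point="$2"
    case "$os" in
        Linux)  echo "fusermount -u $mount_point" ;;
        Darwin) echo "umount $mount_point" ;;
        *)      echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Show the command for this machine without running it.
unmount_command "$(uname -s)" "$HOME/Desktop/root_painter_sync" || true
```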