Interactive access to GPUs

NOTE

Due to lack of demand, this procedure has been established for ad-hoc users for limited time frames. As such, it is not very polished and might need adjustments. We will improve it if demand increases. For the time being, only one user on the cluster is mapped to incoming ssh requests. The underlying resources are managed by Slurm, so the user will interact via Slurm client commands, e.g. 'srun'. Work in progress ...

Prerequisites

Users wishing to use special resources like GPUs should follow these steps:

  • Provide their ssh public key

NOTE: the key MUST be protected by a passphrase. We will proactively remove any key that is not.
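
For reference, a passphrase-protected key pair can be generated as in the sketch below (the ed25519 key type and the file name are only suggestions; 'ssh-keygen' will prompt for a passphrase, which must not be left empty). Send us only the public part (the '.pub' file), never the private key.

  # generate a new key pair; choose a non-empty passphrase when prompted
  ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_lhep
  # the file to send us is ~/.ssh/id_ed25519_lhep.pub (the public key)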

  • System access
  ssh atlasch020@ce01.lhep.unibe.ch
  • Start an interactive shell
  srun --partition=CLUSTER-GPU --gres=gpu:1 -t 100 --pty bash

This will land the user on the worker node wn-1-1, which is the special worker node we have that is equipped with GPUs. The '-t' flag reserves a runtime of 100 minutes; 2 GB of RAM are allocated by default. To tweak your resource request (e.g. '--mem-per-cpu='), please read the 'srun' docs: https://slurm.schedmd.com/srun.html
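
As a sketch, a request for two GPUs, several CPU cores, more memory and a longer runtime could look like the following; the values are placeholders to adjust to your needs and to what the partition allows.

  # 2 GPUs, 4 cores, 4 GB of RAM per core, 180 minutes of runtime
  srun --partition=CLUSTER-GPU --gres=gpu:2 --cpus-per-task=4 --mem-per-cpu=4G -t 180 --pty bash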

There is a local storage area of several hundred gigabytes. For interactive work, inputs, code, containers, etc. should be copied over to this area first. We encourage users to copy outputs out after every run to free up space on the local area. More complex data management schemes are possible and can be discussed according to the users' needs.
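
A typical session could stage data in and out as sketched below; the '/scratch' mount point and all file names are hypothetical placeholders, so please ask us for the actual path of the local area.

  # stage inputs, code, containers, etc. into the local area
  mkdir -p /scratch/$USER/myjob
  cp -r ~/inputs ~/code /scratch/$USER/myjob/
  cd /scratch/$USER/myjob
  # ... run your workload here ...
  # copy outputs back out and free up the local area
  cp -r outputs ~/results/
  rm -rf /scratch/$USER/myjob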

  • Run your code

At this stage, you can run your code/container interactively. The environment should be very much like any UI, and you should be able to make use of one or more GPUs.
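
For instance, you could first check that the allocated GPU(s) are visible and then launch your workload. The Apptainer invocation below is only an illustration: it assumes such a container runtime is available on the node, and the image and script names are placeholders.

  # list the GPU(s) allocated to this job
  nvidia-smi
  # run your code directly ...
  python train.py
  # ... or inside a container with GPU support
  apptainer exec --nv my_image.sif python train.py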