First example
Overview
Teaching: 15 min
Exercises: 5 minQuestions
How to run analyses on REANA cloud?
What are the basic REANA command-line client usage scenarios?
How to monitor my analysis using REANA web interface?
Objectives
Get hands-on experience with REANA command-line client
Overview
In this lesson we shall run our first simple REANA example. We shall see:
- structure of the example analysis and associated
reana.yaml
file - how to install of REANA command-line client
- how to connect REANA client to remote REANA cluster
- how to run analysis on remote REANA cluster
Checklist
Have you installed
reana-client
and/or have you logged into LXPLUS as described in Setup?
First REANA example
We shall get acquainted with REANA by means of running a sample analysis example:
Let’s start by cloning it:
$ git clone https://github.com/reanahub/reana-demo-root6-roofit
$ cd reana-demo-root6-roofit
What does the example do? The example emulates a typical particle physics analysis where the signal and background data is processed and fitted against a model. The example will use the RooFit package of the ROOT framework.
Four questions:
- Input data? None. We’ll simulate them.
- Analysis code? Two files:
gendata.C
macro generates signal and background data;fitdata.C
macro makes a fit for the signal and the background data. - Compute environment? ROOT with RooFit.
- Runtime procedures? Simple serial workflow: first run gendata, then run fitdata.
Workflow definition:
START
|
|
V
+-------------------------+
| (1) generate data |
| |
| $ root gendata.C ... |
+-------------------------+
|
| data.root
V
+-------------------------+
| (2) fit data |
| |
| $ root fitdata.C ... |
+-------------------------+
|
| plot.png
V
STOP
The four questions expressed in reana.yaml
fully define our analysis:
version: 0.6.0
inputs:
files:
- code/gendata.C
- code/fitdata.C
parameters:
events: 20000
data: results/data.root
plot: results/plot.png
workflow:
type: serial
specification:
steps:
- name: gendata
environment: 'reanahub/reana-env-root6:6.18.04'
commands:
- mkdir -p results && root -b -q 'code/gendata.C(${events},"${data}")'
- name: fitdata
environment: 'reanahub/reana-env-root6:6.18.04'
commands:
- root -b -q 'code/fitdata.C("${data}","${plot}")'
outputs:
files:
- results/plot.png
Note the basic structure of reana.yaml
answering the Four Questions. (input data? analysis code?
compute environment? workflow steps?)
Install REANA command-line client
First we need to make sure we can use REANA command-line client. Option 1: use locally on your laptop. Option 2: use preinstalled on LXPLUS. See setup instructions.
The client will offer several commands which we shall go through in this tutorial:
$ reana-client --help
Usage: reana-client [OPTIONS] COMMAND [ARGS]...
REANA client for interacting with REANA server.
Options:
-l, --loglevel [DEBUG|INFO|WARNING]
Sets log level
--help Show this message and exit.
Configuration commands:
ping Check connection to REANA server.
version Show version.
Workflow management commands:
create Create a new workflow.
delete Delete a workflow.
diff Show diff between two workflows.
list List all workflows and sessions.
Workflow execution commands:
logs Get workflow logs.
restart Restart previously run workflow.
run Shortcut to create, upload, start a new workflow.
start Start previously created workflow.
status Get status of a workflow.
stop Stop a running workflow.
validate Validate workflow specification file.
Workspace interactive commands:
close Close an interactive session.
open Open an interactive session inside the workspace.
Workspace file management commands:
download Download workspace files.
du Get workspace disk usage.
ls List workspace files.
mv Move files within workspace.
rm Delete files from workspace.
upload Upload files and directories to workspace.
Secret management commands:
secrets-add Add secrets from literal string or from file.
secrets-delete Delete user secrets by name.
secrets-list List user secrets.
You can use --help
option to learn more about any command, for example validate
:
$ reana-client validate --help
Usage: reana-client validate [OPTIONS]
Validate workflow specification file.
The `validate` command allows to check syntax and validate the reana.yaml
workflow specification file.
Examples:
$ reana-client validate -f reana.yaml
Options:
-f, --file PATH REANA specifications file describing the workflow and
context which REANA should execute.
--help Show this message and exit.
Exercise
Validate our
reana.yaml
file to discover any errors. Usevalidate
command to do so.
Solution
$ reana-client validate -f ./reana.yaml
File reana-demo-root6-roofit/reana.yaml is a valid REANA specification file.
Connect REANA client to remote REANA cluster
The REANA client will interact with a remote REANA cluster.
The REANA client knows to which REANA cluster it connects by means of two environment variables:
$ export REANA_SERVER_URL=https://reana.cern.ch
$ export REANA_ACCESS_TOKEN=xxxxxx
The REANA client connection to remote REANA cluster can be verified via ping
command:
$ reana-client ping
Connected to https://reana.cern.ch - Server is running.
The authentication uses a token that one can get by logging into REANA UI at reana.cern.ch.
Exercise
Get REANA user token and connect to REANA cluster at
reana.cern.ch
.
Solution
Login to reana.cern.ch, copy your token, and use the commands above.
Run example on REANA cluster
Now that we have defined and validated our reana.yaml
, and connected to the REANA production
cluster, we can run the example easily via:
$ reana-client run -w roofit
[INFO] Creating a workflow...
roofit.1
[INFO] Uploading files...
File code/gendata.C was successfully uploaded.
File code/fitdata.C was successfully uploaded.
[INFO] Starting workflow...
roofit.1 is running
Here, we use run
command that will create a new workflow named roofit
, upload its inputs as
specified in the workflow specification and finally start the workflow.
While the workflow is running, we can enquire about its status:
$ reana-client status -w roofit
NAME RUN_NUMBER CREATED STARTED STATUS PROGRESS
roofit 1 2020-02-17T16:01:45 2020-02-17T16:01:48 running 1/2
After a minute, the workflow should finish and we can list the output files in the remote workspace:
$ reana-client ls -w roofit
NAME SIZE LAST-MODIFIED
code/gendata.C 1937 2020-02-17T16:01:46
code/fitdata.C 1648 2020-02-17T16:01:47
results/plot.png 15450 2020-02-17T16:02:44
results/data.root 154457 2020-02-17T16:02:17
We can also get the logs:
$ reana-client logs -w roofit | less
==> Workflow engine logs
2020-02-17 16:02:10,859 | root | MainThread | INFO | Publishing step:0, cmd: mkdir -p results && root -b -q 'code/gendata.C(20000,"results/data.root")', total steps 2 to MQ
2020-02-17 16:02:23,002 | root | MainThread | INFO | Publishing step:1, cmd: root -b -q 'code/fitdata.C("results/data.root","results/plot.png")', total steps 2 to MQ
2020-02-17 16:02:50,093 | root | MainThread | INFO | Workflow 424bc949-b809-4782-ba96-bc8cfa3e1a89 finished. Files available at /var/reana/users/b57e902f-fd11-4681-8a94-4318ae05d2ca/workflows/424bc949-b809-4782-ba96-bc8cfa3e1a89.
==> Job logs
==> Step: gendata
==> Workflow ID: 424bc949-b809-4782-ba96-bc8cfa3e1a89
==> Compute backend: Kubernetes
==> Job ID: 53c97429-25e9-4b74-94f7-c665d93fdbc2
==> Docker image: reanahub/reana-env-root6:6.18.04
==> Command: mkdir -p results && root -b -q 'code/gendata.C(20000,"results/data.root")'
==> Status: finished
==> Logs:
...
We can download the resulting plot:
$ reana-client download results/plot.png -w roofit
$ display results/plot.png
Exercise
Run the example workflow on REANA cluster. Practice
status
,ls
,logs
,download
commands. For example, can you get the logs of the gendata step only?
Solution
$ reana-client logs -w roofit --step gendata
Key Points
Use
reana-client
rich command-line client to run containerised workflows from your laptop on remote compute cloudsBefore running analysis remotely, check locally its correctness via
validate
commandAs always, when it doubt, use the
--help
command-line argument