This lesson is in the early stages of development (Alpha version)

Reproducible analyses

Lesson on reproducible analyses and reusable containerised scientific workflows


Setup Install software required for the lesson
00:00 1. Introduction What makes research data analyses reproducible?
Is preserving code, data, and containers enough?
00:11 2. First example How to run analyses on the REANA cloud?
What are the basic REANA command-line client usage scenarios?
How to monitor my analysis using the REANA web interface?
00:31 3. Developing serial workflows How to write serial workflows?
What is declarative programming?
How to develop workflows progressively?
Can I temporarily override workflow parameters?
Do I always have to build a new Docker image when my code changes?
01:01 4. HiggsToTauTau analysis: serial Challenge: write the HiggsToTauTau analysis serial workflow and run it on REANA
01:26 5. Coffee break Coffee break
01:41 6. Developing parallel workflows How to scale up and run thousands of jobs?
What is a DAG?
What is the scatter-gather paradigm?
How to run Yadage workflows on REANA?
02:06 7. HiggsToTauTau analysis: parallel Challenge: write the HiggsToTauTau analysis parallel workflow and run it on REANA
02:36 8. A glimpse of advanced topics Can I publish workflow results on EOS?
Can I use Kerberos to access restricted resources?
Can I use CVMFS software repositories?
Can I dispatch heavy computations to HTCondor?
Can I dispatch heavy computations to Slurm?
Can I open Jupyter notebooks on my REANA workspace?
Can I connect my GitLab repositories with REANA?
02:56 9. Wrap-up What have we learned today?
Where to go from here?
03:01 Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.