This lesson is in the early stages of development (Alpha version)

Reproducible analyses: Glossary

Key Points

Introduction	Workflow is the new data. Data + Code + Environment + Workflow = Reproducible Analyses. Before reproducibility comes preproducibility.
First example	Use the `reana-client` rich command-line client to run containerised workflows from your laptop on remote compute clouds. Before running an analysis remotely, check its correctness locally with the `validate` command. As always, when in doubt, use the `--help` command-line argument.
Developing serial workflows	Develop workflows progressively; add steps as needed. When developing a workflow, stay on the same workspace. When developing bytecode-interpreted code, stay on the same container. Use smaller test data before scaling out. Use workflows as Continuous Integration; make atomic commits that always work.
HiggsToTauTau analysis: serial	Writing serial workflows is like chaining shell script commands.
Coffee break	Refresh your mind. Discuss your experience.
Developing parallel workflows	Computational analysis is a graph of inter-dependent steps. Fully declare inputs and outputs for each step. Use Scatter/Gather or Map/Reduce to avoid copy-paste coding.
HiggsToTauTau analysis: parallel	Use step dependencies to express main analysis stages. Use the scatter-gather paradigm in stages to massively parallelise DAG workflow execution. REANA usage scenarios remain the same regardless of workflow language details.
A glimpse on advanced topics	Workflow specification uses hints to hide implementation complexity. Use the `kerberos: true` clause to automatically trigger Kerberos token initialisation. Use the `resources` clause to access CVMFS repositories. Use the `compute_backend` hint in your workflow steps to dispatch jobs to various HPC/HTC backends. Use the `open/close` commands to open and close interactive sessions on your workspace. Enable the REANA application on GitLab to run long-standing tasks that would time out in GitLab CI.
Wrap-up	Experiment with containerised workflows to advance scientific reproducibility in your research.
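
The hints named under "A glimpse on advanced topics" might appear together in a `reana.yaml` file roughly as sketched below. This is only an illustration, not a specification taken from the lesson: the step name, container image, CVMFS repository and command are placeholder assumptions, and the accepted `compute_backend` values depend on how a given REANA deployment is configured.

```yaml
# reana.yaml: minimal sketch of a serial workflow using the three hints.
# All concrete values below are illustrative placeholders.
workflow:
  type: serial
  resources:
    cvmfs:
      - cms.cern.ch                 # assumed CVMFS repository to make available
  specification:
    steps:
      - name: list-software         # hypothetical step name
        environment: 'docker.io/example/analysis-image:1.0'  # hypothetical image
        kerberos: true              # request Kerberos token initialisation for this step
        compute_backend: htcondorcern  # assumed backend name; availability varies by deployment
        commands:
          - ls -l /cvmfs/cms.cern.ch/
```

Running `reana-client validate` against such a file before submitting it is a quick way to catch structural mistakes in the specification.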

Glossary

reproducible analysis

computational workflows