CAT services for GitLab CI
Overview
Teaching: 10 min
Exercises: 15 min
Questions
How can I more easily access CMS resources in GitLab CI?
Objectives
Demonstrate the use of the CAT EOS file service
Demonstrate the use of the CAT VOMS proxy service
The cms-analysis user code space
The Common Analysis Tools (CAT) group in CMS maintains a CERN GitLab area called cms-analysis, where anyone in CMS can store their analysis code. The area is documented in the CAT documentation pages. The area is organized in groups and subgroups, following the CMS Physics Coordination group structure. You can request the creation of an area in the PAG-specific group that best matches your analysis.
You can request an area for your analysis at any time
Bear in mind that it is always a good idea to keep your analysis code under version control. You can request an area for your code at any stage of your analysis, and we invite you to do so. The area can be created with a temporary name, which can later be changed to match the CADI line once your analysis is mature enough to have one.
The services described here only work in cms-analysis
For security reasons, the services described below only work if your project is in the cms-analysis namespace. You can move the project you have been using so far in this lesson to the cms-analysis namespace by going to Settings –> General –> Advanced –> Transfer project and selecting cms-analysis / CMSDAS / CAT-tutorials as the new namespace. The cms-analysis / CMSDAS / CAT-tutorials namespace is intended for testing only. In general, you should select a target namespace in the relevant POG/PAG subgroup.
Using the CAT EOS file service
CAT has a service account, cmscat, that is in the zh group and is a member of the CMS VO. CAT provides a service that lets a GitLab CI job request an EOS token, so that the job can access CMS files on EOS on behalf of the cmscat service account.
The service is described in more detail here.
The files accessible through this method are hosted in /eos/cms/store/group/cat.
The file you need is not there?
You can request that more datasets be stored in /eos/cms/store/group/cat by creating a merge request (MR) to https://gitlab.cern.ch/cms-analysis/services/ci-dataset-files/-/blob/master/datasets.txt.
Exercise: Set up a CI job that copies a file using the CAT EOS file service
There are a few technical aspects involved in this. First, your GitLab CI job needs to be configured so that it creates an authentication token. This is achieved with the following lines:
    id_tokens:
      MY_JOB_JWT:
        aud: "cms-cat-ci-datasets.app.cern.ch"
Second, you need to query a service, hosted at https://cms-cat-ci-datasets.app.cern.ch, to give you a short-lived token to access a file on EOS on behalf of the cmscat service account. This is achieved with the following line:
    XrdSecsssENDORSEMENT=$(curl -H "Authorization: ${MY_JOB_JWT}" "https://cms-cat-ci-datasets.app.cern.ch/api?eospath=${EOSPATH}" | tr -d \")
Where EOSPATH is a variable holding the path of a file on EOS. You can now access the file with a URL that includes the newly generated token at the end:
    root://eoscms.cern.ch/${EOSPATH}?authz=${XrdSecsssENDORSEMENT}&xrd.wantprot=unix
Try copying the file:
    /eos/cms/store/group/cat/datasets/MINIAODSIM/RunIISummer20UL17MiniAODv2-106X_mc2017_realistic_v9-v2/DYJetsToLL_M-50_TuneCP5_13TeV-amcatnloFXFX-pythia8/2C5565D7-ADE5-2C40-A0E5-BDFCCF40640E.root
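The URL composition described in this exercise can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name build_eos_url is hypothetical and not part of the CAT service, and the empty-token check is an added safeguard rather than something the service requires. It builds the same URL pattern as shown in the exercise and refuses to continue if the token is empty, which usually means the request to the service failed.

```shell
#!/bin/sh
# Hypothetical helper (not provided by CAT): compose the XRootD URL from
# the MGM host, the EOS path, and the token returned by the EOS file service.
build_eos_url() {
    mgm_url=$1   # e.g. root://eoscms.cern.ch
    eos_path=$2  # absolute EOS path, starting with /eos/
    token=$3     # value stored in XrdSecsssENDORSEMENT in the exercise

    # An empty token most likely means the curl request did not succeed.
    if [ -z "$token" ]; then
        echo "error: empty token, check the request to the service" >&2
        return 1
    fi

    # Same pattern as in the exercise: <mgm>/<path>?authz=<token>&xrd.wantprot=unix
    printf '%s/%s?authz=%s&xrd.wantprot=unix\n' "$mgm_url" "$eos_path" "$token"
}

# Possible use inside a CI job (requires network access and xrdcp):
#   url=$(build_eos_url "$EOS_MGM_URL" "$EOSPATH" "$XrdSecsssENDORSEMENT") \
#       && xrdcp "$url" test.root
```

Failing early on an empty token makes the job log point at the token request rather than at a less obvious xrdcp authentication error.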
Solution
A possible solution to the exercise above is the following:
    test_eos_service:
      image:
        name: registry.cern.ch/docker.io/cmssw/el7:x86_64
      tags:
        - cvmfs
      id_tokens:
        MY_JOB_JWT: # or any other variable name
          aud: "cms-cat-ci-datasets.app.cern.ch"
      variables:
        # File is taken from https://cms-cat-ci-datasets.web.cern.ch/
        EOSPATH: '/eos/cms/store/group/cat/datasets/MINIAODSIM/RunIISummer20UL17MiniAODv2-106X_mc2017_realistic_v9-v2/DYJetsToLL_M-50_TuneCP5_13TeV-amcatnloFXFX-pythia8/2C5565D7-ADE5-2C40-A0E5-BDFCCF40640E.root'
        EOS_MGM_URL: root://eoscms.cern.ch
      before_script:
        - 'XrdSecsssENDORSEMENT=$(curl -H "Authorization: ${MY_JOB_JWT}" "https://cms-cat-ci-datasets.app.cern.ch/api?eospath=${EOSPATH}" | tr -d \")'
      script:
        - xrdcp "${EOS_MGM_URL}/${EOSPATH}?authz=${XrdSecsssENDORSEMENT}&xrd.wantprot=unix" test.root
        - ls -l test.root
Using the CAT VOMS proxy service
The cmscat service account is also a member of the CMS VO, so it can request a VOMS proxy.
If your project is in cms-analysis, it can request a VOMS proxy from a service hosted at cms-cat-grid-proxy-service.app.cern.ch, in much the same way as an EOS token is requested from cms-cat-ci-datasets.app.cern.ch above.
The VOMS proxy is provided as a base64-encoded string, and its lifetime is as long as the CI job that requests it.
Exercise: Set up a CI job that obtains a VOMS proxy
There are a few technical aspects involved in this. First, your GitLab CI job needs to be configured so that it creates an authentication token. This is achieved with the following lines:
    id_tokens:
      MY_JOB_JWT:
        aud: "cms-cat-grid-proxy-service.app.cern.ch"
Second, you need to query a service, hosted at https://cms-cat-grid-proxy-service.app.cern.ch, to give you a short-lived VOMS proxy on behalf of the cmscat service account. This is achieved with the following line:
    proxy=$(curl --fail-with-body -H "Authorization: ${MY_JOB_JWT}" "https://cms-cat-grid-proxy-service.app.cern.ch/api" | tr -d \")
Finally, you need to decode the proxy, store it as a file, and set the X509_USER_PROXY environment variable using something like:
    - printf $proxy | base64 -d > myproxy
    - export X509_USER_PROXY=$(pwd)/myproxy
Warning
The image you use needs to have CVMFS mounted. Depending on how the environment of the image you use is set, you may also need to export a few other environment variables, in particular:
    - export X509_VOMS_DIR=/cvmfs/grid.cern.ch/etc/grid-security/vomsdir/
    - export VOMS_USERCONF=/cvmfs/grid.cern.ch/etc/grid-security/vomses/
    - export X509_CERT_DIR=/cvmfs/grid.cern.ch/etc/grid-security/certificates/
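The decode-and-export step of this exercise can also be wrapped in a small shell function, sketched below. The name install_proxy is hypothetical, and the restrictive file permissions and the empty-input checks are assumptions added for robustness (grid tools commonly expect proxy files to be readable only by their owner); the lesson itself only requires decoding the string and setting X509_USER_PROXY.

```shell
#!/bin/sh
# Hypothetical helper (not provided by CAT): decode a base64-encoded proxy
# into a file and point X509_USER_PROXY at it.
install_proxy() {
    encoded=$1  # base64 string returned by the proxy service
    target=$2   # path of the proxy file to create

    # An empty string most likely means the curl request did not succeed.
    if [ -z "$encoded" ]; then
        echo "error: empty proxy string, check the request to the service" >&2
        return 1
    fi

    # Create the file with owner-only permissions (assumption: the grid
    # tools in your image check this), then decode into it.
    ( umask 077; printf '%s' "$encoded" | base64 -d > "$target" ) || return 1

    # Refuse to continue if decoding produced an empty file.
    if [ ! -s "$target" ]; then
        echo "error: decoded proxy file is empty" >&2
        return 1
    fi

    X509_USER_PROXY=$target
    export X509_USER_PROXY
}

# Possible use inside a CI job (voms-proxy-info must be available in the image):
#   install_proxy "$proxy" "$(pwd)/myproxy" && voms-proxy-info
```

Quoting the variable and using printf '%s' avoids treating the proxy string as a format string, which is a slightly more defensive variant of the commands shown in the exercise.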
Solution
A possible solution to the exercise above is the following:
    test_proxy_service:
      image:
        name: registry.cern.ch/docker.io/cmssw/el7:x86_64
      tags:
        - cvmfs
      id_tokens:
        MY_JOB_JWT: # or any other variable name
          aud: "cms-cat-grid-proxy-service.app.cern.ch"
      before_script:
        - 'proxy=$(curl -H "Authorization: ${MY_JOB_JWT}" "https://cms-cat-grid-proxy-service.app.cern.ch/api" | tr -d \")'
      script:
        - printf $proxy | base64 -d > myproxy
        - export X509_USER_PROXY=$(pwd)/myproxy
        - export X509_CERT_DIR=/cvmfs/grid.cern.ch/etc/grid-security/certificates/
        - voms-proxy-info # to test it
Key Points
To use CAT services, your project needs to reside in cms-analysis.
You won’t need to expose any personal credentials.
It is easy to host your analysis code in cms-analysis.