Commit

05: add workflow
katilp committed Oct 16, 2024
1 parent 8a45951 commit c7c1182
Showing 5 changed files with 235 additions and 6 deletions.
3 changes: 2 additions & 1 deletion config.yaml
@@ -65,8 +65,9 @@ contact: '[email protected]' # FIXME
episodes:
- 01-intro.md
- 02-storage.md
- 03-disk-image-manual.md
- 03-disk-image.md
- 04-cluster.md
- 05-workflow.md

# Information for Learners
learners:
7 changes: 7 additions & 0 deletions episodes/03-disk-image-manual.md
@@ -209,7 +209,14 @@ gcloud compute instances delete vm-work --zone=europe-west4-c
gcloud compute disks delete pfnano-disk --zone=europe-west4-c
```
## Nota bene
The workflow does not find the container images on this secondary disk on the node.
A visible difference between the images is that the `family` parameter is `secondary-disk-image` for the one built with the disk image tool.
It remains to be confirmed whether setting it on the manually built image helps.
But the fix might be more involved...
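
One way to inspect this yourself is to check the `family` field of each image with `gcloud`; the image name below is a placeholder for your own image name:

```bash
# Print the "family" field of a given image (placeholder name)
gcloud compute images describe <IMAGE_NAME> --format="value(family)"
```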
## Costs
4 changes: 3 additions & 1 deletion episodes/03-disk-image.md
@@ -146,7 +146,9 @@ Message: Quota 'N2_CPUS' exceeded.

are due to the requested machine type not being available in the requested zone. It has nothing to do with your quota.

Try in a different region or with a different machine type. You can give them as parameters e.g. `--zone=europe-west4-a --machine-type=e2-standard-4`.
Independent of the zone specified in the parameters, the disk image will have `eu` as its location, so any zone in `europe` is OK (if you plan to create your cluster in a `europe` zone).


Note that the bucket for logs has to be in the same region, so you might need to create another one. Remove the old one with `gcloud storage rm -r gs://<BUCKET_FOR_LOGS>`.

67 changes: 63 additions & 4 deletions episodes/04-cluster.md
@@ -37,12 +37,72 @@ The output shows your account and project.

Before you can create resources on GCP, you will need to enable the corresponding services.

If this is your first project or you created it from the Google Cloud Console Web UI, it will already have several services enabled.
In addition to what was enabled in the previous section, we will now enable the Kubernetes Engine API (`container.googleapis.com`):

```bash
gcloud services enable container.googleapis.com
```
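
If you want to verify that the API is now enabled, you can list the enabled services, for example:

```bash
# Check that the Kubernetes Engine API shows up among the enabled services
gcloud services list --enabled --filter="config.name:container.googleapis.com"
```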

### Bucket

If you worked through Section 02, you now have a storage bucket for the output files.

List the buckets with

```bash
gcloud storage ls
```

### Secondary disk

If you worked through Section 03, you have a secondary boot disk image available.
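
You can check that the image exists by listing the custom images in your project, for example:

```bash
# List the custom (non-public) images in the current project;
# the secondary boot disk image from Section 03 should appear here
gcloud compute images list --no-standard-images
```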

## Get the code

The example Terraform scripts and Argo Workflows configuration are in the [cms-dpoa/cloud-processing](https://github.com/cms-dpoa/cloud-processing) repository.

Get them with

```bash
git clone git@github.com:cms-dpoa/cloud-processing.git
cd cloud-processing/standard-gke-cluster-gcs-imgdisk
```

## Create the cluster

Set the variables in the `terraform.tfvars` file.

Run

```bash
terraform apply
```

and confirm with "yes" when prompted.
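
If this is a fresh checkout, the working directory may still need to be initialized, and you can preview the planned changes before applying; a standard Terraform sequence would be:

```bash
terraform init   # download the required providers (first run only)
terraform plan   # preview the resources that will be created
```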

## Connect to the cluster and inspect

```bash
gcloud container clusters get-credentials cluster-2 --region europe-west4-a --project hip-new-full-account
```
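
The `get-credentials` command stores the cluster credentials in your kubeconfig; you can check which context `kubectl` now points to with:

```bash
kubectl config current-context
```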

```bash
kubectl get nodes
```

```bash
kubectl get ns
```
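
The next episode deploys Argo Workflows into an `argo` namespace. If it is not already present in the list above (for example created by the Terraform scripts), you can create it yourself:

```bash
# Create the argo namespace only if it does not exist yet
kubectl get namespace argo || kubectl create namespace argo
```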

## Enable image streaming

```bash
gcloud container clusters update cluster-2 --zone europe-west4-a --enable-image-streaming
```
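
To verify the setting, you can inspect the cluster description; assuming image streaming is reported under the `gcfsConfig` field, something like this should show it as enabled:

```bash
gcloud container clusters describe cluster-2 --zone europe-west4-a | grep -A 1 gcfsConfig
```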


## Costs

@@ -53,9 +113,8 @@ If this is your first project or you created it from the Google Cloud Console We

::::::::::::::::::::::::::::::::::::: keypoints

- Google Cloud Storage bucket can be used to store the output files.
- The storage cost depends on the volume stored and for this type of processing is very small.
- The download of the output files from the bucket has a significant cost.
- Kubernetes clusters can be created with Terraform scripts.
- kubectl is the tool to interact with the cluster.


::::::::::::::::::::::::::::::::::::::::::::::::
160 changes: 160 additions & 0 deletions episodes/05-workflow.md
@@ -0,0 +1,160 @@
---
title: "Set up workflow"
teaching: 10
exercises: 5
---

:::::::::::::::::::::::::::::::::::::: questions

- How to set up the Argo Workflows engine?
- How to submit a test job?
- Where to find the output?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Deploy Argo Workflows services to the cluster.
- Submit a test job.
- Find the output in your bucket.

::::::::::::::::::::::::::::::::::::::::::::::::


## Prerequisites


### GCP account and project

Make sure that you are in the GCP account and project that you intend to use for this work. In your Linux terminal, type

```bash
gcloud config list
```

The output shows your account and project.

### Bucket

If you worked through [Section 02](episodes/02-storage), you now have a storage bucket for the output files.

List the buckets with

```bash
gcloud storage ls
```

### Argo CLI

You should have the Argo CLI installed; see [Software setup](index.html#software-setup).
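
You can check the installation with:

```bash
argo version
```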


## Get the code

The example Terraform scripts and Argo Workflows configuration are in the [cms-dpoa/cloud-processing](https://github.com/cms-dpoa/cloud-processing) repository.

Get them with

```bash
git clone git@github.com:cms-dpoa/cloud-processing.git
cd cloud-processing/standard-gke-cluster-gcs-imgdisk
```

## Deploy Argo Workflows services

Deploy Argo Workflows services with

```bash
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.5.10/install.yaml
kubectl apply -f argo/service_account.yaml
kubectl apply -f argo/argo_role.yaml
kubectl apply -f argo/argo_role_binding.yaml
```

Wait for the services to start.
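
Instead of polling manually, you can also let `kubectl` wait until the deployments are available, for example:

```bash
# Wait until both Argo deployments report as available (give up after 3 minutes)
kubectl wait --for=condition=Available deployment --all -n argo --timeout=180s
```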

You should see the following:

```bash
$ kubectl get all -n argo
NAME READY STATUS RESTARTS AGE
pod/argo-server-5f7b589d6f-jkf4z 1/1 Running 0 24s
pod/workflow-controller-864c88655d-wsfr8 1/1 Running 0 24s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/argo-server ClusterIP 34.118.233.69 <none> 2746/TCP 25s

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/argo-server 1/1 1 1 24s
deployment.apps/workflow-controller 1/1 1 1 24s

NAME DESIRED CURRENT READY AGE
replicaset.apps/argo-server-5f7b589d6f 1 1 1 24s
replicaset.apps/workflow-controller-864c88655d 1 1 1 24s
```

## Submit a test job

Edit the parameters in `argo/argo_bucket_run.yaml` so that they are:

```
parameters:
- name: nEvents
#FIXME
# Number of events in the dataset to be processed (-1 is all)
value: 1000
- name: recid
#FIXME
# Record id of the dataset to be processed
value: 30511
- name: nJobs
#FIXME
# Number of jobs the processing workflow should be split into
value: 2
- name: bucket
#FIXME
# Name of cloud storage bucket for storing outputs
value: <YOUR_BUCKET_NAME>
```

Now submit the workflow with

```bash
argo submit -n argo argo/argo_bucket_run.yaml
```

Observe its progress with

```bash
argo get -n argo @latest
```
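
To stream the logs of the running workflow, you can use, for example:

```bash
argo logs -n argo @latest --follow
```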

Once done, check the output in the bucket with

```bash
$ gcloud storage ls gs://<YOUR_BUCKET_NAME>/**
gs://<YOUR_BUCKET_NAME>/pfnano/30511/files_30511.txt
gs://<YOUR_BUCKET_NAME>/pfnano/30511/logs/1.logs
gs://<YOUR_BUCKET_NAME>/pfnano/30511/logs/2.logs
gs://<YOUR_BUCKET_NAME>/pfnano/30511/plots/h_num_cands.png
gs://<YOUR_BUCKET_NAME>/pfnano/30511/plots/h_pdgid_cands.png
gs://<YOUR_BUCKET_NAME>/pfnano/30511/scatter/pfnanooutput1.root
gs://<YOUR_BUCKET_NAME>/pfnano/30511/scatter/pfnanooutput2.root
```
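
If you want to inspect some of the outputs locally, for example the plots, you can copy them from the bucket (keeping in mind that downloads from the bucket have a cost):

```bash
# Copy the plots from the bucket to the current directory
gcloud storage cp "gs://<YOUR_BUCKET_NAME>/pfnano/30511/plots/*.png" .
```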

## Costs






::::::::::::::::::::::::::::::::::::: keypoints

- Once the cluster is up, you will first deploy the Argo Workflows services using `kubectl`.
- You will submit and monitor the workflow with `argo`.
- You can see the output in the bucket with `gcloud` commands or in the Google Cloud Console Web UI.


::::::::::::::::::::::::::::::::::::::::::::::::
