From f9f2ae80d60989b5b2d5ab4f6841d0e11cc81f5a Mon Sep 17 00:00:00 2001
From: Kati Lassila-Perini
Date: Sat, 19 Oct 2024 18:15:31 +0200
Subject: [PATCH] adding costs, cleaning

---
 episodes/01-intro.md      |  4 +++-
 episodes/03-disk-image.md | 14 ++++++++++----
 episodes/05-workflow.md   |  2 +-
 index.md                  |  9 ++++-----
 4 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/episodes/01-intro.md b/episodes/01-intro.md
index dc612a1..7034b6c 100644
--- a/episodes/01-intro.md
+++ b/episodes/01-intro.md
@@ -94,7 +94,9 @@ When done, let's go!
 
 If you don't have access to a Linux terminal or prefer not to install tools locally, you can use Google Cloud Shell. You'll need a Google Cloud Platform (GCP) account and a GCP project. To open Cloud Shell, click the Cloud Shell icon in the top-right corner of the Google Cloud Console.
 
-Cloud Shell comes pre-installed with gcloud, kubectl, terraform, and go. However, you'll need to install the Argo CLI manually
+Cloud Shell comes pre-installed with gcloud, kubectl, terraform, and go. However, you'll need to install the Argo CLI manually.
+
+Remember that while using Cloud Shell is free, you will need credits (or pay) for the resources you deploy.
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
diff --git a/episodes/03-disk-image.md b/episodes/03-disk-image.md
index b0b88f9..ebc33ca 100644
--- a/episodes/03-disk-image.md
+++ b/episodes/03-disk-image.md
@@ -23,9 +23,7 @@ exercises: 5
 
 The container image for the MiniAOD processing is very big and it needs to be pulled to the nodes of the cluster. It can take 30 mins to pull it. Making it available to the nodes of the cluster through a pre-built secondary disk image speeds up the workflow start.
 
-A script is available but building the disk image does not work for a new GCP account for the moment.
-
-Investigating....
+We will follow the [instructions](https://cloud.google.com/kubernetes-engine/docs/how-to/data-container-image-preloading#prepare) provided by GCP to build the secondary disk image. All the necessary steps are included in this lesson.
 
 ## Prerequisites
 
@@ -49,7 +49,9 @@ We create a bucket for these logs with
 ```
 gcloud storage buckets create gs://<BUCKET_NAME>/ --location europe-west4
 ```
-### Go installed
+### Go installed?
+
+You should have `go` installed, see [Software setup](index.html#software-setup).
 
 ### Enabling services
 
@@ -164,9 +164,15 @@ Note that the bucket for logs has to be in the same region so you might need to
 
 ## Costs
 
+### Computing
+
+The script runs a Google Cloud Build process, and there is a small per-minute [cost](https://cloud.google.com/build/pricing). 120 build-minutes per day are included in the [Free tier services](https://cloud.google.com/free/docs/free-cloud-features#free-tier-usage-limits).
 
 
 
+### Storage
+The image is stored in Google Compute Engine image storage, and the cost is based on the archive size of the image. The cost is very low: in our example case, the archive size is 12.25 GB and the monthly cost is $0.05/GB, i.e. about $0.61 per month for this image.
+There's a minimal cost for the GCS bucket with the output logs. The bucket can be deleted after the build.
 
 ::::::::::::::::::::::::::::::::::::: keypoints
 
diff --git a/episodes/05-workflow.md b/episodes/05-workflow.md
index 69336b2..10e65d9 100644
--- a/episodes/05-workflow.md
+++ b/episodes/05-workflow.md
@@ -44,7 +44,7 @@ List the buckets with
 ```
 gcloud storage ls
 ```
-### Argo CLI
+### Argo CLI installed?
 
 You should have Argo CLI installed, see [Software setup](index.html#software-setup).
 
diff --git a/index.md b/index.md
index 247a0ce..cda5418 100644
--- a/index.md
+++ b/index.md
@@ -4,10 +4,9 @@ site: sandpaper::sandpaper_site
 
 This tutorial will guide you through setting up CMS Open data processing using Google Cloud Platform resources.
 
-You will need this
+This is for you if:
 
-- if you would like to use the NanoAOD format in your data analysis
-- if the standard NanoAOD content is not sufficient for your use-case
-- if the missing information is available in MiniAOD.
+- the standard NanoAOD content is not sufficient for your use case
+- you would like to use MiniAOD content for your analysis, but you do not have computing resources available.
 
-The output of the example task is NanoAOD format enriched with the Particle-Flow (PF) candidates. Part of the existing open data is [available](https://opendata.cern.ch/search?q=&f=experiment%3ACMS&f=file_type%3Ananoaod-pf&l=list&order=desc&p=1&s=10&sort=mostrecent) in this format.
+The example task is to produce a custom NanoAOD starting from MiniAOD: the output is the NanoAOD format enriched with the Particle-Flow (PF) candidates. Part of the existing open data is [available](https://opendata.cern.ch/search?q=&f=experiment%3ACMS&f=file_type%3Ananoaod-pf&l=list&order=desc&p=1&s=10&sort=mostrecent) in this format, and by following this example you will be able to produce more of it when needed in your analysis.
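-- 
The manual Argo CLI install that the episodes refer to can be done, for example, as sketched below for a Linux amd64 machine such as Cloud Shell. This is a minimal sketch: the release version is only an example, so check the [Argo Workflows releases](https://github.com/argoproj/argo-workflows/releases) for a current one.

```
# Download an Argo CLI release binary (v3.5.8 is an example version)
curl -sLO https://github.com/argoproj/argo-workflows/releases/download/v3.5.8/argo-linux-amd64.gz

# Unpack the binary and make it executable
gunzip argo-linux-amd64.gz
chmod +x argo-linux-amd64

# Move it onto the PATH and verify the installation
sudo mv argo-linux-amd64 /usr/local/bin/argo
argo version
```

In Cloud Shell these commands can be run directly in the terminal; `sudo` is available by default.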