markdown source builds
Auto-generated via {sandpaper}
Source  : e889ae7
Branch  : main
Author  : Kati Lassila-Perini <[email protected]>
Time    : 2024-07-23 09:32:26 +0000
Message : Merge pull request #1 from cms-opendata-workshop/tpm

Main structure of lesson with some content
actions-user committed Jul 23, 2024
1 parent 662077e commit 550f987
Showing 11 changed files with 257 additions and 135 deletions.
34 changes: 34 additions & 0 deletions 02-nanoaod-miniaod.md
@@ -0,0 +1,34 @@
---
title: "Differences between NanoAOD and MiniAOD"
teaching: 10
exercises: 0
---

:::::::::::::::::::::::::::::::::::::: questions

- What have we learned in the pre-exercises and how can we apply it?
- What is the structure and content of the nanoAOD format?
- How is it different from miniAOD?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Apply what we have learned in the pre-exercises
- Learn about the structure and content of nanoAOD and how it differs from miniAOD

::::::::::::::::::::::::::::::::::::::::::::::::

## Dataformats in CMS


::::::::::::::::::::::::::::::::::::: keypoints

- Use `.md` files for episodes when you want static content
- Use `.Rmd` files for episodes when you need to generate output
- Run `sandpaper::check_lesson()` to identify any issues with your lesson
- Run `sandpaper::build_lesson()` to preview your lesson locally

::::::::::::::::::::::::::::::::::::::::::::::::

[r-markdown]: https://rmarkdown.rstudio.com/
80 changes: 80 additions & 0 deletions 03-nanoaod-dataset.md
@@ -0,0 +1,80 @@
---
title: "NanoAOD datasets"
teaching: 10
exercises: 0
---

:::::::::::::::::::::::::::::::::::::: questions

- How do we find a specific nanoAOD dataset?
- How do we explore the content of our nanoAOD dataset?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Know how to find nanoAOD datasets
- Know how to explore the content of nanoAOD

::::::::::::::::::::::::::::::::::::::::::::::::

## Find and explore a nanoAOD dataset

Let's find and explore a particular dataset that we will look at in more detail
later: simulated Z' events in which the Z' decays to a top and antitop quark pair.

:::::: callout
A Z' ("Z-prime") is a hypothetical heavy gauge boson that could come from
extensions of the Standard Model. A review of searches for the Z'
can be found [here](https://pdg.lbl.gov/2024/reviews/rpp2024-rev-zprime-searches.pdf).
::::::

### Find the dataset

All data can be found via the [CERN Open Data Portal](https://opendata.cern.ch).
Let's go to the website and search for the simulated Z' datasets.

Dataset naming in CMS can seem obscure but let's do something simple and search for "Zprime*":

![](fig/ZprimeODP.png){alt='Search for Zprime* at the CODP'}

The query results are [here](https://opendata.cern.ch/search?q=Zprime%2A&l=list&order=asc&p=1&s=10&sort=bestmatch) and you can see that there are many (over 1000) records returned:

![](fig/ZprimeODP-results.png){alt='Search results for Zprime*'}

Let's narrow down the results and select "Type: Dataset", "Experiment: CMS", "Year: 2016", "File type: nanoaodsim", and "Category: Heavy Gauge Bosons". We've now reduced the number of [matches](https://opendata.cern.ch/search?q=Zprime%2A&f=experiment%3ACMS&f=year%3A2016&f=file_type%3Ananoaodsim&f=category%3AExotica%2Bsubcategory%3AHeavy%20Gauge%20Bosons&f=type%3ADataset&l=list&order=asc&p=1&s=10&sort=bestmatch) from over 1000 down to 210:

![](fig/ZprimeODP-results2.png){alt='Narrowed search results for Zprime*'}
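
The facet selections described above all end up in the query string of the search link. As a minimal sketch, the same link can be assembled in Python. The helper name `build_search_url` is hypothetical, and the parameter names (`q`, `f`, `l`, `order`, `p`, `s`, `sort`) are simply read off the links in this episode rather than taken from a documented API:

```python
from urllib.parse import quote, urlencode

PORTAL_SEARCH = "https://opendata.cern.ch/search"


def build_search_url(query, facets):
    """Return a portal search URL for a query string and a list of facet filters.

    The parameter names are copied from the search links in this episode;
    they are an observation, not a documented API.
    """
    params = [("q", query)]
    params += [("f", facet) for facet in facets]
    params += [("l", "list"), ("order", "asc"), ("p", "1"), ("s", "10"), ("sort", "bestmatch")]
    # quote (instead of the default quote_plus) encodes spaces as %20, as in the portal links
    return PORTAL_SEARCH + "?" + urlencode(params, quote_via=quote)


facets = [
    "experiment:CMS",
    "year:2016",
    "file_type:nanoaodsim",
    "category:Exotica+subcategory:Heavy Gauge Bosons",
    "type:Dataset",
]
url = build_search_url("Zprime*", facets)
print(url)
```

Changing the first argument to `"ZprimeToTT*"` gives the query string of a narrower search on the same facets.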

We can discern some of the logic behind the simulated dataset naming. "Zprime" is the particle produced and it decays to various products. We want $Z^{'} \rightarrow t\bar{t}$ which shows up as the third result so let's [narrow the search](https://opendata.cern.ch/search?q=ZprimeToTT%2A&f=experiment%3ACMS&f=year%3A2016&f=file_type%3Ananoaodsim&f=category%3AExotica%2Bsubcategory%3AHeavy%20Gauge%20Bosons&f=type%3ADataset&l=list&order=asc&p=1&s=10&sort=bestmatch) further and search with "ZprimeToTT*":

![](fig/ZprimeToTT-results.png){alt='Narrowed search results for ZprimeToTT*'}

We can also see that the dataset names include the mass (in GeV) of the hypothetical Z' (e.g. "_M2000").
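
As a small illustration of this naming logic, the mass token can be pulled out of a dataset name with a regular expression. This is only a sketch: the helper `zprime_mass_gev` is hypothetical, the example name is the one from record 75124 used in the exercises, and only the `_M...` token is interpreted because it is the only one explained here.

```python
import re

# Pattern for the mass token, e.g. "_M2000"; other tokens in the name are not interpreted.
MASS_TOKEN = re.compile(r"_M(\d+)(?:_|$)")


def zprime_mass_gev(dataset_name):
    """Return the Z' mass in GeV encoded in a dataset name, or None if absent."""
    match = MASS_TOKEN.search(dataset_name)
    return int(match.group(1)) if match else None


name = "ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8"
print(zprime_mass_gev(name))
```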

TO-DO: what do the other strings mean in the dataset name?

Possible challenge: have them select a mass and search for the dataset and select a file for the next part.

Next, let's use the `cernopendata-client` command-line tool to find the datasets
and fetch a file.

### Explore a file

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor

Inline instructor notes can help inform instructors of timing challenges
associated with the lessons. They appear in the "Instructor View"

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: keypoints

- Use `.md` files for episodes when you want static content
- Use `.Rmd` files for episodes when you need to generate output
- Run `sandpaper::check_lesson()` to identify any issues with your lesson
- Run `sandpaper::build_lesson()` to preview your lesson locally

::::::::::::::::::::::::::::::::::::::::::::::::

[r-markdown]: https://rmarkdown.rstudio.com/
107 changes: 107 additions & 0 deletions 04-nanoaod-exercises.md
@@ -0,0 +1,107 @@
---
title: "NanoAOD exercises"
teaching: 10
exercises: 0
---

:::::::::::::::::::::::::::::::::::::: questions

- What have we learned in the pre-exercises and how can we apply it?
- What is the structure and content of the nanoAOD format?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Apply what we have learned in the pre-exercises
- Learn about the structure and content of nanoAOD

::::::::::::::::::::::::::::::::::::::::::::::::

## Exercises with NanoAOD

::::::::::::::::::::::::::::::::::::: challenge

## Exercise 1: Get data file locations

Let's select a ZprimeToTT sample for a given mass and use
the `cernopendata-client` to get the locations of the associated data files.

Recall what you've learned from the [pre-exercise](https://cms-opendata-workshop.github.io/workshop2024-lesson-dataset-scouting/instructor/04-cli-through-cernopendata-client.html) on the `cernopendata-client`.

:::::::::::::::::::::::: solution

## Solution

Search for the ZprimeToTT samples in the CERN Open Data Portal. The resulting query is [here](https://opendata.cern.ch/search?q=ZprimeToTT%2A&f=experiment%3ACMS&f=year%3A2016&f=file_type%3Ananoaodsim&f=category%3AExotica%2Bsubcategory%3AHeavy%20Gauge%20Bosons&f=type%3ADataset&l=list&order=asc&p=1&s=10&sort=bestmatch).

Next, select a dataset. Here we fetch [this one](https://opendata.cern.ch/record/75124), record 75124, "Simulated dataset ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8 in NANOAODSIM format for 2016 collision data" where the Z' mass is 1000 GeV.

Fetch the Docker image for the `cernopendata-client`:

```bash
docker pull docker.io/cernopendata/cernopendata-client
```

and refresh your memory on the commands:
```bash
docker run -i -t --rm docker.io/cernopendata/cernopendata-client --help
```
```output
Usage: cernopendata-client [OPTIONS] COMMAND [ARGS]...
Command-line client for interacting with CERN Open Data portal.
Options:
--help Show this message and exit.
Commands:
download-files Download data files belonging to a record.
get-file-locations Get a list of data file locations of a record.
get-metadata Get metadata content of a record.
list-directory List contents of a EOSPUBLIC Open Data directory.
verify-files Verify downloaded data file integrity.
version Return cernopendata-client version.
```

Then list the file locations for record 75124:
```bash
cernopendata-client get-file-locations --recid 75124
```

```output
http://opendata.cern.ch/eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/2520000/65A0736B-22F3-C94C-99AE-36717B28629C.root
http://opendata.cern.ch/eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/2520000/6E508763-A12F-8846-A295-F39EE7DDAA52.root
http://opendata.cern.ch/eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/2530000/7B2D5CD5-9CAE-C046-A9AB-50CE9D48B187.root
http://opendata.cern.ch/eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/1A50245D-8213-6340-8EA0-CB064EEC6AF3.root
http://opendata.cern.ch/eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/270000/09FA6C37-21D6-7846-B3E1-F8086CBA0E9E.root
http://opendata.cern.ch/eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/270000/3AAB5B1E-7169-9C4D-841C-CB2D6E40CBAE.root
http://opendata.cern.ch/eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/270000/820C3EBC-0E1D-CE41-9418-FA1615123FC2.root
http://opendata.cern.ch/eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/270000/CF54D079-349C-FB4F-B6E5-3D579D89EDE4.root
http://opendata.cern.ch/eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/80000/E964C281-43FB-D349-A436-9A3FDA0BAA28.root
```
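
The locations listed above are HTTP URLs. For reading files remotely, the same paths are often addressed through XRootD instead. A minimal sketch of the rewrite, assuming the `root://eospublic.cern.ch//eos/...` form of open data paths (the helper `to_xrootd` is hypothetical; the client is also understood to offer a `--protocol xrootd` option for `get-file-locations`, which is worth checking in its `--help` output):

```python
HTTP_PREFIX = "http://opendata.cern.ch/eos/"
XROOTD_PREFIX = "root://eospublic.cern.ch//eos/"  # assumed XRootD endpoint for open data


def to_xrootd(http_url):
    """Rewrite an HTTP file location as an XRootD one; raise if the prefix is unexpected."""
    if not http_url.startswith(HTTP_PREFIX):
        raise ValueError(f"unexpected location: {http_url}")
    return XROOTD_PREFIX + http_url[len(HTTP_PREFIX):]


location = (
    "http://opendata.cern.ch/eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/"
    "ZPrimeToTT_M1000_W100_TuneCP2_13TeV-madgraph-pythia8/NANOAODSIM/"
    "106X_mcRun2_asymptotic_v17-v2/2520000/65A0736B-22F3-C94C-99AE-36717B28629C.root"
)
print(to_xrootd(location))
```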
:::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::: challenge

## Exercise 2: Inspect the data file

:::::::::::::: solution


::::::::::::::

::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: keypoints

- Use `.md` files for episodes when you want static content
- Use `.Rmd` files for episodes when you need to generate output
- Run `sandpaper::check_lesson()` to identify any issues with your lesson
- Run `sandpaper::build_lesson()` to preview your lesson locally

::::::::::::::::::::::::::::::::::::::::::::::::

[r-markdown]: https://rmarkdown.rstudio.com/
17 changes: 10 additions & 7 deletions config.yaml
@@ -15,13 +15,13 @@ carpentry: 'incubator'
varnish: 'cms-opendata-workshop/varnish'

# Overall title for pages.
title: 'Modeling backgrounds in your analysis' # FIXME
title: 'Exploring CMS nanoAOD' # FIXME

# Date the lesson was created (YYYY-MM-DD, this is empty by default)
created: 2024-07-13 # FIXME

# Comma-separated list of keywords for the lesson
keywords: 'software, data, lesson, CMS, physics analysis, analysis, backgrounds' # FIXME
keywords: 'software, data, lesson, CMS, physics analysis, analysis' # FIXME

# Life cycle stage of the lesson
# possible values: pre-alpha, alpha, beta, stable
@@ -31,7 +31,7 @@ life_cycle: 'pre-alpha' # FIXME
license: 'CC-BY 4.0'

# Link to the source repository for this lesson
source: 'https://github.com/cms-opendata-workshop/workbench-template-md' # FIXME
source: 'https://github.com/cms-opendata-workshop/workshop2024-lesson-exploring-cms-nanoaod' # FIXME

# Default branch of your lesson
branch: 'main'
@@ -62,17 +62,20 @@ contact: '[email protected]' # FIXME
# - another-learner.md

# Order of episodes in your lesson
episodes:
episodes:
- introduction.md
- 02-nanoaod-miniaod.md
- 03-nanoaod-dataset.md
- 04-nanoaod-exercises.md

# Information for Learners
learners:
learners:

# Information for Instructors
instructors:
instructors:

# Learner Profiles
profiles:
profiles:

# Customisation ---------------------------------------------
#
Binary file added fig/ZprimeODP-results.png
Binary file added fig/ZprimeODP-results2.png
Binary file added fig/ZprimeODP.png
Binary file added fig/ZprimeToTT-results.png
96 changes: 15 additions & 81 deletions introduction.md
Original file line number Diff line number Diff line change
@@ -1,106 +1,40 @@
---
title: "Using Markdown"
title: "Introduction"
teaching: 10
exercises: 2
exercises: 0
---

:::::::::::::::::::::::::::::::::::::: questions

- How do you write a lesson using Markdown and `{sandpaper}`?
- What have we learned in the pre-exercises and how can we apply it?
- What is the structure and content of the nanoAOD format?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Explain how to use markdown with The Carpentries Workbench
- Demonstrate how to include pieces of code, figures, and nested challenge blocks
- Apply what we have learned in the pre-exercises
- Learn about the structure and content of nanoAOD

::::::::::::::::::::::::::::::::::::::::::::::::

## Introduction
## Dataformats in CMS

This is a lesson created via The Carpentries Workbench. It is written in
[Pandoc-flavored Markdown](https://pandoc.org/MANUAL.txt) for static files and
[R Markdown][r-markdown] for dynamic files that can render code into output.
Please refer to the [Introduction to The Carpentries
Workbench](https://carpentries.github.io/sandpaper-docs/) for full documentation.
Most previous releases of CMS open data have been in the Analysis Object Data (AOD) format.
This is a complex format and specific CMS software (CMSSW) is required in order to read and analyze it.

What you need to know is that there are three sections required for a valid
Carpentries lesson:
Releases of data from 2015 onward have been in a slimmed-down format called MiniAOD, which has the same essential structure and software requirements for analysis as AOD. Essentially, there are fewer
physics object collections, and often the physics objects themselves are different.

1. `questions` are displayed at the beginning of the episode to prime the
learner for the content.
2. `objectives` are the learning objectives for an episode displayed with
the questions.
3. `keypoints` are displayed at the end of the episode to reinforce the
objectives.
For data released in 2016 and beyond a new format called NanoAOD is used. NanoAOD is not simply a slimmed-down MiniAOD. In contrast to AOD and MiniAOD, which are stored as CMSSW C++ objects, NanoAOD is stored using ROOT TTree objects. You therefore do not need the CMS virtual machine or Docker container to analyze NanoAOD data. NanoAOD can be analyzed using the ROOT program and/or Python libraries capable of interpreting ROOT's TTree structure.
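
To make the "flat TTree" point concrete, here is a minimal sketch in Python. It assumes the third-party `uproot` package is installed and that the tree is named `Events`, with branches following a `Collection_variable` naming pattern; the helper names and the short list of example branch names are illustrative only.

```python
from collections import defaultdict


def group_branches(branch_names):
    """Group flat NanoAOD-style branch names by the collection prefix before the first underscore."""
    groups = defaultdict(list)
    for name in branch_names:
        collection, _, variable = name.partition("_")
        # Branches without an underscore (e.g. "run", "nMuon") are kept under their own name.
        groups[collection].append(variable or name)
    return dict(groups)


def list_event_branches(path):
    """Return the branch names of the "Events" tree in a NanoAOD file (requires uproot)."""
    import uproot  # third-party; imported lazily so the grouping helper works without it

    with uproot.open(path) as root_file:
        return root_file["Events"].keys()


# Illustrative branch names following the Collection_variable convention.
example = ["run", "nMuon", "Muon_pt", "Muon_eta", "Jet_pt"]
print(group_branches(example))
```

Calling `group_branches(list_event_branches(path))` on a real file location would give a quick overview of which physics object collections the file contains.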

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor
TO-DO we can "borrow" information from below:

Inline instructor notes can help inform instructors of timing challenges
associated with the lessons. They appear in the "Instructor View"
miniAOD links for use: [Getting started with miniAOD](https://opendata.cern.ch/docs/cms-getting-started-miniaod), [miniAOD in Workbook](https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookMiniAOD2016#High_level_physics_objects)

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
nanoAOD links for use: [Getting started with nanoAOD](https://opendata.cern.ch/docs/cms-getting-started-nanoaod)

::::::::::::::::::::::::::::::::::::: challenge

## Challenge 1: Can you do it?

What is the output of this command?

```r
paste("This", "new", "lesson", "looks", "good")
```

:::::::::::::::::::::::: solution

## Output

```output
[1] "This new lesson looks good"
```

:::::::::::::::::::::::::::::::::


## Challenge 2: how do you nest solutions within challenge blocks?

:::::::::::::::::::::::: solution

You can add a line with at least three colons and a `solution` tag.

:::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::

## Figures

You can use standard markdown for static figures with the following syntax:

`![optional caption that appears below the figure](figure url){alt='alt text for
accessibility purposes'}`

![You belong in The Carpentries!](https://raw.githubusercontent.com/carpentries/logo/master/Badge_Carpentries.svg){alt='Blue Carpentries hex person logo with no text.'}

::::::::::::::::::::::::::::::::::::: callout

Callout sections can highlight information.

They are sometimes used to emphasise particularly important points
but are also used in some lessons to present "asides":
content that is not central to the narrative of the lesson,
e.g. by providing the answer to a commonly-asked question.

::::::::::::::::::::::::::::::::::::::::::::::::


## Math

One of our episodes contains $\LaTeX$ equations when describing how to create
dynamic reports with {knitr}, so we now use mathjax to describe this:

`$\alpha = \dfrac{1}{(1 - \beta)^2}$` becomes: $\alpha = \dfrac{1}{(1 - \beta)^2}$

Cool, right?

::::::::::::::::::::::::::::::::::::: keypoints
