generated from cms-opendata-workshop/workbench-template-md
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent d553dc5
commit bdf198d
Showing 8 changed files with 157 additions and 73 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -25,7 +25,7 @@ keywords: 'software, data, CMS, hackathon' | |
|
||
# Life cycle stage of the lesson | ||
# possible values: pre-alpha, alpha, beta, stable | ||
-life_cycle: 'alpha' # FIXME | ||
+life_cycle: 'beta' # FIXME | ||
|
||
# License of the lesson | ||
license: 'CC-BY 4.0' | ||
|
@@ -63,10 +63,11 @@ contact: '[email protected]' | |
|
||
# Order of episodes in your lesson | ||
episodes: | ||
-- 01-particlediscoverylab.md | ||
-- 02-ppp.md | ||
-- 03-ml.md | ||
-- 04-agc.md | ||
+- 01-ppp.md | ||
+- 02-pdl.md | ||
+- 03-ml-1.md | ||
+- 04-ml-2.md | ||
+- 05-agc.md | ||
|
||
# Information for Learners | ||
learners: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
--- | ||
title: "Machine Learning with Open Data" | ||
teaching: 5 | ||
exercises: 0 | ||
--- | ||
|
||
:::::::::::::::::::::::::::::::::::::: questions | ||
|
||
- How can machine learning be applied to particle physics data? | ||
- What are the steps involved in preparing data for machine learning analysis? | ||
- How do we train and evaluate a machine learning model in this context? | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::::::::::: objectives | ||
|
||
- Learn the basics of machine learning and its applications in particle physics. | ||
- Understand the process of preparing data for machine learning. | ||
- Gain practical experience in training and evaluating a machine learning model. | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
# Introduction to Machine Learning in HEP | ||
|
||
This lesson guides participants through applying machine learning techniques to CMS Open Data. It's designed for those with some programming experience and an interest in machine learning. | ||
|
||
## Overview | ||
|
||
The Machine Learning in High-Energy Physics (HEP) activity introduces participants to the intersection of machine learning and particle physics. Using CMS Open Data, participants learn how to apply machine learning algorithms to real experimental data, building their analytical skills and their understanding of both fields. | ||
|
||
Machine learning plays a crucial role in analyzing the vast amounts of data generated by HEP experiments. It helps researchers extract meaningful insights, classify particle collisions, and search for new physics phenomena efficiently. | ||
|
||
# Data Preparation | ||
|
||
A crucial step in any machine learning project is data preparation. Participants will learn how to clean and preprocess CMS Open Data to make it suitable for machine learning algorithms. This includes handling missing values, normalizing data, and creating training and test datasets. | ||
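The preparation steps named above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the lesson's actual pipeline: the `events` list, its two columns, and the helper names `train_test_split`, `fit_scaler`, and `transform` are all hypothetical, and `None` is assumed to mark a missing value. The scaling statistics are computed from the training set only, so no information from the test set leaks into training.

```python
import random
import statistics

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_scaler(train):
    """Per-column mean and standard deviation, ignoring missing values."""
    means, stds = [], []
    for j in range(len(train[0])):
        present = [row[j] for row in train if row[j] is not None]
        means.append(statistics.fmean(present))
        stds.append(statistics.pstdev(present) or 1.0)  # guard against zero spread
    return means, stds

def transform(rows, means, stds):
    """Fill missing values with the training mean, then standardize."""
    return [
        [((means[j] if v is None else v) - means[j]) / stds[j]
         for j, v in enumerate(row)]
        for row in rows
    ]

# Hypothetical toy events with two made-up features; None marks a missing value.
events = [
    [45.2, 0.3], [None, -1.1], [38.9, 2.0], [61.0, None],
    [52.4, 0.7], [47.7, -0.4], [55.1, 1.5], [40.3, -2.2],
]

train, test = train_test_split(events)
means, stds = fit_scaler(train)
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)
print(len(train_scaled), "training rows,", len(test_scaled), "test rows")
```

In practice a library such as scikit-learn would usually handle these steps; the plain-Python version only makes each one explicit.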
|
||
## Supervised Learning in HEP | ||
|
||
- Definition: Supervised learning involves training a model on a labeled dataset, where each input data point is paired with its corresponding target label or output. | ||
- Objective: The goal is to learn a mapping from inputs to outputs, based on the labeled examples provided during training. | ||
- Examples: Classification (predicting categories), regression (predicting continuous values), and sequence prediction tasks are common supervised learning problems. | ||
- Process: Models are trained using algorithms that minimize the error between predicted and actual outputs, adjusting parameters to improve accuracy. | ||
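As a rough sketch of the process described in the list above, the following trains a logistic-regression classifier by gradient descent on a tiny labeled dataset. The data, the labels (1 for "signal", 0 for "background"), and the function names are invented for illustration; logistic regression is just one simple choice of supervised model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(x, w, b):
    """Predicted probability that x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_logistic(X, y, lr=0.5, epochs=500):
    """Minimize the mean cross-entropy loss with batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = score(xi, w, b) - yi          # prediction minus label
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    return [1 if score(xi, w, b) >= 0.5 else 0 for xi in X]

# Hypothetical standardized features; label 1 = signal, 0 = background.
X = [[1.2, 0.8], [0.9, 1.1], [1.5, 0.4], [-1.0, -0.7], [-0.8, -1.3], [-1.4, -0.2]]
y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic(X, y)
predictions = predict(X, w, b)
print(predictions)
```

The parameters `w` and `b` are adjusted step by step to reduce the error between predicted and actual labels, which is the "Process" item above in its simplest form.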
|
||
## Unsupervised Learning in HEP | ||
|
||
- Definition: Unsupervised learning involves training a model on an unlabeled dataset, where the algorithm tries to identify patterns, relationships, or structures in the data without explicit guidance. | ||
- Objective: The goal is to explore the data and extract meaningful insights, such as clusters, associations, or anomalies. | ||
- Examples: Clustering (grouping similar data points), anomaly detection (identifying unusual patterns), and dimensionality reduction (reducing the number of features while preserving important information) are common unsupervised learning tasks. | ||
- Process: Algorithms in unsupervised learning rely on statistical properties of the data to uncover patterns or structures. They do not aim to predict specific outputs but rather to understand the inherent structure of the data. | ||
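Clustering, one of the tasks listed above, can be illustrated with a short k-means sketch. No labels are used: the algorithm only looks at distances between points. The points and the choice of initial centroids are made up for this example, and real analyses would normally use a library implementation with a more careful initialization.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        centroids = [
            [sum(coord) / len(cluster) for coord in zip(*cluster)] if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled 2-D points forming two well-separated groups.
points = [(1.0, 1.1), (0.9, 0.8), (1.2, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]

# Assumption: start from one point of each group to keep the example deterministic.
centroids, clusters = kmeans(points, [list(points[0]), list(points[3])])
print(centroids)
```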
|
||
# Model Training and Evaluation | ||
|
||
Participants will gain hands-on experience in training and evaluating machine learning models. This includes selecting appropriate algorithms, tuning hyperparameters, and assessing model performance using metrics such as accuracy, precision, recall, and F1 score. | ||
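The four metrics mentioned above follow directly from the counts of true and false positives and negatives. The sketch below computes them for a binary classifier; the function name and the example labels are illustrative only.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels: 3 true positives, 3 true negatives,
# 1 false positive and 1 false negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)  # every metric equals 0.75 for this example
```

Accuracy alone can be misleading when one class is much rarer than the other, which is why precision and recall are usually reported alongside it.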
|
||
|
||
::::::::::::::::::::::::::::::::::::: keypoints | ||
|
||
- Introduction to machine learning in particle physics. | ||
- Data preparation for machine learning analysis. | ||
- Model training and evaluation techniques. | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
--- | ||
title: "Machine Learning with Open Data" | ||
teaching: 5 | ||
exercises: 0 | ||
--- | ||
|
||
:::::::::::::::::::::::::::::::::::::: questions | ||
|
||
- How can machine learning be applied to particle physics data? | ||
- What are the steps involved in preparing data for machine learning analysis? | ||
- How do we train and evaluate a machine learning model in this context? | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
::::::::::::::::::::::::::::::::::::: objectives | ||
|
||
- Learn the basics of machine learning and its applications in particle physics. | ||
- Understand the process of preparing data for machine learning. | ||
- Gain practical experience in training and evaluating a machine learning model. | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: | ||
|
||
## Practical Application | ||
|
||
CNNs (Convolutional Neural Networks) and autoencoders are both types of neural networks, but they serve different purposes and have distinct architectures: | ||
|
||
### CNN (Convolutional Neural Network): | ||
|
||
- Purpose: CNNs are primarily used for supervised learning tasks such as image classification, object detection, and image segmentation. | ||
- Architecture: CNNs consist of convolutional layers that apply learnable filters to input data, capturing spatial hierarchies of features. They typically include pooling layers to reduce spatial dimensions and dense (fully connected) layers for final classification or regression. | ||
- Training: CNNs are trained with labeled data, optimizing parameters to minimize classification error or regression loss. | ||
- Applications: CNNs are widely used in computer vision tasks where spatial relationships and local patterns in data (such as images) are important. | ||
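To make the architecture items above concrete, here is a plain-Python sketch of the three basic operations in a convolutional block: a convolution with one filter, a ReLU activation, and max pooling. The 5x5 "image" and the hand-written edge filter are toy assumptions; in a real CNN the filter values are learned during training and a framework such as PyTorch or TensorFlow does this work.

```python
def conv2d(image, kernel):
    """'Valid' sliding-window filter (cross-correlation, as used in CNN layers)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [sum(image[i + a][j + b] * kernel[a][b]
             for a in range(kh) for b in range(kw))
         for j in range(out_w)]
        for i in range(out_h)
    ]

def relu(feature_map):
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [max(feature_map[i + a][j + b] for a in range(size) for b in range(size))
         for j in range(0, len(feature_map[0]) - size + 1, size)]
        for i in range(0, len(feature_map) - size + 1, size)
    ]

# Toy 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-written filter that responds to a dark-to-bright step from left to right.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)   # 4x4 map, non-zero only at the edge
pooled = max_pool(relu(feature_map))  # 2x2 summary of where the edge is
print(pooled)
```

The pooled output is smaller than the input but still records that the edge sits in the left half of the image, which is the sense in which pooling reduces spatial dimensions while keeping local patterns.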
|
||
|
||
### Autoencoders: | ||
|
||
- Purpose: Autoencoders are used for unsupervised learning tasks such as dimensionality reduction, feature learning, and anomaly detection. | ||
- Architecture: An autoencoder consists of an encoder network that compresses the input data into a latent representation and a decoder network that reconstructs the input from this representation. Convolutional layers can be used in convolutional autoencoders (CAEs) for image data. | ||
- Training: Autoencoders are trained on unlabeled data, learning to reconstruct the input data effectively. They are optimized based on reconstruction error or other metrics that measure the quality of the reconstructed output. | ||
- Applications: Autoencoders are applied in tasks where finding underlying patterns in data or reducing its dimensionality is beneficial, such as in denoising data, anomaly detection, and feature extraction. | ||
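The encoder/decoder idea and its use for anomaly detection can be sketched with the smallest possible autoencoder: a linear model with a one-dimensional latent code and tied weights (the decoder reuses the encoder weights). This is a deliberately simplified assumption for illustration; the data, the starting weights, and the function names are invented, and practical autoencoders use several nonlinear layers.

```python
def encode(x, w):
    """Encoder: compress a 2-D point into a single latent number."""
    return sum(wi * xi for wi, xi in zip(w, x))

def reconstruct(x, w):
    """Decoder with tied weights: map the latent number back to 2-D."""
    z = encode(x, w)
    return [z * wi for wi in w]

def reconstruction_error(x, w):
    return sum((xi - ri) ** 2 for xi, ri in zip(x, reconstruct(x, w)))

def mean_loss(data, w):
    return sum(reconstruction_error(x, w) for x in data) / len(data)

def train(data, w, lr=0.01, epochs=500):
    """Gradient descent on the mean squared reconstruction error."""
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x in data:
            z = encode(x, w)
            r = [xi - z * wi for xi, wi in zip(x, w)]      # residual
            r_dot_w = sum(ri * wi for ri, wi in zip(r, w))
            for k in range(len(w)):
                grad[k] += -2.0 * (x[k] * r_dot_w + z * r[k])
        w = [wk - lr * gk / len(data) for wk, gk in zip(w, grad)]
    return w

# Hypothetical unlabeled "normal" data lying close to the line x2 = x1.
data = [(1.0, 1.0), (2.0, 2.1), (-1.0, -0.9), (-2.0, -2.0), (0.5, 0.4), (-0.5, -0.6)]

initial_w = [0.5, 0.0]                 # arbitrary starting weights
initial_loss = mean_loss(data, initial_w)
w = train(data, initial_w)
final_loss = mean_loss(data, w)

# A point far from the learned structure reconstructs poorly.
anomaly_error = reconstruction_error((2.0, -2.0), w)
normal_errors = [reconstruction_error(x, w) for x in data]
print(final_loss < initial_loss, anomaly_error > max(normal_errors))
```

No labels appear anywhere in the training loop: the input itself is the target, and a large reconstruction error is what flags a point as unusual.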
|
||
### Key Differences | ||
|
||
- Supervised vs. unsupervised: CNNs are typically trained as supervised models and require labeled data, while autoencoders are trained without labels, using the input itself as the reconstruction target. | ||
- Output: CNNs produce predictions (class labels or regression values) based on input data, whereas autoencoders reconstruct input data or extract meaningful representations from it. | ||
- Use cases: CNNs are suitable for tasks requiring classification or regression on grid-like data such as images, whereas autoencoders are used for tasks involving data exploration, anomaly detection, or preprocessing. | ||
|
||
::::::::::::::::::::::::::::::::::::: keypoints | ||
|
||
- Introduction to machine learning in particle physics. | ||
- Data preparation for machine learning analysis. | ||
- Model training and evaluation techniques. | ||
|
||
:::::::::::::::::::::::::::::::::::::::::::::::: |
File renamed without changes.