This repository contains the code and configuration files for the HH $\to b\bar{b} \tau \tau$ analysis.

Current working branch: `submodule`.
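Pull new changes in the `src` directory by running `git submodule update --remote --merge` in the main directory. This fetches the latest commit from the submodule's remote and merges it into your local submodule checkout. Alternatively, run `git submodule update --init --recursive` to initialize the submodule and check out the commit recorded in this repository. See here for more details. Visit CoffeaMate for more details on the tools provided by the `src` directory.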
- HH $\to b\bar{b} \tau \tau$ Analysis Repo
- Fork the Repository:
  - Go to the repository page on GitHub.
  - Click the "Fork" button at the top right of the page to create a copy of the repository under your GitHub account.
- Clone the Forked Repository:

      # Clone the forked repository
      git clone <your-forked-repository-url>
      # Navigate to the project directory
      cd <project-directory>

Follow the instructions here to create a new repository from this template.
- Set up the environment in LPC:
  - Run `source scripts/envsetup.sh` to set up a Python virtual environment for the analysis. This script installs the necessary packages and creates a tarball of the environment for future use.
  - Run `source scripts/venv.sh` to set up a CMS Python environment with LCG software. This activates the installed Python virtual environment and sets the environment variables needed to run the analysis. Do this every time you want to work in this repo.
  - Run `source scripts/venv.sh --help` to see details on how to set up the environment.
- Modify the configuration files:
  - Update the configuration files in the `configs` directory to reflect the correct paths to where the sample files are stored, where the output files should be saved, and other settings. `projectconfig.py` contains all main configurations for the analysis, including the event selection, program runtime, post-processing, and plotting settings. In general, you can add as many new configuration files (TOML, YAML, or JSON) as you want and then load them in `projectconfig.py` so that you can use them in your Python scripts.
  - Check the `configs` directory for more details.
- Customize event selections by creating classes that inherit from `EventSelection` in the `analysis.evtselutil` module. Check the `configs` directory for more details. In principle, this should be the only directory you need to modify to customize the event selection.
  - Test that the event selection classes work correctly by running

        source scripts/venv.sh
        python -m unittest tests.testproc

    When prompted, enter `LPCTEST`. Change the JSON input file in `runsetting.toml` to the desired NanoAOD file for testing.
  - No changes to the `src` code should be required, as it is not related to the event selection logic and only provides utility functions and object definitions. Any changes to the `src` code should be tested with the provided unit tests in the `tests` directory.
  - No changes to `main.py` should be required, as it only provides the main program logic for running the analysis.
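
The configuration step above mentions loading extra TOML/YAML/JSON files from `projectconfig.py`. The following is a minimal sketch of one way this could be done with the standard library only. `load_config` and the file name `mycuts.json` are hypothetical and not part of this repository. YAML is left out because it needs a third-party package, and the TOML branch assumes Python 3.11 or newer for `tomllib`.

```python
import json
import tempfile
from pathlib import Path


def load_config(path):
    """Load a JSON or TOML file into a plain dict (hypothetical helper)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        with path.open("r", encoding="utf-8") as f:
            return json.load(f)
    if suffix == ".toml":
        # Assumption: Python >= 3.11, where tomllib is in the standard library.
        import tomllib
        with path.open("rb") as f:
            return tomllib.load(f)
    raise ValueError(f"Unsupported config format: {suffix}")


# Example: write a small JSON config to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "mycuts.json"  # hypothetical file name
    cfg_path.write_text(json.dumps({"tau": {"pt_min": 20.0}}), encoding="utf-8")
    cfg = load_config(cfg_path)

print(cfg["tau"]["pt_min"])
```

Returning plain dictionaries keeps the loaded settings easy to merge into whatever objects `projectconfig.py` already exposes; how that merge should look depends on the actual contents of that file.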
The current curling is heavily dependent on coffea packages and might be subject to change in the future. Navigate to the `data` directory for more details.
The batch submission scripts provided in `exec` are compatible with the HTCondor batch system. The Python program is executed with the tarball of the Python virtual environment created by the `scripts/envsetup.sh` script.
Other workflow management systems have been tried (e.g. local submission through lpcjobqueue, coffea singularity shells etc.) and are not recommended for now due to various conflicts.
- Generate JSON job files:

      cd exec
      python genjobs.py

- Write JDL files: a template of the job submission script, `hhbbtt.sub`, is provided in the `exec` directory.
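
As an illustration of what JSON job files can look like, the sketch below splits a list of input files into fixed-size chunks and writes one JSON file per job. This is not the logic of `genjobs.py`. The function name `write_job_files`, the JSON layout, the dataset name, and the file names are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def write_job_files(dataset, files, files_per_job, outdir):
    """Write one JSON file per chunk of input files; return the paths written."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    for job_index, start in enumerate(range(0, len(files), files_per_job)):
        chunk = files[start:start + files_per_job]
        job_path = outdir / f"{dataset}_job_{job_index}.json"
        with job_path.open("w", encoding="utf-8") as f:
            # Hypothetical layout: the dataset name plus the files for this job.
            json.dump({"dataset": dataset, "files": chunk}, f, indent=2)
        written.append(job_path)
    return written


# Placeholder file names; real inputs would be paths to NanoAOD files.
example_files = [f"file_{i}.root" for i in range(5)]

with tempfile.TemporaryDirectory() as tmp:
    job_paths = write_job_files("example_dataset", example_files, 2, tmp)
    job_names = [p.name for p in job_paths]
    last_job = json.loads(job_paths[-1].read_text(encoding="utf-8"))

print(job_names)
```

One JSON file per job keeps each batch job's input list explicit, so a failed job can be resubmitted on its own without regenerating the others.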
To install and set up the project, clone your fork and initialize the submodule:

    git clone <repository-url-you-have-forked>
    cd <project-directory>
    git submodule update --init --recursive

A brief explanation of the different subdirectories in this repository:
- `src/`: Contains the source code for the analysis, including modules for event selection, object definitions, and utility functions. This is the submodule.
  - `analysis/`: Modules related to the analysis logic and event selection.
  - `utils/`: Utility functions for file handling, data processing, and other common tasks.
  - `config/`: Configuration files for the analysis, including selection settings and parameter definitions.
- `configs/`: Configuration files for the analysis, including selection settings, environment settings (e.g. paths to data files), and parameter definitions for classes and functions.
- `tests/`: Contains unit tests and integration tests for the source code.
  - `test_filesysutil.py`: Tests for the file system utility functions.
  - `test_custom.py`: Tests for custom event selection classes.
- `data/`: Contains input data files and datasets used in the analysis.
- `results/`: Directory for storing the output results of the analysis, including plots, tables, and summary files.
- `scripts/`: Contains scripts for setting up the (LPC/LXPLUS) environment, running the analysis, and other automation tasks.
- `notebooks/`: Jupyter notebooks for exploratory data analysis, visualization, and prototyping.
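
To illustrate the kind of check that could live in `tests/`, here is a small self-contained `unittest` sketch. It does not use the repository's `EventSelection` class or coffea. `select_by_pt` is a hypothetical stand-in that works on plain Python lists, and the threshold values are arbitrary.

```python
import unittest


def select_by_pt(pts, pt_min):
    """Return a boolean mask marking entries with pt above pt_min (stand-in)."""
    return [pt > pt_min for pt in pts]


class TestSelectByPt(unittest.TestCase):
    def test_mask_values(self):
        # Entries above the threshold are kept, the rest are rejected.
        self.assertEqual(select_by_pt([10.0, 25.0, 40.0], 20.0), [False, True, True])

    def test_empty_input(self):
        # An empty collection should give an empty mask, not an error.
        self.assertEqual(select_by_pt([], 20.0), [])


# Run the tests programmatically so the snippet works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSelectByPt)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real test module the last two lines are unnecessary, since `python -m unittest` finds the `TestCase` classes itself.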