
# HH4b


Search for two boosted (high transverse momentum) Higgs bosons (H) decaying to four beauty quarks (b).

## Setting up package

### Creating a virtual environment

First, create a virtual environment (mamba is recommended):

```bash
# Download the mamba setup script (change if needed for your machine: https://github.com/conda-forge/miniforge#mambaforge)
wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
# Install (the mamba directory can end up taking O(1-10 GB), so make sure the directory you're using allows that quota)
./Mambaforge-Linux-x86_64.sh  # follow the instructions in the installer
mamba create -n hh4b python=3.10
mamba activate hh4b

```

### Installing package

```bash
# Clone the repository
git clone https://github.com/LPC-HH/HH4b.git
cd HH4b
# Perform an editable installation
pip install -e .
# for committing to the repository
pip install pre-commit
pre-commit install
```

### Troubleshooting

- If the default python in your environment is not Python 3, make sure to use the `pip3` and `python3` commands instead.

- You may also need to upgrade pip to perform the editable installation:

  ```bash
  python3 -m pip install -e .
  ```
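If pip itself is too old, the upgrade is the standard invocation:

```bash
# Upgrade pip inside the active environment before retrying the editable install
python3 -m pip install --upgrade pip
```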

## Instructions for running coffea processors

### Setup

For submitting to condor, all you need is Python >= 3.7.

For running locally:

```bash
# Download the mamba setup script (change if needed for your machine: https://github.com/conda-forge/miniforge#mambaforge)
wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
# Install (the mamba directory can end up taking O(1-10 GB), so make sure the directory you're using allows that quota)
./Mambaforge-Linux-x86_64.sh  # follow the instructions in the installer
mamba create -n hh4b python=3.10
mamba activate hh4b
pip install coffea
```

Clone the repository:

```bash
git clone https://github.com/LPC-HH/HH4b/
cd HH4b
pip install -e .
```

### Running locally

To test locally first (recommended), you can run, e.g.:

```bash
mkdir outfiles
python -W ignore src/run.py --starti 0 --endi 1 --year 2022 --processor skimmer --samples QCD --subsamples "QCD_PT-470to600"
python -W ignore src/run.py --processor skimmer --year 2022 --nano-version v11_private --samples HH --subsamples GluGlutoHHto4B_kl-1p00_kt-1p00_c2-0p00_TuneCP5_13p6TeV_TSG --starti 0 --endi 1
python -W ignore src/run.py --year 2022 --processor trigger_boosted --samples Muon --subsamples Run2022C --nano-version v11_private --starti 0 --endi 1
```

Parquet and pickle files will be saved. Pickles are in the format `{'nevents': int, 'cutflow': Dict[str, int]}`.
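As a quick sanity check, you can print one of the saved pickles from the command line (the file path below is hypothetical; use whatever `src/run.py` wrote into your output directory):

```bash
# Pretty-print the {'nevents': ..., 'cutflow': {...}} dictionary from a skimmer output pickle
# (replace outfiles/out.pkl with an actual .pkl file produced by src/run.py)
python3 -c 'import pickle, pprint, sys; pprint.pprint(pickle.load(open(sys.argv[1], "rb")))' outfiles/out.pkl
```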

Or on specific file(s):

```bash
FILE=/eos/uscms/store/user/rkansal/Hbb/nano/Run3Winter23NanoAOD/QCD_PT-15to7000_TuneCP5_13p6TeV_pythia8/02c29a77-3e0e-40e0-90a1-0562f54144e9.root
python -W ignore src/run.py --processor skimmer --year 2023 --files $FILE --files-name QCD
```

### Jobs

The script `src/condor/submit.py` manually splits up the files into condor jobs:

On a full dataset, e.g. `TAG=23Jul13`:

```bash
python src/condor/submit.py --processor skimmer --tag $TAG --files-per-job 20 --submit
```

On a specific sample:

```bash
python src/condor/submit.py --processor skimmer --tag $TAG --nano-version v11_private --samples HH --subsamples GluGlutoHHto4B_kl-1p00_kt-1p00_c2-0p00_TuneCP5_13p6TeV_TSG
```

Over many samples, using a yaml file:

```bash
nohup python src/condor/submit_from_yaml.py --tag $TAG --processor skimmer --save-systematics --submit --yaml src/condor/submit_configs/${YAML}.yaml &> tmp/submitout.txt &
```
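The `TAG` and `YAML` shell variables used in these commands are set by you beforehand; for example (illustrative values taken from elsewhere in this README):

```bash
TAG=23Jul13              # label for this round of jobs / output directories
YAML=skimmer_23_10_02    # name of a config under src/condor/submit_configs/ (without the .yaml extension)
```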

To submit (if not using the `--submit` flag):

```bash
nohup bash -c 'for i in condor/'"${TAG}"'/*.jdl; do condor_submit $i; done' &> tmp/submitout.txt &
```
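Once submitted, job status can be monitored with the standard HTCondor tools, e.g.:

```bash
condor_q   # list your queued and running jobs
```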

### Dask

Log in with ssh tunneling:

```bash
ssh -L 8787:localhost:8787 cmslpc-sl7.fnal.gov
```

Run the `./shell` script, as set up above via `lpcjobqueue`:

```bash
./shell coffeateam/coffea-dask:0.7.21-fastjet-3.4.0.1-g6238ea8
```

Renew your grid certificate:

```bash
voms-proxy-init --rfc --voms cms -valid 192:00
```
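You can confirm the proxy was created and check its remaining lifetime with the standard VOMS tool:

```bash
voms-proxy-info -all   # show proxy attributes and time left
```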

Run the job submission script:

```bash
python -u -W ignore src/run.py --year 2022EE --yaml src/condor/submit_configs/skimmer_23_10_02.yaml --processor skimmer --nano-version v11 --region signal --save-array --executor dask > dask.out 2>&1
```

### Condor Scripts

#### Check jobs

Check that all jobs completed by going through output files:

```bash
for year in 2016APV 2016 2017 2018; do python src/condor/check_jobs.py --tag $TAG --processor trigger (--submit) --year $year; done
```

e.g.

```bash
python src/condor/check_jobs.py --year 2018 --tag Oct9 --processor matching --check-running --user cmantill --submit-missing
```

#### Combine pickles

Combine all output pickles into one:

```bash
for year in 2016APV 2016 2017 2018; do python src/condor/combine_pickles.py --tag $TAG --processor trigger --r --year $year; done
```

## Post-processing

### Create Datacard

You need `root==6.22.6` and https://github.com/rkansal47/rhalphalib installed (see below for how to install both).

```bash
python3 postprocessing/CreateDatacard.py --templates-dir templates/$TAG --model-name $TAG
```

### Combine

#### CMSSW + Combine Quickstart

```bash
cmsrel CMSSW_11_3_4
cd CMSSW_11_3_4/src
cmsenv
# need my fork until regex for float parameters is merged and released
git clone -b regex-float-parameters https://github.com/rkansal47/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
git clone -b v2.0.0 https://github.com/cms-analysis/CombineHarvester.git CombineHarvester
# Important: this scram has to be run from the src dir
scramv1 b clean; scramv1 b
# rhalphalib to create the datacards:
git clone https://github.com/rkansal47/rhalphalib
cd rhalphalib
pip install -e .  # editable installation
```

I also add the combine folder to my `PATH` in my `.bashrc` for convenience:

```bash
export PATH="$PATH:/uscms_data/d1/rkansal/hh4b/HH4b/src/HH4b/combine"
```
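After editing your `.bashrc` (the path above points to my own area; adjust it to your checkout), reload it so the change takes effect in the current shell:

```bash
source ~/.bashrc
```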

### Run fits and diagnostics locally

Everything is run via the script below, which takes a number of options (see the script):

```bash
run_blinded_hh4b.sh --workspace --bfit --limits
```

## License

MIT License

