AlphaTwirl + uproot for the Z inv. width analysis

Z invisible analysis

This code processes CMS event-based data and simulation stored in a flat ROOT TTree format (i.e. branches correspond to simple data types such as bool, int, float, ... or a std::vector of these data types). Typically, the input is nanoAOD. The output is one or more dataframes with similar data types (excluding vectors), whose columns are either taken directly from the nanoAOD files or derived from these variables to create an analysis-level dataframe.

This is achieved by reading in nanoAOD files with uproot, applying a set of modules to generate derived variables, and storing these in a dataframe saved to disk. YAML config files are passed to define the input data, modules and output.

Usage

Install with pip:

pip install zinv-analysis

or in editable mode to alter the code:

git clone git@github.com:shane-breeze/zinv-analysis.git
cd zinv-analysis
pip install -e .

Either run with the CLI

zinv_analysis.py --help

or the python API

import zinv
help(zinv.modules.analyse)

Layout

Interfaces

Interfaces to the underlying code are located in analyse.py and resume.py.

Scripts using these functions are found in zinv/scripts/.

Modules

A set of modules that create derived variables is found in zinv/modules/readers. These modules are applied to the data with the [alphatwirl](https://github.com/alphatwirl/alphatwirl) package, and each contains a class that may define begin, event and end methods.

The begin method is run at the start of processing the data to initialise any required parameters. The EventTools module adds a register_function method to the event that allows functions to be cached for lazy evaluation (e.g. the JEC variations function is not run if the JEC variations are not saved in the output).
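As a rough illustration of the lazy evaluation described above, the sketch below registers a function on an event object and only runs it when its result is requested. The Event class, the evaluate method and the jec_variations function are hypothetical stand-ins, and the register_function signature is assumed; none of this is taken from the zinv code.

```python
class Event:
    """Hypothetical stand-in for the event object (not the zinv class)."""

    def __init__(self):
        self._functions = {}
        self._cache = {}

    def register_function(self, name, function):
        # Store the function without calling it (assumed signature).
        self._functions[name] = function

    def evaluate(self, name):
        # Run the function on first request only, then reuse the cached result.
        if name not in self._cache:
            self._cache[name] = self._functions[name](self)
        return self._cache[name]


calls = []


def jec_variations(event):
    # Placeholder for an expensive calculation; records that it was run.
    calls.append("jec")
    return [1.02, 0.98]


event = Event()
event.register_function("jec_variations", jec_variations)
calls_after_registering = list(calls)  # nothing has been evaluated yet

first = event.evaluate("jec_variations")
second = event.evaluate("jec_variations")  # served from the cache
```

If nothing ever requests jec_variations, the function body is never executed, which is the behaviour the JEC example in the text relies on.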

The event method is applied on each iteration over the input data. Each iteration corresponds to a chunk of events, which uproot loads into numpy arrays. This is where derived variables are evaluated. However, because of the lazy evaluation, this method is typically empty for most modules.
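The order in which the three methods are called can be sketched as follows. This is a simplified driver loop written for illustration, not alphatwirl itself; the MetSquared module, the run_sequence function and the use of plain dicts of lists in place of numpy arrays are all assumptions. The module here computes its variable eagerly in event, purely to make the per-chunk call visible.

```python
class MetSquared:
    """Hypothetical module with the begin, event and end hooks."""

    def begin(self, event):
        # Called once before the first chunk.
        self.chunks_seen = 0

    def event(self, event):
        # Called once per chunk; event["met"] holds one value per event.
        self.chunks_seen += 1
        event["met_squared"] = [met * met for met in event["met"]]

    def end(self):
        # Called once after the last chunk.
        pass


def run_sequence(modules, chunks):
    """Simplified driver loop over chunks (a stand-in, not alphatwirl)."""
    event = {}
    for module in modules:
        module.begin(event)
    outputs = []
    for chunk in chunks:
        # Replace the event contents with the next chunk of data.
        event.clear()
        event.update(chunk)
        for module in modules:
            module.event(event)
        outputs.append(dict(event))
    for module in modules:
        module.end()
    return outputs


module = MetSquared()
chunks = [{"met": [1.0, 2.0]}, {"met": [3.0]}]
outputs = run_sequence([module], chunks)
```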

The end method is applied at the end of processing to clean up anything that needs to be cleared. If the code is run in multiprocessing or batch processing mode then modules are serialised. Lambda functions are not serialisable and hence must be created in the begin method and cleared in the end method.
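The serialisation constraint can be demonstrated with the standard pickle module. In this minimal sketch, the ScaleJets module and the is_serialisable helper are hypothetical, and pickle is assumed to be representative of the serialiser used in multiprocessing or batch mode. The helper pickles the module's attribute dict, which is the state pickle stores for a plain class instance.

```python
import pickle


class ScaleJets:
    """Hypothetical module that needs a helper function while running."""

    def __init__(self, factor):
        self.factor = factor

    def begin(self, event):
        factor = self.factor
        # The lambda only exists between begin and end.
        self.scale = lambda pt: pt * factor

    def event(self, event):
        pass

    def end(self):
        # Drop the lambda so the module can be serialised again.
        self.scale = None


def is_serialisable(module):
    """Try to pickle the module's attributes (its instance state)."""
    try:
        pickle.dumps(vars(module))
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


module = ScaleJets(factor=1.1)
before = is_serialisable(module)   # no lambda yet
module.begin(event=None)
scaled = module.scale(100.0)       # the lambda is usable while running
during = is_serialisable(module)   # the lambda blocks pickling
module.end()
after = is_serialisable(module)    # the lambda has been cleared
```

The module is serialisable before begin and after end, but not in between, which is why the lambda must not be created in the constructor or left behind after processing.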

Output

A special module defines the output. Currently this is HDF5.py. Instead of creating derived variables, this module evaluates the previously defined functions and stores the results in a .h5 file using pandas. The actual output is defined by the yaml config.

Config

The yaml config is defined externally by the user and controls where the datasets are found, which modules are applied and what is written to the output dataframes. With this flexibility, extra care must be taken that modules which depend on each other are all included and in the correct order. For example, if the JEC variations are saved by the HDF5 module, then the JECVariation module must be included in the sequence before the output module.
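One way to catch ordering mistakes before running is a small check over the module sequence. The sketch below is not part of zinv: the check_sequence function and the requirements mapping are hypothetical, and the sequence is written as a plain Python list rather than read from a yaml file, whose schema is not described here.

```python
def check_sequence(sequence, requirements):
    """Return a message for each module that appears before one it needs."""
    seen = set()
    problems = []
    for name in sequence:
        for needed in requirements.get(name, ()):
            if needed not in seen:
                problems.append(
                    f"{name} requires {needed} earlier in the sequence"
                )
        seen.add(name)
    return problems


# Hypothetical dependency: the output module saves the JEC variations.
requirements = {"HDF5": ["JECVariation"]}

good_order = ["EventTools", "JECVariation", "HDF5"]
bad_order = ["EventTools", "HDF5", "JECVariation"]

good_problems = check_sequence(good_order, requirements)
bad_problems = check_sequence(bad_order, requirements)
```

A missing module is reported in the same way as a misplaced one, since it is never seen before the module that depends on it.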
