Transforming Simulation to Data Without Pairing

This repository contains the code accompanying our paper and poster "Transforming Simulation to Data Without Pairing", presented at the Machine Learning and the Physical Sciences (ML4PS) workshop at NeurIPS 2024. Please see the paper for details: https://arxiv.org/abs/2504.12343.

Overview

Our code trains and evaluates a conditional normalizing flow (CNF) that maps between Monte Carlo (MC) and data samples of $H\to\gamma\gamma$ background events at the Large Hadron Collider (LHC). We aim to show that the CNF can replace the detector response simulation (one of the most expensive steps in the MC simulation pipeline) by converting a truth-level MC sample directly into an estimated data sample. The challenge is that the MC and data samples are unpaired (i.e., they are not truth-level and detector-level representations of the same events), so we do not expect the model to learn the detector response for individual events. Instead, the model must learn a distribution of corrections conditioned on the MC events such that the distribution of the corrections added to the conditioning events matches the target data distribution.
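To make the sampling step concrete, here is a minimal sketch of the "corrections conditioned on MC events" idea. It is not the repository's model: the single conditional affine layer, the feature count, and all names are illustrative stand-ins for a full conditional normalizing flow.

```python
# Minimal sketch, assuming PyTorch. One conditional affine transform
# stands in for the full CNF used in the paper.
import torch
import torch.nn as nn

class ConditionalAffineFlow(nn.Module):
    """Samples a correction: mu(cond) + exp(log_sigma(cond)) * z, z ~ N(0, I)."""
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * n_features),  # predicts mu and log_sigma
        )

    def sample_correction(self, cond: torch.Tensor) -> torch.Tensor:
        mu, log_sigma = self.net(cond).chunk(2, dim=-1)
        z = torch.randn_like(mu)            # latent noise
        return mu + log_sigma.exp() * z     # one correction per MC event

flow = ConditionalAffineFlow(n_features=4)
mc_events = torch.randn(1000, 4)            # stand-in for truth-level MC
estimated_data = mc_events + flow.sample_correction(mc_events)
```

A real CNF stacks many such conditional transforms; training adjusts them so that `mc_events + correction` is distributed like the data sample, not so that any single event is reproduced.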

Code

Normalizing Flow (nf directory)

The nf directory contains the code to train and evaluate a CNF, along with further documentation describing how to run the code and adjust hyperparameters. The main training script train_cond.py implements several training objectives for the CNF: the log loss (the standard negative log-likelihood for normalizing flows), the KL divergence evaluated on 1D projections, the Maximum Mean Discrepancy, and the Modified Differential Multiplier Method (MDMM), which combines objectives as a constrained optimization. We found that MDMM offers the best performance because it enables the model to simultaneously learn the distributions of individual observables and the correlations between observables.
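For intuition, below is a hedged sketch of an MDMM-style update, not the actual logic in train_cond.py. The toy quadratic objective and constraint stand in for the flow's log loss and a 1D-KL term; all names, learning rates, and the damping coefficient are illustrative.

```python
import torch

def mdmm_loss(objective, constraint, lmbda, damping=10.0):
    # Damped Lagrangian (Platt & Barr, 1988):
    # minimize `objective` subject to `constraint == 0`
    return objective + lmbda * constraint + 0.5 * damping * constraint ** 2

# Toy stand-ins: minimize theta^2 subject to theta == 1.
theta = torch.tensor([2.0], requires_grad=True)  # stand-in for flow weights
lmbda = torch.tensor([0.0], requires_grad=True)  # Lagrange multiplier

opt_theta = torch.optim.SGD([theta], lr=0.05)
opt_lmbda = torch.optim.SGD([lmbda], lr=0.05)

for _ in range(200):
    objective = theta.pow(2).sum()      # e.g. the flow's log loss
    constraint = (theta - 1.0).sum()    # e.g. a 1D-KL term minus its target
    loss = mdmm_loss(objective, constraint, lmbda)
    opt_theta.zero_grad()
    opt_lmbda.zero_grad()
    loss.backward()
    lmbda.grad.neg_()                   # gradient *ascent* on the multiplier
    opt_theta.step()
    opt_lmbda.step()

print(theta.item(), lmbda.item())       # theta -> 1, the constrained optimum
```

The key move is the sign flip: descending the Lagrangian in the model parameters while ascending in the multiplier drives the constraint toward zero, instead of fixing a hand-tuned weight between the two losses.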

Discriminating Neural Network (classifier directory)

As an additional test of the CNF's performance, we trained a discriminating neural network ("classifier") to separate generated events from data events. Poor classifier performance corresponds to good CNF performance and vice versa: an AUC near 0.5 means the classifier cannot distinguish the generated sample from data. All the code for training and evaluating the classifiers is contained in the classifier directory, along with further documentation.
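A minimal sketch of this kind of classifier two-sample test is shown below; it is not the repository's classifier code. The Gaussian toy samples, network size, and train/test split are illustrative stand-ins for CNF output and data events.

```python
# Illustrative two-sample test, assuming scikit-learn.
# AUC near 0.5 => the classifier cannot tell generated events from data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
generated = rng.normal(0.0, 1.0, size=(5000, 4))  # stand-in for CNF output
data = rng.normal(0.05, 1.0, size=(5000, 4))      # stand-in for real data

X = np.vstack([generated, data])
y = np.concatenate([np.zeros(len(generated)), np.ones(len(data))])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=200).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(f"AUC = {auc:.3f}  (0.5 means indistinguishable)")
```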

Acknowledgements

Much of the code in this repository is based on code developed for an earlier study of normalizing flows, which is available at https://github.com/allixu/normalizing_flow_for_detector_response. Please see the references section of our paper for additional citations.
