Update README.md
README.md
CHANGED
|
@@ -1,3 +1,89 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: mit
|
| 3 |
-
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
datasets:
|
| 4 |
+
- jet-universe/jetclass2
|
| 5 |
+
tags:
|
| 6 |
+
- particle physics
|
| 7 |
+
- jet tagging
|
| 8 |
+
---
|
| 9 |
+
|
| 10 |
+
# Model Card: SophonAK4
|
| 11 |
+
|
| 12 |
+
<!-- Provide a quick summary of what the model is/does. -->
|
| 13 |
+
|
| 14 |
+
The **SophonAK4** model is a *realistic* small-radius (*R* = 0.4) anti-*k*<sub>T</sub> jet tagger developed for [fast-simulation (Delphes) datasets under the JetClass-II configuration](https://github.com/jet-universe/jetclass2_generation?tab=readme-ov-file#delphes-step), designed to emulate the CMS detector conditions at the LHC.
|
| 15 |
+
|
| 16 |
+
Here, *realistic* indicates that the model achieves tagging performance comparable to state-of-the-art jet taggers used in the ATLAS and CMS experiments.
|
| 17 |
+
|
| 18 |
+
The model is constructed to cover a broad range of final states, including partons and leptons of various flavors and charges.
|
| 19 |
+
|
| 20 |
+
## Model Details
|
| 21 |
+
|
| 22 |
+
**SophonAK4** is trained using a multi-class classification approach based on di-*X* resonance processes, where the resonance *X* decays into multiple *two-prong* final states. Truth labelling is performed by associating reconstructed anti-*k*<sub>T</sub> jets with partons or leptons originating from these two-prong decays.
|
| 23 |
+
|
| 24 |
+
A total of 23 jet labels are defined:
|
| 25 |
+
|
| 26 |
+
- **Single-prong labels**: \\(b\\), \\(\bar{b}\\), \\(c\\), \\(\bar{c}\\), \\(s\\), \\(\bar{s}\\), \\(d\\), \\(\bar{d}\\), \\(u\\), \\(\bar{u}\\), \\(g\\), \\(e^-\\), \\(e^+\\), \\(\mu^-\\), \\(\mu^+\\), \\(\tau_{\rm h}^-\\), and \\(\tau_{\rm h}^+\\). These correspond to cases where a single truth particle (either a parton or a lepton) is matched to the jet within Δ*R*(jet, particle) < 0.4, while the other particle from the same resonance decay is not matched to the jet.
|
| 27 |
+
|
| 28 |
+
- **Two-prong labels**: \\(b\bar{b}\\), \\(c\bar{c}\\), \\(s\bar{s}\\), \\(d\bar{d}\\), \\(u\bar{u}\\), and \\(gg\\). These labels are assigned when both particles from the same resonance decay are matched within the same jet.
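The matching rule behind these labels can be illustrated with a minimal Python sketch. It is not the authors' labelling code: the ASCII label names (`"bbar"`, `"bb"`, ...), the tuple layout, and the function names are assumptions made for illustration. The text does not say what happens when two leptons fall in the same jet, so the sketch simply returns `None` in that case.

```python
import math

R_MATCH = 0.4  # matching radius quoted in the text

# Hypothetical ASCII names for the six two-prong labels.
TWO_PRONG = {
    frozenset(("b", "bbar")): "bb",
    frozenset(("c", "cbar")): "cc",
    frozenset(("s", "sbar")): "ss",
    frozenset(("d", "dbar")): "dd",
    frozenset(("u", "ubar")): "uu",
    frozenset(("g",)): "gg",  # a gg pair collapses to a one-element set
}


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def assign_label(jet, decay_products):
    """Return a truth label for one jet, or None if no label applies.

    jet: (eta, phi) of the reconstructed jet.
    decay_products: the two (name, eta, phi) tuples from one X decay.
    """
    matched = [name for name, eta, phi in decay_products
               if delta_r(jet[0], jet[1], eta, phi) < R_MATCH]
    if len(matched) == 2:
        # Both decay products in the same jet: two-prong label, if one exists.
        return TWO_PRONG.get(frozenset(matched))
    if len(matched) == 1:
        # Exactly one decay product in the jet: single-prong label.
        return matched[0]
    return None
```

For instance, `assign_label((0.0, 0.0), [("b", 0.1, 0.1), ("bbar", 2.0, 2.0)])` yields the single-prong label `"b"`, because only the first decay product lies within Δ*R* < 0.4 of the jet.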
|
| 29 |
+
|
| 30 |
+
|
| 31 |
+
## Uses
|
| 32 |
+
|
| 33 |
+
### Integrating SophonAK4/Sophon Models
|
| 34 |
+
|
| 35 |
+
The **SophonAK4** model, together with the [**Sophon**](https://huggingface.co/jet-universe/sophon) model, provides a realistic benchmark for jet tagging on fast-simulation (Delphes) datasets, achieving performance comparable to state-of-the-art taggers used in the ATLAS and CMS experiments.
|
| 36 |
+
|
| 37 |
+
- For an example of integrating them into C++ workflows to analyze Delphes files, see [[here]](https://github.com/jet-universe/sophon?tab=readme-ov-file#using-sophon-model-pythonc). (Note: SophonAK4 support is planned from April 2025.)
|
| 38 |
+
|
| 39 |
+
- For an example of integrating these models into the Delphes processing workflow, see the following GitHub repository: https://github.com/jet-universe/delphes/tree/jet-models (note: expected to be available from May 2025)
|
| 40 |
+
|
| 41 |
+
|
| 42 |
+
## Evaluation
|
| 43 |
+
|
| 44 |
+
The performance of SophonAK4 is evaluated using Standard Model \\(t\bar{t}\\) events to enable direct comparison with performance benchmarks from ATLAS and CMS.
|
| 45 |
+
Details are provided in [[Appendix B of the paper]](https://arxiv.org/html/2503.00118#:~:text=B.2,Performance%20of%20SophonAK4) and are summarized below.
|
| 46 |
+
|
| 47 |
+
For *b*- and *c*-tagging, genuine *b*, *c*, and light-flavor jets are selected via jet-parton matching as implemented in Delphes. Jets are required to satisfy *p*<sub>T</sub> > 30 GeV and |*η*| < 2.5, consistent with CMS configurations.
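As a minimal sketch, the kinematic selection quoted above can be written as a simple filter. The dictionary layout, constant names, and example values are hypothetical and not taken from the Delphes or Sophon code.

```python
PT_MIN_GEV = 30.0   # pT threshold from the text, in GeV
ABS_ETA_MAX = 2.5   # |eta| acceptance from the text


def passes_selection(pt, eta):
    """True if a jet satisfies pT > 30 GeV and |eta| < 2.5."""
    return pt > PT_MIN_GEV and abs(eta) < ABS_ETA_MAX


# Illustrative jets (made-up values).
jets = [
    {"pt": 45.0, "eta": 1.2},
    {"pt": 25.0, "eta": 0.3},   # fails the pT cut
    {"pt": 80.0, "eta": -2.7},  # fails the |eta| cut
]
selected = [j for j in jets if passes_selection(j["pt"], j["eta"])]
```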
|
| 48 |
+
|
| 49 |
+
The following *b*-tagging discriminant is constructed from the raw output scores of **SophonAK4** to evaluate *b* vs. light and *b* vs. *c* jet performance:
|
| 50 |
+
|
| 51 |
+
- \\(\text{discr (SophonAK4 $b$ tagging)} = g_{b} + g_{\bar{b}} + g_{b\bar{b}}.\\)
|
| 52 |
+
|
| 53 |
+
The following *c*-tagging discriminants are defined for *c* vs. light and *c* vs. *b* jets, respectively:
|
| 54 |
+
|
| 55 |
+
- \\(\text{discr (SophonAK4 $c$ tagging)} = g_{c} + g_{\bar{c}} + g_{c\bar{c}},\\)
|
| 56 |
+
- \\(\text{discr (SophonAK4 $c$ vs. $b$ tagging)} = \frac{g_{c} + g_{\bar{c}} + g_{c\bar{c}}}{g_{c} + g_{\bar{c}} + g_{c\bar{c}} + g_{b} + g_{\bar{b}} + g_{b\bar{b}}}.\\)
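The *b*- and *c*-tagging discriminants defined in this section can be computed from the per-jet scores as in the sketch below. It assumes the scores \\(g_i\\) are available as a dictionary keyed by hypothetical ASCII label names (`"b"`, `"bbar"`, `"bb"`, `"c"`, `"cbar"`, `"cc"`); the actual output names and ordering of the model are not specified here. The zero-denominator guard is an added safety choice, not part of the definition.

```python
def b_discriminant(g):
    """b-tagging discriminant: g_b + g_bbar + g_bb."""
    return g["b"] + g["bbar"] + g["bb"]


def c_discriminant(g):
    """c-tagging discriminant (c vs. light): g_c + g_cbar + g_cc."""
    return g["c"] + g["cbar"] + g["cc"]


def c_vs_b_discriminant(g):
    """c vs. b discriminant: c-sum divided by (c-sum + b-sum)."""
    c_sum = c_discriminant(g)
    b_sum = b_discriminant(g)
    total = c_sum + b_sum
    # Guard against a vanishing denominator (assumption, see lead-in).
    return c_sum / total if total > 0 else 0.0


# Illustrative scores for one jet (made-up values).
scores = {"b": 0.2, "bbar": 0.1, "bb": 0.1, "c": 0.3, "cbar": 0.05, "cc": 0.05}
discr_b = b_discriminant(scores)
discr_c = c_discriminant(scores)
discr_cvb = c_vs_b_discriminant(scores)
```

The ratio form bounds the *c* vs. *b* discriminant between 0 and 1 whenever the scores are non-negative.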
|
| 57 |
+
|
| 58 |
+
1. The ROC performance for *b* vs. light/*c* jets and *c* vs. light/*b* jets is shown below and can be compared to [CMS benchmarks](https://cds.cern.ch/record/2904702) (Figs. 1 and 3 for the *tt̅* process).
|
| 59 |
+
|
| 60 |
+

|
| 61 |
+
|
| 62 |
+
> Conclusion
|
| 63 |
+
> - The *b* vs. light jet performance is slightly below that of the widely adopted DeepJet tagger in CMS.
|
| 64 |
+
> - The *b* vs. *c* and *c* vs. light/*b* jet performances fall between those of the DeepJet and ParticleNet taggers in CMS.
|
| 65 |
+
> - Similar trends are found when comparing with the widely adopted ATLAS DL1r tagger; see [Appendix B of the paper](https://arxiv.org/html/2503.00118#:~:text=B.2,Performance%20of%20SophonAK4).
|
| 66 |
+
|
| 67 |
+
2. Performance across different *p*<sub>T</sub> and |*η*| regions is benchmarked below and can be compared with [CMS benchmarks](https://cds.cern.ch/record/2904702) (Figs. 17, 19, 21, 23, 25, 27, 29, and 31).
|
| 68 |
+
|
| 69 |
+

|
| 70 |
+
|
| 71 |
+
> Conclusion
|
| 72 |
+
> - Tagging performance degrades in the low-*p*<sub>T</sub> and high-|*η*| regions but reaches a plateau beyond the turn-on point, indicating that the **SophonAK4** tagger exhibits realistic flavor-tagging behavior across kinematic regimes.
|
| 73 |
+
|
| 74 |
+
|
| 75 |
+
## Citation
|
| 76 |
+
|
| 77 |
+
If you find the SophonAK4 model useful in your research, please cite:
|
| 78 |
+
|
| 79 |
+
```
|
| 80 |
+
@article{Zhao:2025rci,
|
| 81 |
+
author = "Zhao, Yuzhe and Li, Congqiao and Agapitos, Antonios and Fu, Dawei and Gao, Leyun and Mao, Yajun and Li, Qiang",
|
| 82 |
+
title = "{Novel $|V_{cb}|$ extraction method via boosted $bc$-tagging with in-situ calibration}",
|
| 83 |
+
eprint = "2503.00118",
|
| 84 |
+
archivePrefix = "arXiv",
|
| 85 |
+
primaryClass = "hep-ph",
|
| 86 |
+
month = "2",
|
| 87 |
+
year = "2025"
|
| 88 |
+
}
|
| 89 |
+
```
|