jet-universe/sophon-ak4

@@ -42,10 +42,10 @@ The **SophonAK4** model, together with the [**Sophon**](https://huggingface.co/j
 ## Evaluation
-The performance of SophonAK4 is evaluated using the standard model \\(t\bar{t}\\) events to enable direct comparison with performance benchmarks from ATLAS and CMS.
-Details are provided in the [[Appendix B of the paper]](https://arxiv.org/html/2503.00118#:~:text=B.2,Performance%20of%20SophonAK4), and are summarized below.
-For *b*- and *c*-tagging, genuine *b*, *c*, and light-flavor jets are selected via jet-parton matching as implemented in Delphes. Jets are required to satisfy *p*<sub>T</sub> > 30 GeV and |*η*| < 2.5, consistent with CMS configurations.
 The following *b*-tagging discriminant is constructed from **SophonAK4**'s raw output scores to evaluate *b* vs. light and *b* vs. *c* jet performance:
@@ -58,16 +58,23 @@ The following *c*-tagging discriminants are defined for *c* vs. light and *c* vs
 1. The ROC performance for *b* vs. light/*c* jets and *c* vs. light/*b* jets is shown below and can be compared to [CMS benchmarks](https://cds.cern.ch/record/2904702) (Figs. 1 and 3 for the *tt̅* process).
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/6369850885fef3ca96e6dd63/H5FPFm4ZnwVbNshrPQDg_.png)
 > Conclusion
->  - The *b* vs. light jet performance is slightly below that of the widely-adopted DeepJet tagger in CMS.
->  - The *b* vs. *c* and *c* vs. light/*b* jet performances fall between DeepJet and ParticleNet taggers in CMS.
->  - Similar trends are found by comparing with ATLAS's widely-adopted DL1r tagger, see [Appendix B of the paper](https://arxiv.org/html/2503.00118#:~:text=B.2,Performance%20of%20SophonAK4).
 2. Performance across different *p*<sub>T</sub> and |*η*| regions is benchmarked below and can be compared with [CMS benchmarks](https://cds.cern.ch/record/2904702) (Figs. 17, 19, 21, 23, 25, 27, 29, and 31).
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/6369850885fef3ca96e6dd63/XTAce2zjvwzLhe-_OL29r.png)
 > Conclusion
 >  - Tagging performance degrades in the low-*p*<sub>T</sub> and high-|*η*| regions but reaches the plateau beyond the turn-on point, indicating that the **SophonAK4** tagger exhibits realistic flavor-tagging behavior across kinematic regimes.

 ## Evaluation
+The performance of SophonAK4 is evaluated using Standard Model *tt̅* events to enable direct comparison with performance benchmarks from ATLAS and CMS.
+Details are summarized below.
+For *b*- and *c*-tagging, genuine *b*, *c*, and light-flavor jets are selected via ghost-association following the CMS convention. Jets are required to satisfy *p*<sub>T</sub> > 30 GeV and |*η*| < 2.5, consistent with CMS configurations.
 The following *b*-tagging discriminant is constructed from **SophonAK4**'s raw output scores to evaluate *b* vs. light and *b* vs. *c* jet performance:
 1. The ROC performance for *b* vs. light/*c* jets and *c* vs. light/*b* jets is shown below and can be compared to [CMS benchmarks](https://cds.cern.ch/record/2904702) (Figs. 1 and 3 for the *tt̅* process).
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6369850885fef3ca96e6dd63/VgwRLCuQP2wJoKaPhNhrV.png)
 > Conclusion
+>  - The *b*- and *c*-tagging performance of SophonAK4, evaluated on the Delphes simulation, agrees closely with that of the CMS taggers: it falls between **DeepJet** and **UParT** and is generally comparable to **ParticleNet**.
+<details>
+<summary>Click here to show the ROC curves with CMS results overlaid.</summary>
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6369850885fef3ca96e6dd63/0hdH8ADxSZwKOmIluca_L.png)
+</details>
 2. Performance across different *p*<sub>T</sub> and |*η*| regions is benchmarked below and can be compared with [CMS benchmarks](https://cds.cern.ch/record/2904702) (Figs. 17, 19, 21, 23, 25, 27, 29, and 31).
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6369850885fef3ca96e6dd63/wUWfD_D-Y8phSfsc4-ey2.png)
 > Conclusion
 >  - Tagging performance degrades in the low-*p*<sub>T</sub> and high-|*η*| regions but reaches the plateau beyond the turn-on point, indicating that the **SophonAK4** tagger exhibits realistic flavor-tagging behavior across kinematic regimes.
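
The evaluation steps described above (kinematic selection, ROC points, and efficiency per *p*<sub>T</sub> bin) can be sketched roughly as follows. This is a minimal illustration with NumPy, not the authors' evaluation code: the function names, the threshold convention (a jet is tagged when its discriminant exceeds the threshold), and the array inputs are all assumptions, and the discriminant values are taken as already computed from the model's output scores.

```python
import numpy as np


def select_jets(pt, eta, pt_min=30.0, abs_eta_max=2.5):
    """Boolean mask for jets with pT > pt_min (GeV) and |eta| < abs_eta_max.

    The default cuts mirror the selection quoted in the text; the function
    itself is a hypothetical helper.
    """
    pt = np.asarray(pt, dtype=float)
    eta = np.asarray(eta, dtype=float)
    return (pt > pt_min) & (np.abs(eta) < abs_eta_max)


def roc_points(disc_signal, disc_background, thresholds):
    """Signal efficiency and background mistag rate at each threshold.

    Assumption: a jet counts as tagged when its discriminant is strictly
    greater than the threshold.
    """
    sig = np.asarray(disc_signal, dtype=float)
    bkg = np.asarray(disc_background, dtype=float)
    thr = np.asarray(thresholds, dtype=float)
    # Compare every threshold against every jet, then average over jets.
    efficiency = (sig[None, :] > thr[:, None]).mean(axis=1)
    mistag_rate = (bkg[None, :] > thr[:, None]).mean(axis=1)
    return efficiency, mistag_rate


def efficiency_in_pt_bins(pt, disc, threshold, bin_edges):
    """Tagging efficiency per pT bin at a fixed discriminant threshold.

    Bins with no jets are reported as NaN.
    """
    pt = np.asarray(pt, dtype=float)
    disc = np.asarray(disc, dtype=float)
    bin_index = np.digitize(pt, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    efficiency = np.full(n_bins, np.nan)
    for i in range(n_bins):
        in_bin = bin_index == i
        if in_bin.any():
            efficiency[i] = (disc[in_bin] > threshold).mean()
    return efficiency
```

Plotting `mistag_rate` against `efficiency` for genuine *b* jets versus light or *c* jets gives curves of the kind shown in item 1, and applying `efficiency_in_pt_bins` at a fixed working point gives the kind of turn-on behavior discussed in item 2.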