BioGeek committed on
Commit 916f201 · verified · 1 Parent(s): b826cbb

Fix code, update citation references.
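
The "fix code" part of this commit can be checked without downloading anything. The sketch below copies the download call from both sides of the diff into strings and asks Python's `ast` module whether each one parses. The indentation of the arguments and the helper names `parses` and `assigned_names` are assumptions made for illustration; `Path` and `snapshot_download` are assumed to be imported earlier in the README and are never executed here.

```python
import ast

# Download call as it appeared before the commit: a stray ")" closes the
# call early, so the remaining keyword lines are no longer valid Python.
OLD_SNIPPET = '''
snapshot_download(
    repo_id="InstaDeepAI/winnow-helaqc-model",
    allow_patterns=["*.pkl"]),
    repo_type="model",
    local_dir=helaqc_model,
)
'''

# Download call after the commit: the parenthesis is gone and
# `helaqc_model` is defined before it is used.
NEW_SNIPPET = '''
helaqc_model = Path("helaqc_model")
snapshot_download(
    repo_id="InstaDeepAI/winnow-helaqc-model",
    allow_patterns=["*.pkl"],
    repo_type="model",
    local_dir=helaqc_model,
)
'''


def parses(source: str) -> bool:
    """Return True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


def assigned_names(source: str) -> set[str]:
    """Collect the names that the source assigns to."""
    tree = ast.parse(source)
    return {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    }


if __name__ == "__main__":
    print("old snippet parses:", parses(OLD_SNIPPET))
    print("new snippet parses:", parses(NEW_SNIPPET))
    print("names defined by new snippet:", assigned_names(NEW_SNIPPET))
```

Parsing only checks syntax, so this does not confirm that the repository or the `*.pkl` pattern resolves to real files; it only shows why the old README example could not be pasted and run as written.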

Files changed (1)
  1. README.md +42 -6
README.md CHANGED
@@ -14,7 +14,7 @@ tags:
  ## Winnow HeLa Single Shot Probability Calibrator
 
  **Winnow** recalibrates confidence scores and provides FDR control for *de novo* peptide sequencing (DNS) workflows.
- This repository contains the calibrator trained on HeLa Single Shot data as referenced in our paper: TODO.
 
  - Intended inputs: spectrum input data and corresponding MS/MS PSM results produced by InstaNovo
  - Outputs: calibrated per-PSM probabilities in `calibrated_confidence`.
@@ -38,9 +38,10 @@ from winnow.scripts.main import filter_dataset
  from winnow.fdr.nonparametric import NonParametricFDRControl
 
  # 1) Download model files
  snapshot_download(
      repo_id="InstaDeepAI/winnow-helaqc-model",
-     allow_patterns=["*.pkl"]),
      repo_type="model",
      local_dir=helaqc_model,
  )
@@ -50,8 +51,8 @@ calibrator = ProbabilityCalibrator.load(helaqc_model)
 
  # 3) Load your dataset (InstaNovo-style config)
  dataset = InstaNovoDatasetLoader().load(
-     "path_to_spectrum_data.parquet",
-     "path_to_instanovo_predictions.csv",
  )
  dataset = filter_dataset(dataset) # standard Winnow filtering
 
 
@@ -118,5 +119,40 @@ winnow predict \
 
  ## Citation
 
- If you use Winnow or this model, please cite:
- TODO
 
  ## Winnow HeLa Single Shot Probability Calibrator
 
  **Winnow** recalibrates confidence scores and provides FDR control for *de novo* peptide sequencing (DNS) workflows.
+ This repository contains the calibrator trained on HeLa Single Shot data as referenced in our paper: [De novo peptide sequencing rescoring and FDR estimation with Winnow](https://arxiv.org/abs/2509.24952).
 
  - Intended inputs: spectrum input data and corresponding MS/MS PSM results produced by InstaNovo
  - Outputs: calibrated per-PSM probabilities in `calibrated_confidence`.
 
  from winnow.fdr.nonparametric import NonParametricFDRControl
 
  # 1) Download model files
+ helaqc_model = Path("helaqc_model")
  snapshot_download(
      repo_id="InstaDeepAI/winnow-helaqc-model",
+     allow_patterns=["*.pkl"],
      repo_type="model",
      local_dir=helaqc_model,
  )
 
 
  # 3) Load your dataset (InstaNovo-style config)
  dataset = InstaNovoDatasetLoader().load(
+     data_path="path_to_spectrum_data.parquet",
+     predictions_path="path_to_instanovo_predictions.csv",
  )
  dataset = filter_dataset(dataset) # standard Winnow filtering
 
 
 
  ## Citation
 
+ If you use `winnow` in your research, please cite our preprint: [De novo peptide sequencing rescoring and FDR estimation with Winnow](https://arxiv.org/abs/2509.24952)
+
+ ```bibtex
+ @article{mabona2025novopeptidesequencingrescoring,
+ title={De novo peptide sequencing rescoring and FDR estimation with Winnow},
+ author={Amandla Mabona and Jemma Daniel and Henrik Servais Janssen Knudsen and Rachel Catzel
+ and Kevin Michael Eloff and Erwin M. Schoof and Nicolas Lopez Carranza and Timothy P. Jenkins
+ and Jeroen Van Goey and Konstantinos Kalogeropoulos},
+ year={2025},
+ eprint={2509.24952},
+ archivePrefix={arXiv},
+ primaryClass={q-bio.QM},
+ url={https://arxiv.org/abs/2509.24952},
+ }
+ ```
+
+ If you use the `InstaNovo` model to generate predictions, please also cite: [InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments](https://doi.org/10.1038/s42256-025-01019-5)
+
+ ```bibtex
+ @article{eloff_kalogeropoulos_2025_instanovo,
+ title = {InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale
+ proteomics experiments},
+ author = {Eloff, Kevin and Kalogeropoulos, Konstantinos and Mabona, Amandla and Morell,
+ Oliver and Catzel, Rachel and Rivera-de-Torre, Esperanza and Berg Jespersen,
+ Jakob and Williams, Wesley and van Beljouw, Sam P. B. and Skwark, Marcin J.
+ and Laustsen, Andreas Hougaard and Brouns, Stan J. J. and Ljungars,
+ Anne and Schoof, Erwin M. and Van Goey, Jeroen and auf dem Keller, Ulrich and
+ Beguir, Karim and Lopez Carranza, Nicolas and Jenkins, Timothy P.},
+ year = 2025,
+ month = {Mar},
+ day = 31,
+ journal = {Nature Machine Intelligence},
+ doi = {10.1038/s42256-025-01019-5},
+ issn = {2522-5839},
+ url = {https://doi.org/10.1038/s42256-025-01019-5}
+ }
+ ```