# ESMCBA: Evolutionary Scale Modeling Binding Affinity

A model for binding affinity prediction of the peptide-MHC interaction.

- Code: https://github.com/sermare/ESMCBA
- Models: https://huggingface.co/smares/ESMCBA
This guide shows a new user how to:

- get the code
- download one or more model checkpoints from Hugging Face
- run predictions and extract hidden-layer embeddings with `embeddings.py`
- understand where files are stored and how to point to them
## Quick Start (run it in Colab)

You can run this guide as a notebook in Google Colab.
## Quick Start with pip

ESMCBA is now available on PyPI! Install it with a single command:

```bash
pip install esmcba
```
### Basic Usage

Once installed, you can run predictions directly from the command line:

```bash
esmcba --hla A0201 \
  --peptides KIQEGVVDYGA VLMSNLGMPS DTLRVEAFEYY \
  --encoding epitope \
  --output_dir ./outputs
```
### Complete Example

Here's a full example with multiple peptides for HLA-A*02:01:

```bash
esmcba --hla A0201 \
  --peptides KIQEGVVDYGA VLMSNLGMPS DTLRVEAFEYY AKKPTETI FKLNIKLLGVG \
    ETSNSFDVLK INVIVFDGKSK VDFCGKGYHLM AYPLTKHPNQ RAMPNMLRI \
    FIASFRLFA YIFFASFYYV SLIDFYLCFL FLTENLLLYI YMPYFFTLL \
    FLLPSLATV FLAFLLFLV YFIASFRLFA FFFLYENAFL FLIGCNYLG \
    YLATALLTL FLHFLPRV YLCFLAFLLF YLKLTDNVYI KLMGHFAWWT \
    TLMNVLTLV YLTNDVSFL FLPFAMGI LLADKFPV SMWSFNPET \
    LLMPILTLT LVAEWFLAYI FLYLYALVYF LMSFTVL MWLSYFIA \
    FLNGSCGSV LVLSVNPYV GLCVDIPGI \
  --encoding epitope \
  --output_dir ./outputs
```
### Output Files

After running, you'll find in your output directory:

- `A0201-ESMCBA_embeddings.npy` - raw ESM embeddings
- `A0201-ESMCBA_umap.csv` - UMAP visualization coordinates
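For a quick sanity check, you can load both files in Python. This is a minimal sketch: it assumes the `.npy` file holds one embedding row per input peptide and that the CSV contains the UMAP coordinates; verify the shapes against your own run.

```python
import numpy as np
import pandas as pd

# Raw embeddings: expected to be a 2D array with one row per peptide (assumption).
embeddings = np.load("./outputs/A0201-ESMCBA_embeddings.npy")
print("embeddings shape:", embeddings.shape)

# UMAP coordinates written alongside the embeddings.
umap_df = pd.read_csv("./outputs/A0201-ESMCBA_umap.csv")
print(umap_df.head())
```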
### Available Options

```bash
esmcba --help
```

Key parameters:

- `--hla`: HLA allele (e.g., A0201, B1402, C0501)
- `--peptides`: space-separated list of peptide sequences
- `--encoding`: encoding type (`epitope` or `hla`, default: `epitope`)
- `--output_dir`: directory for output files (default: `./outputs`)
- `--batch_size`: batch size for inference (default: 10)
- `--umap_dims`: UMAP dimensions, 2 or 3 (default: 2)
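If you need to run the CLI for several alleles in one go, a small driver script works well. This is a hypothetical sketch that shells out to the documented `esmcba` command; the allele list and peptides are placeholders for your own data.

```python
import subprocess

# Run the esmcba CLI once per allele, writing each run to its own folder.
peptides = ["KIQEGVVDYGA", "VLMSNLGMPS", "DTLRVEAFEYY"]
for hla in ["A0201", "B1402", "C0501"]:
    subprocess.run(
        ["esmcba", "--hla", hla,
         "--peptides", *peptides,
         "--encoding", "epitope",
         "--output_dir", f"./outputs/{hla}"],
        check=True,  # raise if a run fails
    )
```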
## 1. Requirements

- Python 3.9 or newer
- PyTorch 1.13+ or 2.x
- `huggingface_hub` for downloads
Install the basics:

```bash
# Install core PyTorch and Transformers ecosystem
pip install torch
pip install transformers
pip install esm

# Install Hugging Face Hub utilities
pip install "huggingface-hub<1.0"

# Optional: Install hf_transfer for faster large file downloads
pip install hf_transfer

pip install biopython umap-learn scikit-learn seaborn pandas matplotlib
```
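After installing, a quick import check confirms the environment is usable. This sketch only prints versions and GPU visibility:

```python
import torch
import transformers
import huggingface_hub

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("huggingface_hub:", huggingface_hub.__version__)
print("CUDA available:", torch.cuda.is_available())
```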
## 2. Get the code

```bash
git clone https://github.com/sermare/ESMCBA
```

Inside the repo you should have `embeddings.py` available. If your file is named `embeddings_generation.py`, use that name instead in the commands below.
## 3. Pick a model checkpoint (full list in section 9)

All checkpoints live in the model repo: `smares/ESMCBA`.

Examples of filenames:

```
ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_3_HLAB1402_2_1e-05_1e-06__1_B1402_0404_Hubber_B1402_final.pth
ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_6_HLAB1503_2_0.0001_1e-05__2_B1503_0404_Hubber_B1503_final.pth
ESMCBA_epitope_0.5_20_ESMMASK_epitope_FT_15_0.0001_1e-05_AUG_6_HLAB5101_5_0.001_1e-06__3_B5101_Hubber_B5101_final.pth
```

You can browse all files here: https://huggingface.co/smares/ESMCBA
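You can also list the checkpoints programmatically and filter by allele. A minimal sketch using `huggingface_hub.list_repo_files`; the substring filter on the allele tag is an assumption based on the naming scheme above:

```python
from huggingface_hub import list_repo_files

# All checkpoint files in the model repo.
files = [f for f in list_repo_files("smares/ESMCBA") if f.endswith(".pth")]

# Filter by the HLA tag embedded in the filename (e.g., "B1402").
b1402_ckpts = [f for f in files if "B1402" in f]
print(b1402_ckpts)
```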
## 4. Download a checkpoint

### Option A: download to a folder next to the code

```bash
# download everything to ./models
hf download smares/ESMCBA --repo-type model --local-dir ./models

# or just get one model
hf download smares/ESMCBA \
  "ESMCBA_epitope_0.8_30_ESMMASK_epitope_FT_5_0.001_1e-06_AUG_6_HLAA0201_2_0.001_1e-06__2_A0201_Hubber_A0201_final.pth" \
  --repo-type model \
  --local-dir ./models
```
### Option B: rely on the Hugging Face cache

If you omit `--local-dir`, files go into your HF cache, for example:

```
~/.cache/huggingface/hub/
```

To move the cache:

```bash
export HF_HOME=/path/to/cache
```
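The same cache is used when you download from Python, so a file fetched once is reused on later calls. A minimal sketch with `hf_hub_download`:

```python
from huggingface_hub import hf_hub_download

# Downloads into the HF cache (or reuses a cached copy) and returns the local path.
ckpt_path = hf_hub_download(
    repo_id="smares/ESMCBA",
    filename=(
        "ESMCBA_epitope_0.5_20_ESMMASK_epitope_FT_15_0.0001_1e-05_"
        "AUG_6_HLAB5101_5_0.001_1e-06__3_B5101_Hubber_B5101_final.pth"
    ),
    repo_type="model",
)
print(ckpt_path)
```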
## 5. Run embeddings.py

Below are example invocations. Replace the checkpoint filename and HLA tag with ones that match your use case.

### Example 1: using a file you downloaded to ./models

```bash
python3 embeddings_generation.py \
  --model_path ./models/ESMCBA_epitope_0.5_20_ESMMASK_epitope_FT_15_0.0001_1e-05_AUG_6_HLAB5101_5_0.001_1e-06__3_B5101_Hubber_B5101_final.pth \
  --name B5101-ESMCBA \
  --hla B5101 \
  --encoding epitope \
  --output_dir ./outputs \
  --peptides ASCQQQRAGHS ASCQQQRAGH ASCQQQRAG DVRLSAHHHR DVRLSAHHHRM GHSDVRLSAHH
```

### Example 2: let the script pull from the Hub automatically

If `embeddings_generation.py` supports resolving from the Hub, you can pass either a file name or an `hf://` path and let the script download to cache.

```bash
cd ESMCBA/ESMCBA
python3 embeddings_generation.py \
  --model_path "ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_3_HLAB1402_2_1e-05_1e-06__1_B1402_0404_Hubber_B1402_final.pth" \
  --name B1402-ESMCBA \
  --hla B1402 \
  --encoding epitope \
  --output_dir ./outputs \
  --peptides ASCQQQRAGHS ASCQQQRAGH ASCQQQRAG DVRLSAHHHR DVRLSAHHHRM GHSDVRLSAHH
```

or

```bash
python3 embeddings_generation.py \
  --model_path "hf://smares/ESMCBA/ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_3_HLAB1402_2_1e-05_1e-06__1_B1402_0404_Hubber_B1402_final.pth" \
  --name B1402-ESMCBA \
  --hla B1402 \
  --encoding epitope \
  --output_dir ./outputs \
  --peptides ASCQQQRAGHS ASCQQQRAGH ASCQQQRAG DVRLSAHHHR DVRLSAHHHRM GHSDVRLSAHH
```
### GPU vs CPU

- By default, PyTorch will use the GPU if available.
- To force CPU, set `CUDA_VISIBLE_DEVICES=""` or modify `embeddings.py` to pass `map_location="cpu"` to `torch.load`.
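A common pattern, as a sketch, is to pick the device once and use it for both loading and inference; the checkpoint path below is a placeholder:

```python
import torch

# Prefer GPU when present, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load checkpoint tensors directly onto the chosen device.
state = torch.load("path/to/checkpoint.pth", map_location=device)  # placeholder path
```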
## 6. Minimal pattern inside embeddings.py to resolve paths

If you want the script to accept local files, simple names, or `hf://` paths:

```python
import os

from huggingface_hub import hf_hub_download

def resolve_model_path(path_or_name, default_repo="smares/ESMCBA"):
    # Case 1: an existing local file is used as-is.
    if os.path.isfile(path_or_name):
        return path_or_name
    # Case 2: hf:// URI, format: hf://<org>/<repo>/<filename>
    if path_or_name.startswith("hf://"):
        org, repo, filename = path_or_name[len("hf://"):].split("/", 2)
        return hf_hub_download(repo_id=f"{org}/{repo}", filename=filename,
                               repo_type="model")
    # Case 3: a bare filename is resolved against the default model repo.
    return hf_hub_download(repo_id=default_repo, filename=path_or_name,
                           repo_type="model")
```
Then in your loader:

```python
import torch

ckpt_path = resolve_model_path(args.model_path)
state = torch.load(ckpt_path, map_location="cpu")  # change to "cuda" as needed
```
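All three input styles then resolve to a local checkpoint path. A usage sketch, assuming the `resolve_model_path` helper above and network access to the Hub:

```python
# A bare filename is fetched from smares/ESMCBA:
p1 = resolve_model_path(
    "ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_5e-05_"
    "AUG_3_HLAB1402_2_1e-05_1e-06__1_B1402_0404_Hubber_B1402_final.pth"
)

# An hf:// URI names the repo explicitly:
p2 = resolve_model_path(
    "hf://smares/ESMCBA/ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_5e-05_"
    "AUG_3_HLAB1402_2_1e-05_1e-06__1_B1402_0404_Hubber_B1402_final.pth"
)
```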
## 7. Common tasks

List the files available in the model repo (or browse the Hub page):

```bash
git lfs ls-files | cat   # if you cloned the HF repo
```

Download all files for offline use:

```bash
hf download smares/ESMCBA --repo-type model --local-dir ./models
```

Keep outputs tidy:

```bash
mkdir -p ./outputs
```
## 8. Troubleshooting

**`huggingface-cli download` is deprecated.** Use `hf download` instead.

**Permission or quota errors when downloading.** Public models do not require login. For private models, run `hf login`.

**Slow transfers.** Install `hf_transfer` and export `HF_HUB_ENABLE_HF_TRANSFER=1`.

**File not found.** Double-check the exact filename on the Hub. Filenames are long; copy and paste them.
## 9. Models
| HLA | Model checkpoint |
|---|---|
| B5101 | ESMCBA_epitope_0.5_20_ESMMASK_epitope_FT_15_0.0001_1e-05_AUG_6_HLAB5101_5_0.001_1e-06__3_B5101_Hubber_B5101_final.pth |
| A0206 | ESMCBA_epitope_0.5_20_ESMMASK_epitope_FT_25_0.0001_1e-06_AUG_1_HLAA0206_2_0.001_1e-06__1_A0206_Hubber_A0206_final.pth |
| B3701 | ESMCBA_epitope_0.5_20_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_3_HLAB3701_1_0.0001_1e-05__1_B3701_0404_Hubber_B3701_final.pth |
| B5301 | ESMCBA_epitope_0.5_30_ESMMASK_epitope_FT_15_0.001_1e-06_AUG_6_HLAB5301_1_0.0001_1e-05__1_B5301_0404_Hubber_B5301_final.pth |
| A2402 | ESMCBA_epitope_0.5_30_ESMMASK_epitope_FT_20_0.001_1e-06_AUG_1_HLAA2402_1_0.0001_1e-06__2_A2402_0404_Hubber_A2402_final.pth |
| C0802 | ESMCBA_epitope_0.5_30_ESMMASK_epitope_FT_20_0.001_5e-05_AUG_1_HLAC0802_2_0.0001_1e-05__2_C0802_0404_Hubber_C0802_final.pth |
| A0301 | ESMCBA_epitope_0.5_30_ESMMASK_epitope_FT_25_0.001_0.001_AUG_1_HLAA0301_1_0.001_1e-06__1_A0301_Hubber_A0301_final.pth |
| B3501 | ESMCBA_epitope_0.5_30_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_6_HLAB3501_2_0.001_0.001__4_B3501_Hubber_B3501_final.pth |
| C1502 | ESMCBA_epitope_0.5_30_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_3_HLAC1502_2_0.0001_1e-06__1_C1502_0404_Hubber_C1502_final.pth |
| B4601 | ESMCBA_epitope_0.8_20_ESMMASK_epitope_FT_15_0.001_1e-06_AUG_6_HLAB4601_1_0.0001_1e-05__2_B4601_0404_Hubber_B4601_final.pth |
| C0501 | ESMCBA_epitope_0.8_20_ESMMASK_epitope_FT_15_0.001_1e-06_AUG_6_HLAC0501_2_0.0001_1e-06__2_C0501_0404_Hubber_C0501_final.pth |
| A3201 | ESMCBA_epitope_0.8_20_ESMMASK_epitope_FT_15_0.001_5e-05_AUG_1_HLAA3201_2_0.0001_1e-06__1_A3201_0404_Hubber_A3201_final.pth |
| A0205 | ESMCBA_epitope_0.8_20_ESMMASK_epitope_FT_15_0.001_5e-05_AUG_3_HLAA0205_2_0.0001_1e-06__2_A0205_0404_Hubber_A0205_final.pth |
| A3001 | ESMCBA_epitope_0.8_20_ESMMASK_epitope_FT_25_0.0001_1e-06_AUG_3_HLAA3001_4_0.0001_0.001__3_A3001_Hubber_A3001_final.pth |
| A0101 | ESMCBA_epitope_0.8_20_ESMMASK_epitope_FT_25_0.001_1e-05_AUG_6_HLAA0101_2_0.001_0.001__3_A0101_Hubber_A0101_final.pth |
| C1203 | ESMCBA_epitope_0.8_20_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_1_HLAC1203_1_0.0001_1e-05__2_C1203_0404_Hubber_C1203_final.pth |
| A0207 | ESMCBA_epitope_0.8_20_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_3_HLAA0207_1_0.0001_1e-06__2_A0207_0404_Hubber_A0207_final.pth |
| A0211 | ESMCBA_epitope_0.8_20_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_6_HLAA0211_2_0.0001_1e-06__1_A0211_0404_Hubber_A0211_final.pth |
| B5801 | ESMCBA_epitope_0.8_20_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_6_HLAB5801_2_0.0001_1e-06__2_B5801_0404_Hubber_B5801_final.pth |
| B0702 | ESMCBA_epitope_0.8_30_ESMMASK_epitope_FT_15_0.0001_0.001_AUG_6_HLAB0702_3_0.001_1e-06__4_B0702_Hubber_B0702_final.pth |
| C0701 | ESMCBA_epitope_0.8_30_ESMMASK_epitope_FT_15_0.001_5e-05_AUG_1_HLAC0701_2_0.0001_1e-05__1_C0701_0404_Hubber_C0701_final.pth |
| B3801 | ESMCBA_epitope_0.8_30_ESMMASK_epitope_FT_20_0.001_1e-06_AUG_3_HLAB3801_2_0.0001_1e-06__1_B3801_0404_Hubber_B3801_final.pth |
| C0303 | ESMCBA_epitope_0.8_30_ESMMASK_epitope_FT_20_0.001_1e-06_AUG_3_HLAC0303_1_0.0001_1e-05__2_C0303_0404_Hubber_C0303_final.pth |
| B4501 | ESMCBA_epitope_0.8_30_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_1_HLAB4501_2_0.0001_1e-05__2_B4501_0404_Hubber_B4501_final.pth |
| B4001 | ESMCBA_epitope_0.8_30_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_6_HLAB4001_1_0.0001_1e-06__2_B4001_0404_Hubber_B4001_final.pth |
| A0201 | ESMCBA_epitope_0.8_30_ESMMASK_epitope_FT_5_0.001_1e-06_AUG_6_HLAA0201_2_0.001_1e-06__2_A0201_Hubber_A0201_final.pth |
| C0602 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_15_0.001_5e-05_AUG_1_HLAC0602_2_0.0001_1e-06__1_C0602_0404_Hubber_C0602_final.pth |
| A2501 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_20_0.001_1e-06_AUG_1_HLAA2501_1_0.0001_1e-06__1_A2501_0404_Hubber_A2501_final.pth |
| B5401 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_20_0.001_5e-05_AUG_1_HLAB5401_2_0.0001_1e-06__2_B5401_0404_Hubber_B5401_final.pth |
| A1101 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_25_0.0001_1e-05_AUG_3_HLAA1101_5_0.001_1e-06__2_A1101_Hubber_A1101_final.pth |
| B1801 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_25_0.0001_1e-05_AUG_6_HLAB1801_1_0.001_1e-06__4_B1801_Hubber_B1801_final.pth |
| B1501 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_25_0.001_0.001_AUG_3_HLAB1501_2_0.001_0.001__2_B1501_Hubber_B1501_final.pth |
| A6801 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_25_0.001_1e-05_AUG_1_HLAA6801_2_0.0001_1e-06__4_A6801_Hubber_A6801_final.pth |
| B2705 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_3_HLAB2705_2_0.0001_1e-06__2_B2705_0404_Hubber_B2705_final.pth |
| C0401 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_3_HLAC0401_2_0.0001_1e-06__1_C0401_0404_Hubber_C0401_final.pth |
| B1502 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_3_HLAB1502_1_1e-05_1e-05__1_B1502_0404_Hubber_B1502_final.pth |
| A0202 | ESMCBA_epitope_0.95_20_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_6_HLAA0202_1_0.0001_1e-05__2_A0202_0404_Hubber_A0202_final.pth |
| A2601 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_15_0.0001_1e-05_AUG_1_HLAA2601_5_0.001_0.001__4_A2601_Hubber_A2601_final.pth |
| C0702 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_15_0.001_5e-05_AUG_1_HLAC0702_1_0.0001_1e-05__1_C0702_0404_Hubber_C0702_final.pth |
| A3301 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_20_0.001_0.001_AUG_1_HLAA3301_5_0.001_1e-06__4_A3301_Hubber_A3301_final.pth |
| B0801 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_20_0.001_1e-06_AUG_1_HLAB0801_1_0.0001_1e-06__1_B0801_0404_Hubber_B0801_final.pth |
| B1517 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_20_0.001_5e-05_AUG_3_HLAB1517_1_0.0001_1e-05__2_B1517_0404_Hubber_B1517_final.pth |
| A0203 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_0.001_AUG_6_HLAA0203_2_0.001_0.001__2_A0203_Hubber_A0203_final.pth |
| B5701 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_1e-05_AUG_1_HLAB5701_2_0.0001_1e-05__1_B5701_Hubber_B5701_final.pth |
| B4402 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_1e-05_AUG_3_HLAB4402_1_0.001_0.001__2_B4402_Hubber_B4402_final.pth |
| A6802 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_6_HLAA6802_2_0.001_1e-06__4_A6802_Hubber_A6802_final.pth |
| B4403 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_3_HLAB4403_1_0.0001_1e-06__1_B4403_0404_Hubber_B4403_final.pth |
| C1402 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_3_HLAC1402_1_0.0001_1e-06__1_C1402_0404_Hubber_C1402_final.pth |
| B4002 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_1e-06_AUG_6_HLAB4002_2_0.0001_1e-05__1_B4002_0404_Hubber_B4002_final.pth |
| A3101 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_3_HLAA3101_2_0.0001_1e-06__2_A3101_0404_Hubber_A3101_final.pth |
| B1402 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_3_HLAB1402_2_1e-05_1e-06__1_B1402_0404_Hubber_B1402_final.pth |
| B1503 | ESMCBA_epitope_0.95_30_ESMMASK_epitope_FT_25_0.001_5e-05_AUG_6_HLAB1503_2_0.0001_1e-05__2_B1503_0404_Hubber_B1503_final.pth |
## 10. Repro tip for papers and reviews

Record the exact commit of the code and the model snapshot. Example:

```
Code commit: <git SHA from ESMCBA repo>
Model snapshot: <commit SHA shown in HF snapshots path>
HLA: B5101
Encoding: epitope
```
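You can capture both identifiers programmatically. A sketch using `huggingface_hub.HfApi` for the model snapshot and `git` for the code commit, run from inside your ESMCBA checkout:

```python
import subprocess

from huggingface_hub import HfApi

# Commit SHA of the local code checkout.
code_sha = subprocess.check_output(
    ["git", "rev-parse", "HEAD"], text=True
).strip()

# Latest commit SHA of the model repo on the Hub.
model_sha = HfApi().model_info("smares/ESMCBA").sha

print("Code commit:", code_sha)
print("Model snapshot:", model_sha)
```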
## 11. License and citation

Follow the license in the GitHub repo for the code and the model card in the Hub repo for the weights. If you use ESMCBA in research, please cite the associated manuscript or submission.
Base model: EvolutionaryScale/esmc-300m-2024-12