---
license: mit
language:
- en
---
# 🪜 LADDER: Language-Driven Slice Discovery and Error Rectification in Vision Classifiers
[![Project](https://img.shields.io/badge/Project-%23dfb317)](https://shantanu-ai.github.io/projects/ACL-2025-Ladder/index.html)
[![Paper](https://img.shields.io/badge/Paper-ACL%202025-%23dfb317)](https://aclanthology.org/2025.findings-acl.1177/)
[![Code](https://img.shields.io/badge/GitHub-batmanlab%2FLADDER-%2312100e)](https://github.com/batmanlab/Ladder)
[![Model](https://img.shields.io/badge/HuggingFace-Pretrained--Checkpoints-blue)](https://huggingface.co/shawn24/Ladder/tree/main)
---
## 📌 Summary
**LADDER** is a general framework that enables vision classifiers to automatically discover subpopulations (or "slices") of data where the model underperforms, without requiring group annotations. It leverages **vision-language representations** and the **reasoning capabilities of large language models (LLMs)** to detect and rectify bias-inducing features in both natural and medical imaging domains.
---
## 🧠 Architecture & Components
- 🔍 **Slice Discovery** using:
  - CLIP, Mammo-CLIP, and CXR-CLIP features
  - BLIP- and GPT-4o-generated captions
- 🧠 **Hypothesis Generation** using:
  - GPT-4o, Claude, Gemini, LLaMA
- ✅ **Bias Mitigation** via reweighting & pseudo-labeling
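The reweighting idea behind the mitigation step can be sketched as a small stand-alone example: measure per-slice error rates, then upweight samples in slices where the classifier errs more than average. The slice labels, threshold, and `boost` factor below are illustrative assumptions, not the exact recipe from the paper.

```python
from collections import defaultdict

def slice_error_rates(y_true, y_pred, slices):
    """Error rate per discovered slice, from paired labels/predictions."""
    errs, counts = defaultdict(int), defaultdict(int)
    for t, p, s in zip(y_true, y_pred, slices):
        counts[s] += 1
        errs[s] += int(t != p)
    return {s: errs[s] / counts[s] for s in counts}

def reweight(slices, rates, boost=2.0):
    """Upweight samples in slices whose error rate exceeds the mean.
    `boost=2.0` is an illustrative choice, not a tuned value."""
    avg = sum(rates.values()) / len(rates)
    return [boost if rates[s] > avg else 1.0 for s in slices]

# Toy example: slice "b" (e.g. a spurious-attribute group) errs more often.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0]
slc = ["a", "b", "a", "b", "b", "a"]

rates = slice_error_rates(y_true, y_pred, slc)
weights = reweight(slc, rates)  # feed into a weighted training loss
```

The resulting per-sample weights would plug into a weighted loss during retraining; pseudo-labeling would instead use the discovered slices to assign group labels for a group-robust objective.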
---
## 📊 Datasets Used
- **Natural Images**: Waterbirds, CelebA, MetaShift
- **Medical Images**: NIH ChestX-ray, RSNA Mammograms, VinDr Mammograms
---
## 📦 Files Included
| File | Description |
|------|-------------|
| `model.pt` | Pretrained model checkpoint |
| `feature_cache.pkl` | Cached representations (CLIP/Mammo-CLIP/CXR-CLIP) |
| `metadata.csv` | Metadata with discovered slice labels |
| `caption_blip.json` | BLIP-generated captions |
| `caption_gpt4o.json` | GPT-4o-generated captions |
| `predictions.json` | Model predictions on test set |
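A minimal sketch of consuming these artifacts together, joining slice labels from `metadata.csv` with entries from `predictions.json`. The column names and JSON layout assumed here are illustrative; inspect the actual files for the exact schema. The demo writes tiny stand-in files to a temporary directory so it runs without the checkpoint repo.

```python
import json
import tempfile
from pathlib import Path

# Assumed (illustrative) schemas:
#   metadata.csv     -> header "image_id,slice_label", one row per image
#   predictions.json -> {"<image_id>": <predicted_label>, ...}
def load_slices_and_preds(root):
    root = Path(root)
    meta = {}
    rows = root.joinpath("metadata.csv").read_text().splitlines()
    for row in rows[1:]:  # skip header row
        image_id, slice_label = row.split(",")
        meta[image_id] = slice_label
    preds = json.loads(root.joinpath("predictions.json").read_text())
    return meta, preds

# Demo against stand-in files in a temp directory.
with tempfile.TemporaryDirectory() as d:
    Path(d, "metadata.csv").write_text("image_id,slice_label\nimg0,waterbird_on_land\n")
    Path(d, "predictions.json").write_text('{"img0": 1}')
    meta, preds = load_slices_and_preds(d)
```

With the real files, the same join lets you group predictions by discovered slice and audit per-slice accuracy.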
---
## 🧪 Benchmarks
LADDER outperforms traditional slice-discovery methods (Domino, FACTS) across six datasets and more than 200 classifiers. It is especially effective at:
- Discovering hidden biases without explicit attribute labels
- Reasoning about non-visual factors (e.g., preprocessing artifacts)
- Operating without human-written captions
---
## 📜 Citation
```bibtex
@article{ghosh2024ladder,
  title={LADDER: Language Driven Slice Discovery and Error Rectification},
  author={Ghosh, Shantanu and Syed, Rayan and Wang, Chenyu and Poynton, Clare B and Visweswaran, Shyam and Batmanghelich, Kayhan},
  journal={arXiv preprint arXiv:2408.07832},
  year={2024}
}
```
---
## 🤝 Acknowledgements
Boston University, Stanford University, BUMC, and the University of Pittsburgh.