A lightweight hallucination detector that uses sparse autoencoders to identify, explain, and mitigate unfaithful RAG outputs
Guangzhi Xiong
gzxiong
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution
upvoted a collection about 1 month ago
RAGLens
updated a model about 1 month ago
gzxiong/sae-qwen3-4b