Or Shafran

ordavids1

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a collection 2 days ago

🔍 Interpretability & Analysis of LMs

Outstanding research in LM interpretability and evaluation, summarized • 123 items • Updated 7 days ago • 110

commented on a paper about 2 months ago

Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization

Paper • 2506.10920 • Published Jun 12 • 6

authored a paper about 2 months ago

Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization

Paper • 2506.10920 • Published Jun 12 • 6