Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates • Paper • 2410.07137 • Published Oct 9, 2024 • 8
RegMix: Data Mixture as Regression for Language Model Pre-training • Paper • 2407.01492 • Published Jul 1, 2024 • 41
Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses • Paper • 2406.01288 • Published Jun 3, 2024 • 1
Intriguing Properties of Data Attribution on Diffusion Models • Paper • 2311.00500 • Published Nov 1, 2023 • 2
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast • Paper • 2402.08567 • Published Feb 13, 2024 • 2