Steven Basart

xksteven
http://stevenbas.art

AI & ML interests

None yet

Organizations

  • Center for AI Safety Backup Account
  • BigCode
  • Center for AI Safety
  • July AI
  • Hugging Face Skills

upvoted a paper 8 months ago

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

Paper • 2402.04249 • Published Feb 6, 2024 • 6
upvoted 2 papers over 1 year ago

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 103

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark

Paper • 2304.03279 • Published Apr 6, 2023 • 2