AI & ML interests

Security, privacy, and trustworthiness of machine learning systems.

ethz-spylab's collections

RLHF Trojan Competition
Datasets and models used for the trojan detection competition co-located with SaTML 2024: https://github.com/ethz-spylab/rlhf_trojan_competition