title: DeepSeek-R1 Censorship Steering
emoji: 🐳
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.24.0
app_file: app.py
pinned: false

This is a demo for the paper "Steering the CensorShip: Uncovering Representation Vectors for LLM 'Thought' Control" (arXiv:2504.17130).

@article{cyberey2025steering,
    title={Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control}, 
    author={Hannah Cyberey and David Evans},
    year={2025},
    eprint={2504.17130},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2504.17130}, 
}