Hugging Face

Wenhao Zhan

whzhan
https://whzhan99.github.io/

AI & ML interests

Reinforcement Learning

Organizations

None yet

authored a paper about 1 year ago

Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

Paper • 2410.04612 • Published Oct 6, 2024

authored 5 papers over 1 year ago

Provably Efficient CVaR RL in Low-rank MDPs

Paper • 2311.11965 • Published Nov 20, 2023

REBEL: Reinforcement Learning via Regressing Relative Rewards

Paper • 2404.16767 • Published Apr 25, 2024 • 2

Provable Offline Preference-Based Reinforcement Learning

Paper • 2305.14816 • Published May 24, 2023

Provable Reward-Agnostic Preference-Based Reinforcement Learning

Paper • 2305.18505 • Published May 29, 2023

Dataset Reset Policy Optimization for RLHF

Paper • 2404.08495 • Published Apr 12, 2024 • 9