
chengzhi

lczazu

AI & ML interests

None yet

Recent Activity

upvoted an article about 1 month ago
Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment
upvoted a paper 7 months ago
Conditional Quantile Estimation for Uncertain Watch Time in Short-Video Recommendation
new activity 9 months ago
unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit:Error when load model

Organizations

None yet

Collections 1

data
  • Qwen/Qwen2.5-0.5B-Instruct

    Text Generation • 0.5B • Updated Sep 25, 2024 • 1.56M • 392

Papers 1

arxiv:2407.12223

Models 1

lczazu/ppo-LunarLander-v2

Reinforcement Learning • Updated Apr 25, 2023 • 6

Datasets 0

None public yet