chengzhi
lczazu
AI & ML interests
None yet
Recent Activity
upvoted an article about 1 month ago
Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment
upvoted a paper 7 months ago
Conditional Quantile Estimation for Uncertain Watch Time in Short-Video Recommendation
new activity 9 months ago
unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit:Error when load model
Organizations
None yet