MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning Paper • 2507.16812 • Published 8 days ago • 48
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? Paper • 2507.12415 • Published 15 days ago • 40
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 64
Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training Paper • 2506.10952 • Published Jun 12 • 23
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond Paper • 2505.19641 • Published May 26 • 67
Why Distillation can Outperform Zero-RL: The Role of Flexible Reasoning Paper • 2505.21067 • Published May 27 • 3
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22, 2024 • 48
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17 • 92
Qwen2.5-Math Collection Math-specific model series based on Qwen2.5 • 11 items • Updated 10 days ago • 83
OLMo 2 Preview Post-trained Models Collection These models' tokenizers did not use HF's fast tokenizer, resulting in variations in how pre-tokenization was applied. Resolved in the latest versions. • 6 items • Updated Apr 30 • 4
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 241