SihengLi

Siheng99

SihengLi99

AI & ML interests

Artificial Intelligence

Recent Activity

upvoted a paper 25 days ago

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

authored a paper about 1 month ago

Reinforcement Learning on Pre-Training Data

upvoted a paper about 1 month ago

Reinforcement Learning on Pre-Training Data

Organizations

Collections 2

Papers 7

models 9

datasets 3

Siheng99/Qwen2.5-14B-Instruct-SEALONG-Dataset

Viewer • Updated Nov 10, 2024 • 2.05k • 20 • 1

Siheng99/Qwen2.5-7B-Instruct-SEALONG-Dataset

Viewer • Updated Nov 10, 2024 • 2.05k • 37 • 1

Siheng99/Llama-3.1-8B-Instruct-SEALONG-Dataset

Viewer • Updated Nov 10, 2024 • 2.05k • 7 • 1

Collections 2

Siheng99/Qwen2.5-Math-1.5B-DeepMath-1024samples-GRPO

Siheng99/Qwen2.5-Math-1.5B-DeepMath-1024samples-RePO

Siheng99/Qwen2.5-Math-7B-DeepMath-1024samples-GRPO

Siheng99/Qwen2.5-Math-7B-DeepMath-1024samples-RePO

Large Language Models Can Self-Improve in Long-context Reasoning

Siheng99/Llama-3.1-8B-Instruct-SEALONG

Siheng99/Qwen2.5-7B-Instruct-SEALONG

Siheng99/Qwen2.5-14B-Instruct-SEALONG

models 9

Siheng99/Qwen3-1.7B-DeepMath-1024samples-RePO

Siheng99/Qwen3-1.7B-DeepMath-1024samples-GRPO

Siheng99/Qwen2.5-Math-7B-DeepMath-1024samples-RePO

Siheng99/Qwen2.5-Math-7B-DeepMath-1024samples-GRPO

Siheng99/Qwen2.5-Math-1.5B-DeepMath-1024samples-RePO

Siheng99/Qwen2.5-Math-1.5B-DeepMath-1024samples-GRPO

Siheng99/Qwen2.5-14B-Instruct-SEALONG

Siheng99/Qwen2.5-7B-Instruct-SEALONG

Siheng99/Llama-3.1-8B-Instruct-SEALONG
