DeyangKong

AI & ML interests

Natural Language Processing

Recent Activity

authored a paper 6 days ago

LongCat-Flash Technical Report

authored a paper 6 days ago

Autoformalizer with Tool Feedback

authored a paper 6 days ago

OPE: Overcoming Information Saturation in Parallel Thinking via Outline-Guided Path Exploration

upvoted 2 papers 6 days ago

LLaDA2.1: Speeding Up Text Diffusion via Token Editing

Paper • 2602.08676 • Published 8 days ago • 65

OPE: Overcoming Information Saturation in Parallel Thinking via Outline-Guided Path Exploration

Paper • 2602.08344 • Published 8 days ago • 5

upvoted a paper 22 days ago

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published 25 days ago • 175

upvoted a paper about 1 month ago

ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback

Paper • 2601.10156 • Published Jan 15 • 26

upvoted a paper about 2 months ago

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published Dec 22, 2025 • 64

upvoted a paper 2 months ago

DEER: Draft with Diffusion, Verify with Autoregressive Models

Paper • 2512.15176 • Published Dec 17, 2025 • 44

upvoted a paper 8 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 263

upvoted 2 papers 9 months ago

Skywork Open Reasoner 1 Technical Report

Paper • 2505.22312 • Published May 28, 2025 • 54

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

Paper • 2505.17652 • Published May 23, 2025 • 6

upvoted a paper 10 months ago

Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

Paper • 2504.05520 • Published Apr 7, 2025 • 11

upvoted 3 papers 11 months ago

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

Paper • 2503.18892 • Published Mar 24, 2025 • 31

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Paper • 2503.16419 • Published Mar 20, 2025 • 77

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 144

upvoted 2 papers 12 months ago

SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity

Paper • 2503.01506 • Published Mar 3, 2025 • 10

LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!

Paper • 2502.07374 • Published Feb 11, 2025 • 40

upvoted a paper over 1 year ago

Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Paper • 2408.15518 • Published Aug 28, 2024 • 42

upvoted a collection over 1 year ago

Code Evaluation

Collection

Collection of Papers on Code Evaluation (from code generation language models) • 45 items • Updated Oct 29, 2024 • 16

upvoted 3 papers over 1 year ago

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 117

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15, 2024 • 168

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Paper • 2406.07522 • Published Jun 11, 2024 • 40
