
Soumye Singhal

soumye


AI & ML interests

LLM Post-training

Recent Activity

new activity about 10 hours ago

nvidia/Llama-3_3-Nemotron-Super-49B-v1_5: Tool calling no stream

liked a model 4 days ago

nvidia/Llama-3_3-Nemotron-Super-49B-v1_5

upvoted an article 11 days ago

OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models



upvoted an article 11 days ago

Article

OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models

By [author name missing] and 3 others • 11 days ago • 47

upvoted a collection 27 days ago

Reward Models

Nemotron reward models. For use in RLHF pipelines and LLM-as-a-Judge • 8 items • Updated 8 days ago • 16

upvoted an article about 1 month ago

Article

Supercharge Edge AI with High Accuracy Reasoning Using Llama Nemotron Nano 4B

By [author name missing] and 3 others • Jun 10 • 7

upvoted a paper 3 months ago

AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset

Paper • 2504.16891 • Published Apr 23 • 24

upvoted 3 collections 3 months ago

OpenMathReasoning

Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated 8 days ago • 42

RL+reason model

212 items • Updated 2 days ago • 14

Fav-papers

32 items • Updated 8 days ago • 3

upvoted 3 papers 3 months ago

Countering Language Drift with Seeded Iterated Learning

Paper • 2003.12694 • Published Mar 28, 2020 • 1

Llama-Nemotron: Efficient Reasoning Models

Paper • 2505.00949 • Published May 2 • 40

Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment

Paper • 2502.00203 • Published Jan 31 • 2

upvoted a paper 4 months ago

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Paper • 2504.03624 • Published Apr 4 • 13

upvoted 3 collections 4 months ago

Nemotron-H

Mamba-Transformer hybrid models • 10 items • Updated 8 days ago • 29

Minitron

A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 8 days ago • 61

Llama Nemotron

Open, Production-ready Enterprise Models • 9 items • Updated 4 days ago • 62

upvoted a paper about 1 year ago

NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment

Paper • 2405.01481 • Published May 2, 2024 • 31