Matthieu Zimmer

MatthieuZ

AI & ML interests

None yet

Recent Activity

authored a paper about 1 month ago

Tree-OPO: Off-policy Monte Carlo Tree-Guided Advantage Optimization for Multistep Reasoning

upvoted a paper about 1 month ago

Tree-OPO: Off-policy Monte Carlo Tree-Guided Advantage Optimization for Multistep Reasoning

authored a paper about 1 month ago

Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning

Organizations

authored a paper about 1 month ago

Tree-OPO: Off-policy Monte Carlo Tree-Guided Advantage Optimization for Multistep Reasoning

Paper • 2509.09284 • Published Sep 11 • 2

upvoted a paper about 1 month ago

Tree-OPO: Off-policy Monte Carlo Tree-Guided Advantage Optimization for Multistep Reasoning

Paper • 2509.09284 • Published Sep 11 • 2

authored 6 papers about 1 month ago

Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning

Paper • 2312.14878 • Published Dec 22, 2023 • 15

ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

Paper • 2406.19741 • Published Jun 28, 2024 • 62

Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective

Paper • 2509.22921 • Published Sep 26 • 11

upvoted a paper about 1 month ago

Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective

Paper • 2509.22921 • Published Sep 26 • 11

commented on a paper about 1 month ago

Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective

Paper • 2509.22921 • Published Sep 26 • 11

upvoted an article 3 months ago

Bridging the Gap: Making Robotics Feel Like Machine Learning

Article • Aug 12 • 12

upvoted a paper 3 months ago

Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory

Paper • 2507.16713 • Published Jul 22 • 21

upvoted an article 4 months ago

Bourbaki (7b): SOTA 7B Algorithms for Putnam Bench (Part I: Reasoning MDPs)

Article • and 2 others • Jul 13 • 11

published an article 4 months ago

Bourbaki (7b): SOTA 7B Algorithms for Putnam Bench (Part I: Reasoning MDPs)

Article • and 2 others • Jul 13 • 11

upvoted a paper 4 months ago

Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving

Paper • 2507.02726 • Published Jul 3 • 14

upvoted a paper 9 months ago

Almost Surely Safe Alignment of Large Language Models at Inference-Time

Paper • 2502.01208 • Published Feb 3 • 11

updated a model 10 months ago

huawei-noah/MOASpec-Llama-3-8B-Instruct

Updated Jan 7 • 2 • 5

upvoted an article 10 months ago

Accelerating Language Model Inference with Mixture of Attentions

Article • and 1 other • Jan 7 • 24

published an article 10 months ago

Accelerating Language Model Inference with Mixture of Attentions

Article • and 1 other • Jan 7 • 24

liked a model 10 months ago

huawei-noah/MOASpec-Llama-3-8B-Instruct

Updated Jan 7 • 2 • 5

Mixture of Attentions For Speculative Decoding