Haitham Bou Ammar

hba123

AI & ML interests

LLMs, VLMs, Robotics, Reinforcement Learning, Bayesian Optimisation

Recent Activity

commented on a paper 2 days ago

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

reacted to their post with 🔥 4 days ago

Hey, amazing, awesome people of the beautiful internet 😍🥰 Distillation has been (from my point of view) a main driving factor behind the success of #LLMs: think of distilling the knowledge of an amazing big model (say #DeepSeekv3 or #GeminiAI) into yours. You have probably done it by minimising a KL divergence, and it somehow worked. Well, not that well, right? 1️⃣ Your model tends to memorise! 2️⃣ Your model might get the right answer, but its reasoning might be flawed. To fix those problems, we rethink distillation and propose a new approach: a method based on constrained RL that comes with nice theoretical guarantees and excellent performance! Check it out: https://huggingface.co/papers/2509.22921 Let us do distillation right! Please upvote if you find it useful!

reacted to their post with 🚀 4 days ago

Organizations

None yet

hba123's datasets

None public yet