Haitham Bou Ammar PRO
hba123
AI & ML interests
LLMs, VLMs, Robotics, Reinforcement Learning, Bayesian Optimisation
Recent Activity
commented on a paper 2 days ago
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
reacted to their post with 🔥 4 days ago
Hey, amazing, awesome people of the beautiful internet 🥰
Distillation has been (from my point of view) a main driving factor for the success of #LLMs: think of distilling the knowledge of an amazing big model (say #DeepSeekv3 or #GeminiAI) into yours.
Probably, you have done it by minimising a KL divergence, and it somehow worked.
Well, not that well, right?
1️⃣ Your model tends to memorise!
2️⃣ Your model might get the right answer, but its reasoning might be flawed.
To fix those problems, we rethink distillation and propose a new approach! A method based on constrained RL that comes with nice theoretical guarantees and excellent performance!
Check it out: https://huggingface.co/papers/2509.22921
Let us do distillation right! Please upvote if you find it useful!
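For readers who have not seen it written down, the standard KL-based distillation mentioned above can be sketched roughly as follows. This is a minimal illustration and not the constrained-RL method from the linked paper: it assumes a forward KL(teacher || student) computed on the logits of a single next-token distribution, and the function names and the `temperature` parameter are hypothetical choices made for this example.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Return log-probabilities for a list of raw logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    log_total = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_total for x in scaled]


def kl_distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Forward KL(teacher || student) for one next-token distribution.

    Assumption: both models score the same vocabulary in the same order.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Hypothetical logits over a three-token vocabulary.
teacher = [2.0, 0.0, -1.0]
student = [0.0, 0.0, 0.0]
loss = kl_distillation_loss(teacher, student)
```

The loss is zero when the student matches the teacher exactly and grows as the two distributions drift apart. In practice it would be averaged over all token positions in a batch and minimised with respect to the student's parameters.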
reacted to the same post 4 days ago
Organizations
None yet