Recent Activity

qgallouedec  updated a model 20 minutes ago
trl-lib/Qwen3-4B-LoRA
qgallouedec  published a model 28 minutes ago
trl-lib/Qwen3-4B-LoRA
qgallouedec  updated a dataset 5 days ago
trl-lib/documentation-images

sergiopaniego 
posted an update 4 days ago
Yet Another New Multimodal Fine-Tuning Recipe 🥧

🧑‍🍳 In this @HuggingFace Cookbook notebook, we demonstrate how to align a multimodal model (VLM) with Mixed Preference Optimization (MPO) using trl.

💡 This recipe is powered by the new MPO support in trl, enabled through a recent upgrade to the DPO trainer!

We align the multimodal model using multiple optimization objectives (losses), guided by a preference dataset (chosen vs. rejected multimodal pairs).

Check it out! ➡️ https://huggingface.co/learn/cookbook/fine_tuning_vlm_mpo
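
To make the "multiple optimization objectives" idea concrete, here is a minimal sketch of how a mixed preference loss could be combined for a single chosen/rejected pair. It is an illustration only, not TRL's implementation: the function name `mpo_loss`, the weights, and the `beta` and `delta` values are assumptions, and the inputs are assumed to be precomputed log-probabilities (policy minus reference log-ratios for the chosen and rejected answers, plus the policy log-probability of the chosen answer).

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def mpo_loss(
    chosen_logratio: float,    # log pi(chosen) - log pi_ref(chosen)
    rejected_logratio: float,  # log pi(rejected) - log pi_ref(rejected)
    chosen_logp: float,        # log pi(chosen), summed over tokens
    num_chosen_tokens: int,
    beta: float = 0.1,         # assumed temperature
    delta: float = 0.0,        # assumed reward shift for the quality term
    weights: tuple = (0.8, 0.2, 1.0),  # illustrative weights, not a recommendation
) -> float:
    # Preference term: DPO-style sigmoid loss on the chosen/rejected margin.
    preference = -log_sigmoid(beta * (chosen_logratio - rejected_logratio))
    # Quality term: BCO-style binary loss scoring each answer on its own.
    quality = (
        -log_sigmoid(beta * chosen_logratio - delta)
        - log_sigmoid(-(beta * rejected_logratio - delta))
    )
    # Generation term: length-normalized negative log-likelihood of the chosen answer.
    generation = -chosen_logp / num_chosen_tokens
    w_pref, w_qual, w_gen = weights
    return w_pref * preference + w_qual * quality + w_gen * generation
```

With these assumptions, a policy that raises the chosen log-ratio and lowers the rejected one gets a smaller loss, while the generation term keeps the chosen answer likely under the policy.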
sergiopaniego 
posted an update 10 days ago
🧑‍🍳 New Multimodal Fine-Tuning Recipe 🧑‍🍳

⚡️ In this new @huggingface Cookbook recipe, I walk you through the process of fine-tuning a Visual Language Model (VLM) for Object Detection with Visual Grounding, using TRL.

🔍 Object detection typically involves detecting categories in images (e.g., vase).

By combining it with visual grounding, we add contextual understanding, so instead of detecting just "vase", we can detect "middle vase" in an image.

VLMs are super powerful!

In this case, I use PaliGemma 2, which already supports object detection, and extend it to add visual grounding.

🤗 Check it out here: https://huggingface.co/learn/cookbook/fine_tuning_vlm_object_detection_grounding
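
As a rough illustration of what a grounded detection looks like downstream, the sketch below parses PaliGemma-style detection text into pixel boxes. It assumes the model emits four `<locNNNN>` tokens per object in the order y_min, x_min, y_max, x_max, binned to 1024 steps, followed by the label, with objects separated by `;`. The helper name `parse_detections` and the sample strings are hypothetical, and the recipe itself may post-process outputs differently.

```python
import re

# Four <locNNNN> tokens followed by a free-text label (assumed output format).
_DETECTION = re.compile(r"((?:<loc\d{4}>){4})\s*([^;<]+)")
_LOC = re.compile(r"<loc(\d{4})>")


def parse_detections(text: str, width: int, height: int, bins: int = 1024):
    """Return a list of {"label", "box"} dicts with boxes as (x0, y0, x1, y1) pixels."""
    detections = []
    for locs, label in _DETECTION.findall(text):
        # Assumed token order: y_min, x_min, y_max, x_max, each in [0, bins).
        y0, x0, y1, x1 = (int(v) / bins for v in _LOC.findall(locs))
        detections.append(
            {
                "label": label.strip(),
                "box": (x0 * width, y0 * height, x1 * width, y1 * height),
            }
        )
    return detections
```

For example, `parse_detections("<loc0256><loc0512><loc0768><loc0768> middle vase", 1024, 512)` yields one entry whose label is the grounded phrase "middle vase" rather than the bare category.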
sergiopaniego 
posted an update 10 days ago
Multiple NEW notebooks and scripts added to the Hugging Face Gemma recipes repo!

Thanks to the community 🫶, we're adding more and more recipes using Gemma 💎

Fine-tuning for all modalities, function calling, RAG...

Repo: https://github.com/huggingface/huggingface-gemma-recipes

We're also open to new ideas from the community 🤗!
sergiopaniego 
posted an update 13 days ago
You can already play with two of the latest, most impressive models on HF using @novita-ai as the Inference Provider 🚨

🌌 Kimi K2: 1T params model, MoE beast for coding, reasoning and agentic tasks
🔮 GLM-4.1V-9B-Thinking: VLM + deep reasoning model

Kimi K2: moonshotai/Kimi-K2-Instruct
GLM-4.1V-9B-Thinking: THUDM/GLM-4.1V-9B-Thinking
sergiopaniego 
posted an update 19 days ago
Test SmolLM3, the newest fully open model released by @HuggingFaceTB !

It's smol (3B), multilingual (6 languages), comes with dual-mode reasoning (think/no_think modes) and supports long context (128k).

Try it now in the notebook below!! ⬇️

Colab notebook: https://colab.research.google.com/github/sergiopaniego/samples/blob/main/smollm3_3b_inference.ipynb
Notebook: https://github.com/sergiopaniego/samples/blob/main/smollm3_3b_inference.ipynb
Blog: https://huggingface.co/blog/smollm3
sergiopaniego 
posted an update 25 days ago
Updated my HF Space for vibe testing smol VLMs on object detection, visual grounding, keypoint detection & counting! 👓

🆕 Compare Qwen2.5 VL 3B vs Moondream 2B side-by-side with annotated images & text outputs.

Try examples or test your own images! 🏃

📱Space: sergiopaniego/vlm_object_understanding
sergiopaniego 
posted an update 28 days ago
📣 CALL FOR CONTRIBUTORS! 📣

Following last week’s full release of Gemma 3n, we launched a dedicated recipes repo to explore and share use cases. We already added some! 🧑‍🍳

Now we’re inviting the community to contribute and showcase how these models shine! ✨

Let them cook.

Check it out: https://github.com/huggingface/huggingface-gemma-recipes/issues/4
sergiopaniego 
posted an update about 1 month ago