TRL

https://github.com/huggingface/trl

Recent Activity

qgallouedec

updated a model 20 minutes ago

trl-lib/Qwen3-4B-LoRA

Updated 20 minutes ago

qgallouedec

published a model 28 minutes ago

trl-lib/Qwen3-4B-LoRA

Updated 20 minutes ago

sergiopaniego

posted an update 4 days ago

Yet Another New Multimodal Fine-Tuning Recipe 🥧

🧑‍🍳 In this @HuggingFace Cookbook notebook, we demonstrate how to align a multimodal model (VLM) with Mixed Preference Optimization (MPO) using trl.

💡 This recipe is powered by the new MPO support in trl, enabled through a recent upgrade to the DPO trainer!

We align the multimodal model using multiple optimization objectives (losses), guided by a preference dataset (chosen vs. rejected multimodal pairs).

Check it out! ➡️ https://huggingface.co/learn/cookbook/fine_tuning_vlm_mpo
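
To make the "multiple optimization objectives" idea concrete, here is a minimal sketch of a weighted multi-loss objective for one preference pair. It is not trl's implementation: it assumes the three terms are a sigmoid (DPO-style) preference loss, a BCO-style quality loss, and an SFT-style generation loss, and the function name `mpo_loss`, the weights, `beta`, and `delta` are illustrative placeholders rather than values taken from the recipe.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def mpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected,
             chosen_len, beta=0.1, weights=(0.8, 0.2, 1.0), delta=0.0):
    """Weighted sum of three losses for one (chosen, rejected) pair.

    Inputs are summed sequence log-probabilities under the policy and a
    frozen reference model. All hyperparameters here are assumptions.
    """
    # Implicit rewards: scaled log-ratio between policy and reference.
    chosen_reward = beta * (policy_chosen - ref_chosen)
    rejected_reward = beta * (policy_rejected - ref_rejected)

    # Preference term: push the chosen reward above the rejected one.
    preference = -log_sigmoid(chosen_reward - rejected_reward)
    # Quality term: judge each response on its own against a threshold.
    quality = (-log_sigmoid(chosen_reward - delta)
               - log_sigmoid(-(rejected_reward - delta)))
    # Generation term: average negative log-likelihood of the chosen answer.
    generation = -policy_chosen / chosen_len

    w_pref, w_qual, w_gen = weights
    return w_pref * preference + w_qual * quality + w_gen * generation


if __name__ == "__main__":
    # Policy equals reference, so both implicit rewards are zero.
    print(mpo_loss(-10.0, -12.0, -10.0, -12.0, chosen_len=5))
```

Raising the policy's log-probability of the chosen answer lowers all three terms at once, which is the intuition behind mixing them.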

qgallouedec

updated a dataset 5 days ago

trl-lib/documentation-images

Viewer • Updated 5 days ago • 9 • 97.3k

qgallouedec

updated a Space 5 days ago

Trackio

🚀

Visualize project metrics with Trackio Dashboard

qgallouedec

published a Space 5 days ago

Trackio

🚀

Visualize project metrics with Trackio Dashboard

sergiopaniego

posted an update 10 days ago

🧑‍🍳 New Multimodal Fine-Tuning Recipe 🧑‍🍳

⚡️ In this new @huggingface Cookbook recipe, I walk you through the process of fine-tuning a Visual Language Model (VLM) for Object Detection with Visual Grounding, using TRL.

🔍 Object detection typically involves detecting categories in images (e.g., vase).

By combining it with visual grounding, we add contextual understanding so instead of detecting just "vase", we can detect "middle vase" in an image.

VLMs are super powerful!

In this case, I use PaliGemma 2, which already supports object detection, and extend it to add visual grounding.

🤗 Check it out here: https://huggingface.co/learn/cookbook/fine_tuning_vlm_object_detection_grounding
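
As a rough illustration of what a grounded detection looks like once decoded, the sketch below parses location tokens into pixel boxes. It assumes, and the post does not state this, that the model emits four `<locNNNN>` tokens per object in `y_min, x_min, y_max, x_max` order on a 1024-bin grid, followed by the label, with `;` between objects. The helper name `parse_detections` is hypothetical.

```python
import re

# Four consecutive <locNNNN> tokens followed by a free-text label.
LOC_PATTERN = re.compile(r"((?:<loc\d{4}>){4})\s*([^;<]+)")


def parse_detections(text: str, width: int, height: int) -> list[dict]:
    """Convert '<loc....>' detection strings into pixel-space boxes.

    Assumed token order: y_min, x_min, y_max, x_max, each in [0, 1023].
    Returned boxes are (x_min, y_min, x_max, y_max) in pixels.
    """
    detections = []
    for locs, label in LOC_PATTERN.findall(text):
        y_min, x_min, y_max, x_max = (
            int(value) / 1024 for value in re.findall(r"<loc(\d{4})>", locs)
        )
        detections.append({
            "label": label.strip(),
            "box": (x_min * width, y_min * height,
                    x_max * width, y_max * height),
        })
    return detections


if __name__ == "__main__":
    sample = "<loc0256><loc0512><loc0768><loc1023> middle vase"
    print(parse_detections(sample, width=1024, height=512))
```

With visual grounding, the label carries the referring phrase ("middle vase") rather than only the category ("vase"), so the same parsing step applies to both cases.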

sergiopaniego

posted an update 10 days ago

Multiple NEW notebooks and scripts added to the Hugging Face Gemma recipes repo!

Thanks to the community 🫶, we're adding more and more recipes using Gemma 💎

Fine tuning for all modalities, function calling, RAG...

Repo: https://github.com/huggingface/huggingface-gemma-recipes

We're also open to new ideas from the community 🤗!

sergiopaniego

posted an update 13 days ago

Loved this paper! ♥️

Authors benchmark multimodal models on vision tasks (detection, segmentation...) using clever prompting tricks.

📄 Results: VLMs are solid generalists but still lag behind SOTA task-specific models — especially on geometric tasks vs. semantic ones.

paper: How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks (2507.01955)

sergiopaniego

posted an update 13 days ago

You can already play with two of the latest and most impressive models on HF via @novita-ai as Inference Provider 🚨

🌌 Kimi K2: 1T params model, MoE beast for coding, reasoning and agentic tasks
🔮 GLM-4.1V-9B-Thinking: VLM + deep reasoning model

Kimi K2: moonshotai/Kimi-K2-Instruct
GLM-4.1V-9B-Thinking: THUDM/GLM-4.1V-9B-Thinking

sergiopaniego

posted an update 14 days ago

Over 1K already on @huggingface !!

qgallouedec

updated a Space 17 days ago

Dataset Length Profiler

👁

Estimate optimal max_length for SFT training
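
How the Space computes its estimate is not described here. One common heuristic, sketched below under that caveat, is to pick the smallest length that covers a target fraction of examples and round it up to a hardware-friendly multiple. The function names, the 95% coverage default, and the multiple of 8 are all assumptions, and the input is assumed to be a list of per-example token counts computed elsewhere.

```python
import math


def suggest_max_length(lengths, coverage=0.95, multiple_of=8):
    """Smallest length covering `coverage` of examples, rounded up."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths)
    # Index of the example at the requested coverage quantile.
    index = max(0, math.ceil(coverage * len(ordered)) - 1)
    cutoff = ordered[index]
    return math.ceil(cutoff / multiple_of) * multiple_of


def truncated_fraction(lengths, max_length):
    """Share of examples that would be truncated at `max_length`."""
    return sum(n > max_length for n in lengths) / len(lengths)


if __name__ == "__main__":
    token_counts = list(range(1, 101))  # toy data: lengths 1..100
    limit = suggest_max_length(token_counts)
    print(limit, truncated_fraction(token_counts, limit))
```

A lower coverage value trades more truncated examples for less padding and memory per batch.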

sergiopaniego

posted an update 19 days ago

Test SmolLM3, the newest fully open model released by @HuggingFaceTB !

It's smol (3B), multilingual (6 languages), comes with dual-mode reasoning (think/no_think modes) and supports long context (128k).

Try it now in the notebook below!! ⬇️

Colab notebook: https://colab.research.google.com/github/sergiopaniego/samples/blob/main/smollm3_3b_inference.ipynb
notebook: https://github.com/sergiopaniego/samples/blob/main/smollm3_3b_inference.ipynb
blog: https://huggingface.co/blog/smollm3

sergiopaniego

posted an update 25 days ago

Updated my HF Space for vibe testing smol VLMs on object detection, visual grounding, keypoint detection & counting! 👓

🆕 Compare Qwen2.5 VL 3B vs Moondream 2B side-by-side with annotated images & text outputs.

Try examples or test your own images! 🏃

📱Space: sergiopaniego/vlm_object_understanding

sergiopaniego

posted an update 28 days ago

📣 CALL FOR CONTRIBUTORS! 📣

Following last week’s full release of Gemma 3n, we launched a dedicated recipes repo to explore and share use cases. We already added some! 🧑‍🍳

Now we’re inviting the community to contribute and showcase how these models shine! ✨

Let them cook.

Check it out: https://github.com/huggingface/huggingface-gemma-recipes/issues/4

lvwerra

authored a paper about 1 month ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26 • 63

sergiopaniego

posted an update about 1 month ago

One of my favorite perks of the Hugging Face Pro plan: ✨Dev mode✨

Connect your HF Space to VS Code and just build — with hot reload out of the box.

Game changer for fast prototyping. 💻

Google Colab made AI accessible. Now HF Spaces are doing it too! 😍

💡 New Hugging Face pricing: http://hf.co/pricing
💡 More details: https://huggingface.co/learn/cookbook/en/enterprise_cookbook_dev_spaces