
Garreth Lee

garrethlee

AI & ML interests

None yet

Recent Activity

liked a Space about 2 months ago

HuggingFaceTB/smol-training-playbook


Organizations

liked a Space about 2 months ago

The Smol Training Playbook

📚

2.68k

The secrets to building world-class LLMs

liked a dataset 4 months ago

HuggingFaceM4/FineVision

Viewer • Updated Oct 21 • 24.2M • 119k • 462

liked a model 4 months ago

google/embeddinggemma-300m

liked a dataset 4 months ago

nvidia/Granary

Viewer • Updated Aug 14 • 116M • 4.82k • 163

upvoted 2 papers 6 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26 • 75

OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning

Paper • 2506.00338 • Published May 31 • 10

upvoted a changelog 7 months ago

Changelog

Xet is now the default storage option for new users and organizations

May 23 • 74

liked a Space 8 months ago

Dia 1.6B

👯

1.73k

Generate realistic dialogue from a script, using Dia!

upvoted a collection 9 months ago

Llama 4

Collection

Llama 4 release • 13 items • Updated Apr 29 • 673

upvoted an article 9 months ago

Article

Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques

Mar 24

upvoted an article 10 months ago

Article

FastRTC: The Real-Time Communication Library for Python

Feb 25 • 172

liked a Space 10 months ago

The Ultra-Scale Playbook

🌌

3.6k

The ultimate guide to training LLM on large GPU Clusters

upvoted 3 articles 11 months ago

Article

1 Billion Classifications

Feb 13

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

Jan 30 • 201

Article

How biased is Whisper ? Evaluating Whisper Models for Robustness to Diverse English Accents

Jan 29

liked a model 11 months ago

deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27 • 666k • 12.9k

upvoted a paper about 1 year ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 376

updated a Space about 1 year ago

Number Tokenization Blog

📈

105

Explore how tokenization affects arithmetic in LLMs

liked a dataset about 1 year ago

HuggingFaceFW/fineweb-2

Viewer • Updated Oct 27 • 4.48B • 58.9k • 707

liked a Space about 1 year ago

Number Tokenization Blog

📈

105

Explore how tokenization affects arithmetic in LLMs
