HuggingFaceFW's Collections
🥂 FineWeb2
🍷 FineWeb
📚 FineWeb-Edu
📀 Dataset comparison models
🧪 FineWeb v1 data experiments

🥂 FineWeb2

Updated Jun 27 • Upvotes: 20

  • FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

    Paper • 2506.20920 • Published Jun 26 • 63

  • HuggingFaceFW/fineweb-2

    Viewer • Updated Jun 27 • 5.02B • 668k • 601

  • Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

    Running • 68

    Evaluate multilingual models using FineTasks
