Sampled Datasets

updated 28 days ago

Random samples from large datasets, for convenience.

  • bluelightai-dev/dclm-full-deduped-sample

    Viewer • Updated 29 days ago • 4.92M • 133

  • bluelightai-dev/the-stack-dedup-sample

    Viewer • Updated 29 days ago • 474k • 52

  • bluelightai-dev/common-corpus-sample-open-culture

    Viewer • Updated 29 days ago • 462k • 57

  • bluelightai-dev/common-corpus-sample-open-government

    Viewer • Updated 29 days ago • 373k • 74 • 1

  • bluelightai-dev/common-corpus-sample-open-science

    Viewer • Updated 29 days ago • 284k • 58

  • bluelightai-dev/common-corpus-sample-open-source

    Viewer • Updated 29 days ago • 2.02M • 54

  • bluelightai-dev/common-corpus-sample-open-web

    Viewer • Updated 29 days ago • 4.8M • 88

  • bluelightai-dev/MathPile_Commercial-formatted

    Viewer • Updated 28 days ago • 389k • 103