Hugging Face
Marc Lammers
PRO
MarcusLammers
0 followers · 1 following
https://www.augustus.cloud
LammersMarcus
mlmarclammers
AI & ML interests
The future of compute isn’t linear, it is intelligent.
Recent Activity
Commented on an article 1 day ago: Introducing Wikipedia Monthly: Fresh, Clean Wikipedia Dumps for NLP & AI Research
Replied to omarkamali's post 1 day ago:
Another month, another Wikipedia Monthly release! 🎃
Highlights of October's edition:
· 🗣️ 341 languages
· 📚 64.7M articles (+2.5%)
· 📦 89.4GB of data (+3.3%)
We are now sampling a random subset of each language using reservoir sampling to produce the splits `1000`, `5000`, and `10000`, in addition to the existing `train` split that contains all the data.
Now you can load the English (or your favorite language) subset in seconds:
`dataset = load_dataset("omarkamali/wikipedia-monthly", "latest.en", split="10000")`
Happy data engineering! 🧰
https://huggingface.co/datasets/omarkamali/wikipedia-monthly
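The post mentions reservoir sampling for building the fixed-size splits. As a rough illustration only, here is a minimal sketch of the classic single-pass variant (often called Algorithm R). The function name `reservoir_sample`, its `seed` parameter, and the integer stream are hypothetical, and the dataset's actual pipeline may well differ.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k items drawn uniformly at random from a stream of unknown length.

    Hypothetical helper for illustration; not taken from the dataset's code.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces an existing one
            # with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


# Stand-in for a stream of articles: 100,000 integer ids.
sample = reservoir_sample(range(100_000), k=1000, seed=0)
```

The appeal of this approach is that it needs only one pass and memory proportional to `k`, which suits dumps too large to shuffle in memory.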
Commented on an article 1 day ago: The Next Frontier: Large Language Models In Biology
Organizations
Articles (2)
AXIS
CHARIOT
Models (0): none public yet
Datasets (0): none public yet