
Ville Komulainen

Villekom
  • Vmjkom

AI & ML interests

NLP, text generation, semantic analysis

Recent Activity

updated a collection 8 days ago
open-sci-ref-0.01 HPLT-2.0
updated a collection 8 days ago
open-sci-ref-0.01 CommonCorpus
updated a collection 8 days ago
open-sci-ref-0.01 HPLT-2.0

Organizations

  • TurkuNLP Research Group
  • HPLT
  • LumiOpen
  • Open-ψ (Open-Sci) Collective
  • OpenEuroLLM

upvoted 2 papers about 1 month ago

Got Compute, but No Data: Lessons From Post-training a Finnish LLM

Paper • 2503.09407 • Published Mar 12 • 1

An Expanded Massive Multilingual Dataset for High-Performance Language Technologies

Paper • 2503.10267 • Published Mar 13 • 1
upvoted 3 papers 6 months ago

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 64

Preference Leakage: A Contamination Problem in LLM-as-a-judge

Paper • 2502.01534 • Published Feb 3 • 41

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 241
upvoted 2 papers over 1 year ago

Poro 34B and the Blessing of Multilinguality

Paper • 2404.01856 • Published Apr 2, 2024 • 16

Instruction-Following Evaluation for Large Language Models

Paper • 2311.07911 • Published Nov 14, 2023 • 21