Smoliakov PRO

Yehor · https://t.me/doing_something

AI & ML interests

Speech-to-Text, Text-to-Speech, Voice over Internet Protocol

Recent Activity

liked a Space 6 days ago

Xenova/remove-background-web

liked a Space 6 days ago

not-lain/background-removal

liked a Space 7 days ago

tararad/Liketropy-LLM-Detector

upvoted an article 11 days ago

Article

Test-Driving the LLMD Inference Engine by ZML 🚀

12 days ago • 21

upvoted an article 17 days ago

Article

Automated Discovery of High-Performance GPU Kernels with OpenEvolve

Jun 27 • 21

upvoted a collection 17 days ago

H-Net

The family of hierarchical networks (H-Nets) from https://arxiv.org/abs/2507.07955 • 8 items • Updated 20 days ago • 18

upvoted a collection about 1 month ago

OmniGEC

This is a collection of multilingual silver-standard datasets and models for the task of Grammatical Error Correction (GEC). • 8 items • Updated Apr 26 • 8

upvoted an article 4 months ago

Article

Boost Wav2Vec2 with n-gram LM in 🤗 Transformers

Jan 12, 2022 • 12

upvoted an article 5 months ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

By

and 3 others •

Mar 12

• 447

upvoted 5 collections 5 months ago

Gemma 3

All versions of Google's new multimodal models including QAT in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 50 items • Updated about 19 hours ago • 73

MT Quality Estimation

Models for reference-free quality estimation of machine translation • 10 items • Updated Jan 29 • 4

GTE models

General Text Embedding Models Released by Tongyi Lab of Alibaba Group • 21 items • Updated Jan 21 • 30

Ukrainian Speech-to-Text models

4 items • Updated Jun 4 • 1

OWLS: Scaling Laws for Speech Recognition and Translation

🦉 A suite of Whisper-style models from 250M to 18B parameters. Trained on up to 360K hours of data. 16k sampling rate. • 8 items • Updated May 3 • 7

upvoted an article 5 months ago

Article

From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages

By

and 1 other •

Feb 11

• 31

upvoted 3 collections 5 months ago

NeMo Curator - Classifier Models

Classifier models that can be used in NeMo Curator for labelling/filtering datasets. • 11 items • Updated 9 days ago • 19

Ukrainian Text-to-Speech datasets

Five voices: Mykyta, Oleksa, Lada, Kateryna or Tetiana • 6 items • Updated Feb 26 • 4

Crimean Tatar Text-to-Speech datasets

Three voices: Abibullah, Sevil, or Arslan • 4 items • Updated May 27 • 2

upvoted a paper 6 months ago

Setting up the Data Printer with Improved English to Ukrainian Machine Translation

Paper • 2404.15196 • Published Apr 23, 2024 • 1

upvoted a paper over 1 year ago

Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition

Paper • 2310.06434 • Published Oct 10, 2023 • 4

upvoted 2 papers almost 2 years ago

Retrieval-Augmented Text-to-Audio Generation

Paper • 2309.08051 • Published Sep 14, 2023 • 7

AudioSR: Versatile Audio Super-resolution at Scale

Paper • 2309.07314 • Published Sep 13, 2023 • 28