AI & ML interests
Machine Translation, Language Models, Document Classification, Sentiment Analysis, Corpora, Speech Recognition and Synthesis, Less-Resourced Languages.
Foundational LLM for Basque
- Pipeline Analysis for Developing Instruct LLMs in Low-Resource Languages: A Case Study on Basque
  Paper • 2412.13922
- orai-nlp/Llama-eus-8B
  Text Generation • 8B
- orai-nlp/Llama-eus-8B-Instruct-v1
  Text Generation • 8B
- orai-nlp/Llama-eus-8B-slimOrca_eu
  Text Generation • 8B
ElhBERTeu models trained on downstream NLU tasks
Foundational small language models (SLMs) for Basque, based on OpenELM and Llama 3.2. The models are either pre-trained from scratch or obtained by continual pre-training.
Swahili BERT models trained on 125M tokens