Alireo-400M 🤖 🇮🇹

A Lightweight Italian Language Model

Model Description 📝

Alireo-400M is a lightweight Italian language model with roughly 400M parameters, designed for efficient natural language processing with a smaller memory and disk footprint than larger models.

Key Features ✨

  • Architecture: Transformer-based language model 🏗️
  • Parameters: 400M 📊
  • Context Window: 8K tokens 🪟
  • Training Data: Curated Italian text corpus (books, articles, web content) 📚
  • Model Size: ~800MB 💾
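
The ~800MB figure is consistent with storing the weights in a 16-bit format. A rough back-of-the-envelope check, assuming the 404M parameter count and BF16 tensor type shown on the model page and ignoring any file-format overhead (the function name is illustrative, not part of any library):

```python
def estimate_weights_size_mb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate size of the raw weights in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1_000_000


# 404M parameters at 2 bytes each (BF16). Both values are assumptions taken
# from the model page metadata, not measured from the weight files.
size_mb = estimate_weights_size_mb(404_000_000, bytes_per_param=2)
print(f"Estimated weights size: {size_mb:.0f} MB")
```

The same estimate with 4 bytes per parameter (FP32) would roughly double the footprint, which is worth keeping in mind if the weights are upcast at load time.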

Performance 📈

Despite its compact size, Alireo-400M is reported to perform well:

  • Benchmark Results: Outperforms Qwen 0.5B across multiple benchmarks 🏆
  • Language Understanding: Maintains high accuracy on Italian language understanding tasks 🎯
  • Speed: Efficient inference thanks to its optimized architecture ⚡

Limitations ⚠️

  • Limited context window compared to larger models
  • May struggle with highly specialized technical content
  • Performance may vary on dialectal variations
  • Not suitable for multilingual tasks
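
Because the context window is shared between the prompt and the generated text, it can help to budget tokens before generating. A minimal sketch, assuming "8K" means 8,192 tokens (the exact limit is not stated above) and using a hypothetical helper name:

```python
CONTEXT_WINDOW = 8192  # assumption: "8K tokens" read as 8 * 1024


def max_new_tokens_allowed(prompt_tokens: int, requested: int,
                           context_window: int = CONTEXT_WINDOW) -> int:
    """Clamp the requested generation length so prompt + output fit the window."""
    if prompt_tokens >= context_window:
        raise ValueError("Prompt alone fills the context window; shorten it first.")
    return min(requested, context_window - prompt_tokens)


print(max_new_tokens_allowed(prompt_tokens=7000, requested=2000))  # 1192
print(max_new_tokens_allowed(prompt_tokens=500, requested=256))    # 256
```

Raising an error for an oversized prompt, rather than silently truncating it, is a design choice made here for clarity; truncating from the left is a common alternative.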

Hardware Requirements 💻

  • Minimum RAM: 2GB
  • Recommended RAM: 4GB
  • GPU: Optional, but recommended for faster inference
  • Disk Space: ~1GB (including model and dependencies)
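
A hedged sketch of loading the model on either CPU or GPU with the Hugging Face `transformers` library. It assumes the repository `DeepMount00/Alireo-400m-instruct-v0.1` (the name shown on the model page) loads through the standard `AutoTokenizer` and `AutoModelForCausalLM` classes and that `torch` and `transformers` are installed; the card does not document a prompt or chat format, so a plain-text prompt is used here as an assumption.

```python
MODEL_ID = "DeepMount00/Alireo-400m-instruct-v0.1"  # repository name from the model page


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return the prompt followed by the generated text."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Use the GPU when one is available; otherwise fall back to the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Note: the model is reloaded on every call; cache it for repeated use.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The decoded sequence contains the prompt tokens as well as the new tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `generate("Qual è la capitale d'Italia?")` downloads the weights on first use and returns the prompt together with the model's continuation.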

Citation 📄

@software{alireo2024,
  author = {Michele Montebovi},
  title = {Alireo-400M: A Lightweight Italian Language Model},
  year = {2024},
}

Model repository: DeepMount00/Alireo-400m-instruct-v0.1 (Safetensors, 404M params, BF16)