Commit: Remove qwen references and update to Anki 2.5

File changed: README.md
In the YAML front matter, the diff removes the `- qwen2` tag and the `base_model: Qwen/Qwen2.5-0.5B` line; in the card body, the remaining Qwen references (title, base-model badge and bullet, snippet model paths, citation, and summary table) are updated to `anki-2.5` / "Transformer". The updated README.md follows:
---
language:
# ...
- as
- mr
tags:
- indian-languages
- conversational-ai
- localized-ai
# ...
- odia
- assamese
- marathi
pipeline_tag: text-generation
library_name: transformers
datasets:
# ...
metrics:
- bleu
- rouge
model-index:
- name: anki-2.5
  results:
  - task:
      type: text-generation
      # ...
      name: Perplexity
---

# 🇮🇳 Anki 2.5 - Indian Market-Centric LLM

<div align="center">
  <img src="https://img.shields.io/badge/Language-Indic%20Languages-orange" alt="Languages">
  <img src="https://img.shields.io/badge/Base%20Model-Transformer-blue" alt="Base Model">
  <img src="https://img.shields.io/badge/Size-494M-green" alt="Model Size">
  <img src="https://img.shields.io/badge/License-MIT-yellow" alt="License">
</div>

## Model Overview

**Anki 2.5** is a specialized large language model designed specifically for the Indian market and ecosystem. Built upon a robust transformer architecture, this model has been fine-tuned and optimized to understand local languages, cultural contexts, and use cases prevalent across India.

This model bridges the gap between global AI capabilities and local Indian needs, offering enhanced performance in:

- **Indic Language Understanding**: Deep comprehension of Hindi, Bengali, Tamil, Telugu, Urdu, Gujarati, Kannada, Malayalam, Punjabi, Odia, Assamese, and Marathi
- **Cultural Context Awareness**: Understanding of Indian customs, festivals, traditions, and social dynamics
- **Market-Specific Applications**: Tailored for Indian business scenarios, educational contexts, and daily life interactions
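Applications that span this many scripts often need to know which language a prompt is written in before choosing a system prompt or evaluation set. A minimal sketch using only the Python standard library; the helper name and the routing idea are illustrative assumptions, not part of the model card:

```python
import unicodedata

def dominant_script(text: str) -> str:
    """Return the Unicode script most letters in `text` belong to.

    The script is read from the first word of each character's Unicode
    name, e.g. "DEVANAGARI LETTER NA" -> "DEVANAGARI". Combining vowel
    signs are not alphabetic, so they are skipped automatically. Note
    that Hindi and Marathi both report DEVANAGARI: script is a coarse
    proxy for language, not a substitute for it.
    """
    counts: dict[str, int] = {}
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                script = name.split()[0]
                counts[script] = counts.get(script, 0) + 1
    return max(counts, key=counts.get) if counts else "UNKNOWN"

print(dominant_script("नमस्ते, आप कैसे हैं?"))  # DEVANAGARI (Hindi/Marathi)
print(dominant_script("வணக்கம்"))              # TAMIL
```

The same check works for Bengali, Tamil, Telugu, Gurmukhi (Punjabi), Odia, and the other scripts listed above, since their Unicode character names all begin with the script name.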

## Technical Details

### Architecture

- **Base Model**: Transformer (0.5B parameters)
- **Fine-tuning**: Specialized training on Indian datasets
- **Model Size**: 494M parameters
- **Precision**: F32 tensor type
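The F32 precision has a direct memory cost: at 4 bytes per parameter, the 494M weights alone take close to 2 GiB before activations or the KV cache are counted. A quick check of that arithmetic:

```python
params = 494_000_000      # model size stated on the card
bytes_per_param = 4       # F32 = 32 bits = 4 bytes
weights_gib = params * bytes_per_param / 1024**3

print(f"{weights_gib:.2f} GiB")  # ~1.84 GiB for the raw weights
```

Loading in half precision (`torch_dtype=torch.float16`) would roughly halve that footprint, at some cost in numerical fidelity; whether that is acceptable for this model is not stated on the card.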

## How to Use

### Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer
model_name = "anktechsol/anki-2.5"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    # ... (remaining arguments not shown in this diff)
)
# ... (generation code not shown in this diff)
print(response)
```

### Advanced Usage

```python
# Multi-language conversation
conversation = [
    # ... (turns and generation code not shown in this diff)
]
# ...
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
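The turns of `conversation` are elided in this diff, and the chat format the model was trained with is not shown. As a stand-in, here is one generic way to flatten role-tagged turns into a single prompt string; the format is an assumption for illustration only, and for the real model the tokenizer's own chat template, if it ships one, should be preferred:

```python
def flatten_conversation(turns: list[dict]) -> str:
    """Join {"role", "content"} turns into one prompt, ending with an
    open assistant turn for the model to complete. The "role: content"
    layout is illustrative, NOT the template anki-2.5 was trained with."""
    lines = [f"{t['role']}: {t['content']}" for t in turns]
    lines.append("assistant:")
    return "\n".join(lines)

conversation = [
    # "Tell me about traveling to Mumbai"
    {"role": "user", "content": "मुझे मुंबई की यात्रा के बारे में बताइए"},
]
print(flatten_conversation(conversation))
```

With `transformers`, `tokenizer.apply_chat_template(conversation, tokenize=False)` performs this step when the repository ships a chat template.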

### Integration with Popular Frameworks

```python
# Using with LangChain for Indian applications
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from transformers import pipeline

# Create pipeline
pipe = pipeline(
    "text-generation",
    model="anktechsol/anki-2.5",
    tokenizer="anktechsol/anki-2.5",
    max_length=512
)

# ... (elided in this diff)
llm = HuggingFacePipeline(pipeline=pipe)
response = llm("Explain GST rules in Hindi")
```

### Call to Action

We invite the Indian AI community to:

- **Experiment**: Try the model with your specific use cases and share results
- **Feedback**: Report performance insights, especially for regional languages
- **Language Expansion**: Help us improve coverage for underrepresented Indian languages
- **Beta Testers**: Early adopters who provided crucial feedback

### Institutional Support

- **Transformer Architecture Community**: For the excellent base model architecture
- **Hugging Face**: For the model hosting and distribution platform
- **Indian Language Technology Consortium**: For linguistic resources

### Citation
If you use this model in your research or applications, please cite:

```bibtex
@misc{anki-2.5,
  title={Anki 2.5: An Indian Market-Centric Large Language Model},
  author={Anktechsol},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/anktechsol/anki-2.5}},
}
```

---

<div align="center">
  <b>Ready to explore AI in Indian languages? Start using Anki 2.5 today!</b>
  <br>
  <i>Made with ❤️ for the Indian AI community</i>
</div>

| Attribute | Value |
|-----------|-------|
| Model Size | 494M parameters |
| Base Model | Transformer |
| Languages | 12+ Indian languages + English |
| License | MIT |
| Context Length | 8K tokens |
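The 8K context is a hard budget shared between the prompt and the generated continuation, so `max_new_tokens` has to leave room for the prompt. A tiny helper for checking that budget before calling `generate`; the function is an illustrative sketch, not part of the card:

```python
CONTEXT_LEN = 8 * 1024  # 8K-token context window from the table above

def generation_budget(prompt_tokens: int, context_len: int = CONTEXT_LEN) -> int:
    """How many new tokens can still be generated for a prompt of this size.
    Returns 0 when the prompt alone already fills (or overflows) the window."""
    return max(context_len - prompt_tokens, 0)

print(generation_budget(500))   # 7692 tokens left for the reply
print(generation_budget(9000))  # 0 -- prompt alone overflows the window
```

In practice `prompt_tokens` would come from `len(tokenizer(prompt)["input_ids"])` so the check matches the model's own tokenization.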