rustemgareev committed on
Commit 03a8550 · verified · 1 Parent(s): 7c522f2

Create README.md

  1. README.md +96 -0
README.md ADDED
---
language:
- multilingual
- bg
- en
- fr
- de
- ru
- es
- sw
- tr
- vi
tags:
- deberta
- deberta-v3
- mdeberta
license: mit
---

# mdeberta-v3-base-lite

This model is a vocabulary-pruned version of the original [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) model. Pruning shrinks the vocabulary while maintaining full quality for the supported Latin- and Cyrillic-script languages.

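
The pruning procedure itself is not described in this README. As a rough illustration only, vocabulary pruning generally means keeping a subset of tokens, dropping the embedding rows of all other tokens, and renumbering the kept tokens. The toy sketch below assumes a plain dictionary vocabulary and a list-of-lists embedding matrix; the function name `prune_vocabulary` and all data are hypothetical and are not taken from the actual model.

```python
def prune_vocabulary(vocab, embeddings, keep_tokens):
    """Keep only `keep_tokens`, renumber them, and slice the embedding rows.

    vocab:       dict mapping token -> original id
    embeddings:  list of rows, where embeddings[id] is the vector for that id
    keep_tokens: set of tokens that must survive pruning
    """
    # Walk the vocabulary in original id order so kept tokens stay in order
    kept = [
        (token, old_id)
        for token, old_id in sorted(vocab.items(), key=lambda item: item[1])
        if token in keep_tokens
    ]
    # Assign new contiguous ids and copy over the matching embedding rows
    new_vocab = {token: new_id for new_id, (token, _) in enumerate(kept)}
    new_embeddings = [embeddings[old_id] for _, old_id in kept]
    return new_vocab, new_embeddings


# Toy data (hypothetical): five tokens with 2-dimensional embeddings
vocab = {"[PAD]": 0, "[UNK]": 1, "hello": 2, "你好": 3, "привет": 4}
embeddings = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]]
keep = {"[PAD]", "[UNK]", "hello", "привет"}

new_vocab, new_embeddings = prune_vocabulary(vocab, embeddings, keep)
print(new_vocab)
print(len(new_embeddings), "embedding rows remain")
```

Because each kept token retains its original embedding row, text made up only of kept tokens is encoded exactly as before, which is consistent with the unchanged similarity scores reported further down.
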
## Supported Languages
- Bulgarian
- English
- French
- German
- Russian
- Spanish
- Swahili
- Turkish
- Vietnamese

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load the pruned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("rustemgareev/mdeberta-v3-base-lite")
model = AutoModel.from_pretrained("rustemgareev/mdeberta-v3-base-lite")

# Example usage: encode one sentence and run a forward pass
text = "This is an example text in English."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():  # inference only, so skip gradient tracking
    outputs = model(**inputs)

# outputs.last_hidden_state has shape (batch, sequence_length, hidden_size)
```

## Performance Evaluation

### Size Comparison
| Metric | Original Model | Lite Model | Reduction |
|--------|----------------|------------|-----------|
| Vocabulary Size | 250,102 tokens | 163,211 tokens | 34.74% |
| Disk Size | 1.06 GB | 817 MB | 23.23% |

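
The Reduction column is a relative difference. As a quick check, the short sketch below recomputes the vocabulary row from the two token counts in the table; the helper name `percent_reduction` is introduced here for illustration only.

```python
def percent_reduction(original, reduced):
    """Relative reduction from `original` to `reduced`, in percent."""
    return (original - reduced) / original * 100


# Token counts taken from the Vocabulary Size row of the table
vocab_reduction = percent_reduction(250_102, 163_211)
print(f"{vocab_reduction:.2f}%")  # 34.74%
```

The disk size shrinks by a smaller percentage than the vocabulary, presumably because pruning only removes embedding rows and leaves the transformer layers untouched.
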
### VRAM Usage Comparison
*Estimated using the [Hugging Face Accelerate Model Estimator](https://huggingface.co/docs/accelerate/main/en/usage_guides/model_size_estimator).*

| Metric | Original Model | Lite Model | Reduction |
|--------|----------------|------------|-----------|
| Largest Layer (float32) | 735.35 MB | 478.16 MB | 34.99% |
| Total Size (float32) | 1.04 GB | 804.13 MB | 22.68% |
| Training using Adam (peak VRAM) | 4.15 GB | 3.14 GB | 24.34% |

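
The Lite Model column can be roughly reproduced by hand. The sketch below rests on three assumptions that this README does not state: the largest layer is the token embedding matrix, the hidden size is 768, and the estimator reports sizes in binary units (1 MB = 1024² bytes) while estimating Adam training at four times the total model size.

```python
BYTES_PER_FLOAT32 = 4
MIB = 1024 ** 2  # assumption: the estimator uses binary megabytes


def embedding_matrix_mib(vocab_size, hidden_size):
    """Size of a float32 embedding matrix of shape (vocab_size, hidden_size)."""
    return vocab_size * hidden_size * BYTES_PER_FLOAT32 / MIB


HIDDEN_SIZE = 768  # assumption: hidden size of the base architecture

# Largest layer of the lite model, from its 163,211-token vocabulary
lite_embedding_mib = embedding_matrix_mib(163_211, HIDDEN_SIZE)
print(f"{lite_embedding_mib:.2f} MB")  # 478.16 MB

# Peak training memory, assuming the 4x rule of thumb for Adam
lite_adam_gib = 4 * 804.13 / 1024
print(f"{lite_adam_gib:.2f} GB")  # 3.14 GB
```

Both results match the table, which suggests the memory savings come almost entirely from the smaller embedding matrix.
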
### Semantic Similarity Comparison

**Evaluation Method**: Cosine similarity between the embeddings of parallel sentences in different languages, with the English sentence as the reference.

**Test Phrases Used**:
- English: "Artificial intelligence learns to understand human languages and helps people communicate."
- Bulgarian: "Изкуственият интелект се учи да разбира човешките езици и помага на хората да общуват."
- French: "L'intelligence artificielle apprend à comprendre les langages humains et aide les gens à communiquer."
- German: "Künstliche Intelligenz lernt, menschliche Sprachen zu verstehen und hilft Menschen bei der Kommunikation."
- Russian: "Искусственный интеллект учится понимать человеческие языки и помогает людям общаться."
- Spanish: "La inteligencia artificial aprende a entender los idiomas humanos y ayuda a las personas a comunicarse."
- Swahili: "Akili ya kisasa inajifunza kuelewa lugha za wanadamu na kusaidia watu kuwasiliana."
- Turkish: "Yapay zeka, insan dillerini anlamayı öğrenir ve insanların iletişim kurmasına yardımcı olur."
- Vietnamese: "Trí tuệ nhân tạo học cách hiểu ngôn ngữ con người và giúp mọi người giao tiếp."

**Similarity Results**:

| Language Pair | Original Similarity | Lite Similarity | Difference |
|---------------|---------------------|-----------------|------------|
| English-Bulgarian | 0.9276 | 0.9276 | 0.0000 |
| English-French | 0.9322 | 0.9322 | 0.0000 |
| English-German | 0.9178 | 0.9178 | 0.0000 |
| English-Russian | 0.9335 | 0.9335 | 0.0000 |
| English-Spanish | 0.9228 | 0.9228 | 0.0000 |
| English-Swahili | 0.9591 | 0.9591 | 0.0000 |
| English-Turkish | 0.9450 | 0.9450 | 0.0000 |
| English-Vietnamese | 0.7955 | 0.7955 | 0.0000 |

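
The comparison above can be sketched in a few lines. This README does not say how sentence embeddings were derived from the model outputs, so the example assumes mean pooling over non-padding tokens, a common choice that may differ from what was actually used. The toy vectors stand in for real model outputs, and the helper names `mean_pool` and `cosine_similarity` are illustrative.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask value is 1 (padding is ignored)."""
    kept = [vec for vec, mask in zip(token_vectors, attention_mask) if mask == 1]
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy token vectors (hypothetical); the last English token is padding
english = mean_pool([[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]], [1, 1, 0])
other = mean_pool([[1.0, 1.0], [1.0, 1.0]], [1, 1])

similarity = cosine_similarity(english, other)
print(f"{similarity:.4f}")
```

With a real model, `token_vectors` would come from `outputs.last_hidden_state` and the mask from `inputs["attention_mask"]`, and the same score would be computed once with the original model and once with the lite model for each language pair.
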
## License

This model is distributed under the [MIT License](https://opensource.org/licenses/MIT).