Hailay committed on
Commit cabb2bb · verified · 1 Parent(s): b21073e

Update README.md

Files changed (1)
  1. README.md +17 -9
README.md CHANGED
@@ -3,25 +3,32 @@ license: apache-2.0
3
  language:
4
  - am
5
  - ti
 
 
 
 
 
 
6
  ---
7
  ---
8
  ## 1. Model Description
9
- **Hailay/FT_EXLMR** is a fine-tuned version of the EXLMR model, designed specifically for sentiment analysis and text classification tasks in low-resource African languages such as Tigrinya, Amharic, and Oromo. This model leverages the architecture of EXLMR but has been further fine-tuned to improve its performance on multilingual tasks, especially for languages not widely represented in existing NLP models.
10
  The model was trained using the AfriSent-Semeval-2023 dataset, a benchmark dataset for African languages, which is publicly available on GitHub:[AfriSent-Semeval-2023 GitHub Repository](https://github.com/afrisenti-semeval/afrisent-semeval-2023)
11
 
12
  ## 2.Intended Use
13
- This model is ideal for:
14
-
15
  Researchers and developers who are working on multilingual sentiment analysis in African languages.
16
  Applications that require text classification in low-resource languages.
17
  It is designed specifically for tasks such as:
18
  Sentiment analysis
19
  Text classification
20
 
21
- **Note:** The model is not suitable for other tasks like machine translation or named entity recognition without further fine-tuning.
22
 
23
- ## 3.Training Data**
24
- The **Hailay/FT_EXLMR** model was trained using the dataset from the **SemEval 2023 Shared Task 12: Sentiment Analysis in African Languages (AfriSenti-SemEval)**. This dataset comprises sentiment-labeled text from 14 African languages:
 
 
25
 
26
  1. Algerian Arabic (arq) - Algeria
27
  2. Amharic (ama) - Ethiopia
@@ -38,8 +45,8 @@ The **Hailay/FT_EXLMR** model was trained using the dataset from the **SemEval 2
38
  13. Xithonga (tso) - Mozambique
39
  14. Yoruba (yor) - Nigeria
40
 
41
- The dataset covers diverse data for training multilingual models like `Hailay/FT_EXLMR`.
42
- You can access the dataset via the [AfriSent-Semeval-2023 GitHub Repository](https://github.com/afrisenti-semeval/afrisent-semeval-2023).
43
  The **Hailay/FT_EXLMR** model was trained using the following configuration:
44
  Epochs: 3
45
  Learning Rate: 1e-5
@@ -50,6 +57,7 @@ Batch Size: 16
50
 
51
  The model was evaluated using accuracy and loss as the primary metrics. The results are as follows:
52
 
53
- Accuracy: Achieved strong performance on Tigrinya, Amharic, and Oromo text classification tasks, with accuracy scores ranging between 78% and 88%.
 
54
  Loss: Loss values showed steady convergence during the 3 epochs of training, reflecting a well-calibrated model.
55
  The evaluation was carried out on the test set provided in the [AfriSent-Semeval-2023 GitHub Repository](https://github.com/afrisenti-semeval/afrisent-semeval-2023) dataset.
 
3
  language:
4
  - am
5
  - ti
6
+ - ha
7
+ - aa
8
+ base_model:
9
+ - Hailay/EXLMR
10
+ - FacebookAI/xlm-roberta-base
11
+ pipeline_tag: text-classification
12
  ---
13
  ---
14
  ## 1. Model Description
15
+ **Hailay/FT_EXLMR** is a fine-tuned version of the **EXLMR** model, designed specifically for sentiment analysis and text classification tasks in low-resource African languages such as Tigrinya, Amharic, and Oromo. This model leverages the architecture of EXLMR but has been further fine-tuned to improve its performance on multilingual tasks, especially for languages not widely represented in existing NLP models.
16
  The model was trained using the AfriSent-Semeval-2023 dataset, a benchmark dataset for African languages, which is publicly available on GitHub:[AfriSent-Semeval-2023 GitHub Repository](https://github.com/afrisenti-semeval/afrisent-semeval-2023)
17
 
18
  ## 2.Intended Use
19
+ This model is ideal for:
 
20
  Researchers and developers who are working on multilingual sentiment analysis in African languages.
21
  Applications that require text classification in low-resource languages.
22
  It is designed specifically for tasks such as:
23
  Sentiment analysis
24
  Text classification
25
 
26
+ **Note:** Without further fine-tuning, the model is unsuitable for tasks like machine translation or named entity recognition.
27
 
28
+ ## 3.Training Data
29
+ The **Hailay/FT_EXLMR** model was trained using the dataset from the
30
+ **SemEval 2023 Shared Task 12: Sentiment Analysis in African Languages (AfriSenti-SemEval)**.
31
+ This dataset comprises sentiment-labeled text from 14 African languages:
32
 
33
  1. Algerian Arabic (arq) - Algeria
34
  2. Amharic (ama) - Ethiopia
 
45
  13. Xithonga (tso) - Mozambique
46
  14. Yoruba (yor) - Nigeria
47
 
48
+ The dataset covers diverse data for training multilingual models like **Hailay/FT_EXLMR**
49
+ We access the dataset from [AfriSent-Semeval-2023 GitHub Repository](https://github.com/afrisenti-semeval/afrisent-semeval-2023).
50
  The **Hailay/FT_EXLMR** model was trained using the following configuration:
51
  Epochs: 3
52
  Learning Rate: 1e-5
 
57
 
58
  The model was evaluated using accuracy and loss as the primary metrics. The results are as follows:
59
 
60
+ Accuracy: Achieved strong performance on Tigrinya, Amharic, Afar, and Oromo text classification and sentiment analysis tasks.
61
+
62
  Loss: Loss values showed steady convergence during the 3 epochs of training, reflecting a well-calibrated model.
63
  The evaluation was carried out on the test set provided in the [AfriSent-Semeval-2023 GitHub Repository](https://github.com/afrisenti-semeval/afrisent-semeval-2023) dataset.
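
The README names accuracy and loss as the evaluation metrics but does not show how they are computed. The following is a minimal sketch of both, not the author's evaluation script. It assumes a three-class sentiment setup with integer labels (0 = negative, 1 = neutral, 2 = positive) and that loss means mean cross-entropy over predicted class probabilities; the function names and sample values are hypothetical.

```python
import math


def accuracy(true_labels, predicted_labels):
    """Fraction of examples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one example is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def mean_cross_entropy(true_labels, probabilities, eps=1e-12):
    """Average negative log-probability assigned to the true class.

    `probabilities` holds one list of class probabilities per example.
    `eps` guards against log(0) when a true class gets zero probability.
    """
    if len(true_labels) != len(probabilities):
        raise ValueError("labels and probabilities must have the same length")
    if not true_labels:
        raise ValueError("at least one example is required")
    total = 0.0
    for label, probs in zip(true_labels, probabilities):
        total -= math.log(max(probs[label], eps))
    return total / len(true_labels)


# Hypothetical labels and model outputs, for illustration only.
gold = [0, 1, 2, 2]
predicted = [0, 1, 1, 2]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.1, 0.8],
]

print("accuracy:", accuracy(gold, predicted))
print("loss:", mean_cross_entropy(gold, probs))
```

Reporting the two together is useful because accuracy only counts correct top predictions, while cross-entropy also penalizes confident mistakes.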