mohanrj committed on
Commit 3030d0c · verified · 1 Parent(s): a71aaea

Create README.md

Files changed (1)
  1. README.md +43 -0
README.md ADDED
@@ -0,0 +1,43 @@
+ ---
+ language:
+ - ms
+ tags:
+ - hate-speech
+ - abusive-language
+ - malay
+ - classification
+ license: mit
+ datasets:
+ - mohanrj/MYBully
+ metrics:
+ - accuracy
+ - f1
+ base_model:
+ - mesolitica/roberta-base-bahasa-cased
+ ---
+
+ # MYBully-HateBERT (Manual + HITL)
+
+ ## Model Overview
+ This model is **MYBully-HateBERT**, fine-tuned on **manual + human-in-the-loop (HITL) annotated data** from the MYBully dataset.
+ It is intended to capture more nuanced hateful and offensive patterns than the baseline trained on manual annotations alone.
+
+ ## Intended Use
+ - Detecting hate speech and abusive language in Bahasa Malaysia tweets.
+
+ ## Training Data
+ - **Dataset:** MYBully (Bahasa Malaysia tweets).
+ - **Annotation:** Manual + HITL
+
+ ## Model Details
+ - **Base model:** mesolitica/roberta-base-bahasa-cased
+ - **Fine-tuning:** Binary classification head
+ - **Labels:** Hate Speech (1), Non-Hate Speech (0)
+
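The label scheme listed under Model Details can be illustrated with a small sketch that turns the raw scores (logits) of a classification head into a label. This assumes the head emits two logits ordered `[class 0, class 1]`, which the README does not state. The names `ID2LABEL`, `softmax`, and `decode` are hypothetical, and the example logits are made up. Only the mapping (1 = Hate Speech, 0 = Non-Hate Speech) comes from the model card.

```python
import math

# Label ids as stated in the model card.
ID2LABEL = {0: "Non-Hate Speech", 1: "Hate Speech"}


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration, assumed order: [class 0, class 1].
label, prob = decode([-1.2, 2.3])
print(label, round(prob, 3))
```

In practice the logits would come from the fine-tuned model; the decoding step stays the same as long as the label order matches the one assumed here.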
+ ## Performance
+ | Metric | Value |
+ |--------|-------|
+ | Accuracy | 0.85 |
+ | Precision | 0.75 |
+ | Recall | 0.86 |
+ | F1 | 0.81 |
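As a quick sanity check on the table, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded values shown, assuming all three metrics refer to the same (Hate Speech) class, which the README does not state. The helper name `f1_score` is hypothetical. The rounded inputs give roughly 0.80, which is consistent with the reported 0.81 once rounding of precision and recall is taken into account.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are 0
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the Performance table.
f1 = f1_score(0.75, 0.86)
print(f"{f1:.3f}")
```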