Update README.md
README.md
|
@@ -1,3 +1,117 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
---
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
---
|
| 4 |
+
This model is a fine-tuned version of distilbert-base-uncased, trained on a Reddit dataset of posts in which users report on their mental health. It predicts mental health disorders from text.
|
| 5 |
+
|
| 6 |
+
It achieves the following results on the validation set:
|
| 7 |
+
|
| 8 |
+
* Loss: 0.1873
|
| 9 |
+
* F1: 0.6356
|
| 10 |
+
* AUC: 0.7643
|
| 11 |
+
* Precision: 0.7671
|
| 12 |
+
|
| 13 |
+
# Description
|
| 14 |
+
This model builds on DistilBERT, a lighter variant of BERT, to predict different mental disorders. It makes its predictions from texts or posts (from Reddit) in which users describe their general experiences with mental health problems.
|
| 15 |
+
It includes the following classes:
|
| 16 |
+
|
| 17 |
+
* Borderline
|
| 18 |
+
* Anxiety
|
| 19 |
+
* Depression
|
| 20 |
+
* Bipolar
|
| 21 |
+
* OCD
|
| 22 |
+
* ADHD
|
| 23 |
+
* Schizophrenia
|
| 24 |
+
* Asperger
|
| 25 |
+
* PTSD
|
| 26 |
+
|
| 27 |
+
# Training
|
| 28 |
+
train size: 90%
|
| 29 |
+
val size: 10%
|
| 30 |
+
|
| 31 |
+
Training set class counts (text samples) after balancing:
|
| 32 |
+
Borderline 10398
|
| 33 |
+
Anxiety 10393
|
| 34 |
+
Depression 10400
|
| 35 |
+
Bipolar 10359
|
| 36 |
+
OCD 10413
|
| 37 |
+
ADHD 10412
|
| 38 |
+
Schizophrenia 10447
|
| 39 |
+
Asperger 10470
|
| 40 |
+
PTSD 10489
|
| 41 |
+
|
| 42 |
+
Validation set class counts after balancing:
|
| 43 |
+
Borderline 1180
|
| 44 |
+
Anxiety 1185
|
| 45 |
+
Depression 1178
|
| 46 |
+
Bipolar 1219
|
| 47 |
+
OCD 1165
|
| 48 |
+
ADHD 1166
|
| 49 |
+
Schizophrenia 1131
|
| 50 |
+
Asperger 1108
|
| 51 |
+
PTSD 1089
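
As a quick consistency check on the counts above, the following sketch sums the per-class training and validation counts and computes the validation share. The variable names are illustrative and do not come from the original training code; the numbers are copied from the two lists above.

```python
# Per-class sample counts copied from the two lists above.
train_counts = {
    "Borderline": 10398, "Anxiety": 10393, "Depression": 10400,
    "Bipolar": 10359, "OCD": 10413, "ADHD": 10412,
    "Schizophrenia": 10447, "Asperger": 10470, "PTSD": 10489,
}
val_counts = {
    "Borderline": 1180, "Anxiety": 1185, "Depression": 1178,
    "Bipolar": 1219, "OCD": 1165, "ADHD": 1166,
    "Schizophrenia": 1131, "Asperger": 1108, "PTSD": 1089,
}

# Total count per class across both splits.
per_class_total = {
    name: train_counts[name] + val_counts[name] for name in train_counts
}

train_total = sum(train_counts.values())
val_total = sum(val_counts.values())
val_share = val_total / (train_total + val_total)

for name, total in per_class_total.items():
    print(f"{name:<14} {total}")
print(f"train={train_total} val={val_total} val_share={val_share:.4f}")
```

Every class has the same combined count across the two splits, which is consistent with balancing the classes first and then splitting 90/10 at random.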
|
| 52 |
+
|
| 53 |
+
The following hyperparameters were used during training:
|
| 54 |
+
|
| 55 |
+
base_model: distilbert/distilbert-base-uncased
|
| 56 |
+
learning_rate: 1e-5
|
| 57 |
+
train_batch_size: 64
|
| 58 |
+
val_batch_size: 64
|
| 59 |
+
weight_decay: 0.01
|
| 60 |
+
optimizer: AdamW
|
| 61 |
+
num_epochs: 2-3
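
To give a feel for what these settings imply, here is a rough sketch of the number of optimizer steps per epoch. It rests on assumptions that the card does not state: one dataset row per counted sample (using the sums of the per-class counts in the Training section), a single device, no gradient accumulation, and the last partial batch being kept.

```python
import math

# Values from the hyperparameter list above.
train_batch_size = 64
val_batch_size = 64

# Assumption: one dataset row per counted sample, using the sums of the
# per-class counts in the Training section (93781 train, 10421 validation).
train_samples = 93781
val_samples = 10421

# Assumption: single device, no gradient accumulation, last partial batch kept.
train_steps_per_epoch = math.ceil(train_samples / train_batch_size)
val_steps_per_epoch = math.ceil(val_samples / val_batch_size)

print("train steps per epoch:", train_steps_per_epoch)
print("validation steps per pass:", val_steps_per_epoch)

# The card lists num_epochs as 2-3, so both cases are shown.
for num_epochs in (2, 3):
    print(num_epochs, "epochs:", num_epochs * train_steps_per_epoch, "optimizer steps")
```

If the dataset is multi-label, the number of rows may be smaller than the summed class counts, and the step counts would shrink accordingly.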
|
| 62 |
+
|
| 63 |
+
# Training results
|
| 64 |
+
| Training Loss | Epoch | Validation Loss |
|
| 65 |
+
|:-------------:|:-----:|:---------------:|
|
| 66 |
+
| 0.2660 | 1.0 | 0.2031 |
|
| 67 |
+
| 0.1891 | 2.0 | 0.1872 |
|
| 68 |
+
|
| 69 |
+
F1 Score: 0.6355
|
| 70 |
+
AUC Score: 0.7642
|
| 71 |
+
|
| 72 |
+
## Classification Report:
|
| 73 |
+
Borderline:
|
| 74 |
+
Precision: 0.7606
|
| 75 |
+
Recall: 0.4525
|
| 76 |
+
F1-score: 0.5674
|
| 77 |
+
|
| 78 |
+
Anxiety:
|
| 79 |
+
Precision: 0.7063
|
| 80 |
+
Recall: 0.5459
|
| 81 |
+
F1-score: 0.6158
|
| 82 |
+
|
| 83 |
+
Depression:
|
| 84 |
+
Precision: 0.7286
|
| 85 |
+
Recall: 0.4626
|
| 86 |
+
F1-score: 0.5659
|
| 87 |
+
|
| 88 |
+
Bipolar:
|
| 89 |
+
Precision: 0.7997
|
| 90 |
+
Recall: 0.4487
|
| 91 |
+
F1-score: 0.5748
|
| 92 |
+
|
| 93 |
+
OCD:
|
| 94 |
+
Precision: 0.8222
|
| 95 |
+
Recall: 0.5957
|
| 96 |
+
F1-score: 0.6908
|
| 97 |
+
|
| 98 |
+
ADHD:
|
| 99 |
+
Precision: 0.8856
|
| 100 |
+
Recall: 0.5711
|
| 101 |
+
F1-score: 0.6944
|
| 102 |
+
|
| 103 |
+
Schizophrenia:
|
| 104 |
+
Precision: 0.7540
|
| 105 |
+
Recall: 0.6153
|
| 106 |
+
F1-score: 0.6777
|
| 107 |
+
|
| 108 |
+
Asperger:
|
| 109 |
+
Precision: 0.6743
|
| 110 |
+
Recall: 0.6335
|
| 111 |
+
F1-score: 0.6533
|
| 112 |
+
|
| 113 |
+
PTSD:
|
| 114 |
+
Precision: 0.7724
|
| 115 |
+
Recall: 0.6235
|
| 116 |
+
F1-score: 0.6900
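
The per-class F1-scores above can be re-derived from the listed precision and recall, since F1 is their harmonic mean. The sketch below does this; the `report` dictionary and the `f1_from` helper are illustrative names, not part of the original evaluation code.

```python
# (precision, recall, reported F1) per class, copied from the report above.
report = {
    "Borderline": (0.7606, 0.4525, 0.5674),
    "Anxiety": (0.7063, 0.5459, 0.6158),
    "Depression": (0.7286, 0.4626, 0.5659),
    "Bipolar": (0.7997, 0.4487, 0.5748),
    "OCD": (0.8222, 0.5957, 0.6908),
    "ADHD": (0.8856, 0.5711, 0.6944),
    "Schizophrenia": (0.7540, 0.6153, 0.6777),
    "Asperger": (0.6743, 0.6335, 0.6533),
    "PTSD": (0.7724, 0.6235, 0.6900),
}


def f1_from(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


recomputed = {name: f1_from(p, r) for name, (p, r, _) in report.items()}

# Largest absolute difference between reported and recomputed F1.
max_gap = max(abs(recomputed[name] - report[name][2]) for name in report)

for name, value in recomputed.items():
    print(f"{name:<14} reported={report[name][2]:.4f} recomputed={value:.4f}")
print(f"largest gap: {max_gap:.5f}")
```

Small differences in the last digit are expected, because the listed precision and recall are themselves rounded to four decimals.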
|
| 117 |
+
|