## Fine-Tuning Approach

To enhance memory efficiency and reduce computational demands, we applied LoRA with 16-bit precision to the q_proj and v_proj projections. LoRA was configured with rank R = 8, Alpha = 16, and dropout = 0.1, allowing the model to adapt effectively while preserving output quality. For optimization, we used AdamW with β1 = 0.9 and β2 = 0.999, balancing fast convergence with training stability. This tuning ensures robust performance in generating accurate, contextually appropriate clinical text in Portuguese.
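
As a reference, a minimal sketch of this setup with the Hugging Face `peft` library might look as follows; the base checkpoint id and the learning rate are not stated in this card and are assumptions, while the LoRA and optimizer settings mirror the configuration above.

```python
# Minimal sketch of the LoRA setup described above (transformers + peft).
# The base checkpoint id and learning rate are assumptions, not from this card.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",  # assumed base checkpoint
    torch_dtype=torch.float16,             # 16-bit precision
)

lora_config = LoraConfig(
    r=8,                                  # rank R = 8
    lora_alpha=16,                        # Alpha = 16
    lora_dropout=0.1,                     # Dropout = 0.1
    target_modules=["q_proj", "v_proj"],  # adapt only these projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-4,             # assumed value; not specified in this card
    betas=(0.9, 0.999),  # β1 = 0.9, β2 = 0.999
)
```

Restricting the adapters to the q_proj and v_proj projections keeps the number of trainable parameters small relative to the 7B base model, which is where most of the memory savings come from.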
## Data
The fine-tuning of Clinical-BR-Mistral-7B-v0.2 used 2.4 GB of text drawn from three clinical datasets. The SemClinBr project provided diverse clinical narratives from Brazilian hospitals, while the BRATECA dataset contributed admission notes from various departments across 10 hospitals. Data from Lopes et al. (2019) added neurology-focused texts from European Portuguese medical journals. Together, these datasets improved the model's ability to generate accurate clinical notes in Portuguese.
## Citation