---
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- dmis-lab/biobert-base-cased-v1.1
pipeline_tag: text-classification
tags:
- medical
---
# BioBERT Research Insights
This model is [BioBERT](https://huggingface.co/dmis-lab/biobert-base-cased-v1.1) fine-tuned on the PubMed 20k RCT dataset. It classifies each sentence of a biomedical abstract into one of five categories:
- BACKGROUND
- OBJECTIVE
- METHODS
- RESULTS
- CONCLUSIONS
## Usage
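```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="SubhaL/biobert-research-insights")

# Classify a single sentence taken from an abstract.
example = "The trial demonstrated significant improvement in patient survival rates."
result = classifier(example)

# result is a list with one dict holding the predicted label and its score.
print(result)
```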
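The pipeline returns a list of dicts with `label` and `score` keys. Whether `label` is a class name or a generic string such as `LABEL_3` depends on the `id2label` mapping stored in the model's config, which this card does not state. The helper below is a minimal sketch for the generic case, assuming the class ids listed under Evaluation Metrics; `ID2LABEL`, `readable_label`, and the sample output are hypothetical names and values used only for illustration.

```python
# Class ids as listed in the Evaluation Metrics section of this card.
ID2LABEL = {
    0: "BACKGROUND",
    1: "OBJECTIVE",
    2: "METHODS",
    3: "RESULTS",
    4: "CONCLUSIONS",
}


def readable_label(raw_label: str) -> str:
    """Map a generic 'LABEL_<id>' string to a class name; pass anything else through."""
    prefix = "LABEL_"
    suffix = raw_label[len(prefix):]
    if raw_label.startswith(prefix) and suffix.isdigit():
        # Unknown ids fall back to the raw label instead of raising.
        return ID2LABEL.get(int(suffix), raw_label)
    return raw_label


# Hypothetical output, shaped like what a text-classification pipeline returns.
# The score is made up for illustration and is not a real model prediction.
sample_output = [{"label": "LABEL_3", "score": 0.97}]
for prediction in sample_output:
    print(readable_label(prediction["label"]), prediction["score"])
```

If the model's config already defines readable class names, the helper returns them unchanged, so it is safe to apply in either case.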
## Evaluation Metrics
The model was evaluated on the PubMed 20k RCT test set. Class ids map to the five sentence classes as follows:
- 0: BACKGROUND
- 1: OBJECTIVE
- 2: METHODS
- 3: RESULTS
- 4: CONCLUSIONS
| Metric | Score |
|----------------------|--------|
| Accuracy | 86.6% |
| Precision (weighted) | 86.7% |
| Recall (weighted) | 86.6% |
| F1-score (weighted) | 86.6% |
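The "(weighted)" rows presumably use support-weighted averaging, where each class's score is weighted by its share of the true labels. Under that assumption, weighted recall always equals accuracy, which is consistent with both being 86.6% in the table. The sketch below shows the calculation in plain Python; the function name `weighted_metrics` and the toy labels are illustrative and are not the evaluation code or data behind the reported scores.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Compute accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # number of true examples per class
    predicted = Counter(y_pred)    # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        p = true_pos[label] / predicted[label] if predicted[label] else 0.0
        r = true_pos[label] / support[label] if support[label] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support[label] / n  # class share of the true labels
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(true_pos.values()) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels using the class ids 0-4 from this card (not real evaluation data).
y_true = [0, 1, 2, 2, 3, 3, 4, 2]
y_pred = [1, 1, 2, 2, 3, 3, 4, 3]
metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

Weighted precision and F1 do not share this identity with accuracy, which is why they can differ slightly from it, as the 86.7% precision in the table does.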
### Class-wise performance highlights
- **METHODS** and **RESULTS** reach precision and recall of roughly 93-94%, so the model identifies these sections reliably.
- **BACKGROUND** and **OBJECTIVE** score lower, which suggests these two categories are harder to tell apart, likely because their language overlaps.