|
---
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- dmis-lab/biobert-base-cased-v1.1
pipeline_tag: text-classification
tags:
- medical
---
|
|
|
# BioBERT Research Insights |
|
|
|
This model is a fine-tuned version of [BioBERT](https://huggingface.co/dmis-lab/biobert-base-cased-v1.1), trained on the PubMed 20k RCT dataset. It classifies sentences from biomedical abstracts into one of five categories:
|
|
|
- BACKGROUND
- OBJECTIVE
- METHODS
- RESULTS
- CONCLUSIONS
|
|
|
## Usage |
|
|
|
```python
from transformers import pipeline

# Load the fine-tuned sentence classifier from the Hugging Face Hub
classifier = pipeline("text-classification", model="SubhaL/biobert-research-insights")

example = "The trial demonstrated significant improvement in patient survival rates."
result = classifier(example)

print(result)
```
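Because the model operates on single sentences, a whole abstract can be classified by splitting it into sentences first. Below is a minimal sketch: the period-based splitting is a naive stand-in (a proper sentence tokenizer would be preferable in practice), and the abstract text is illustrative only.

```python
# Naive illustration: split an abstract on ". " and classify each
# sentence in one batch call to the pipeline defined above.
abstract = (
    "We aimed to assess the effect of the intervention on survival. "
    "Patients were randomly assigned to treatment or placebo. "
    "The trial demonstrated significant improvement in patient survival rates."
)
sentences = [s.strip() for s in abstract.split(". ") if s.strip()]

for sentence, prediction in zip(sentences, classifier(sentences)):
    print(f'{prediction["label"]:>12}  {sentence}')
```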
|
|
|
## Evaluation Metrics |
|
|
|
The model was evaluated on the test split of the PubMed 20k RCT dataset, which uses the same five sentence classes:
|
|
|
- 0: BACKGROUND
- 1: OBJECTIVE
- 2: METHODS
- 3: RESULTS
- 4: CONCLUSIONS
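Depending on how the label names were stored in the model config, the pipeline may return generic IDs such as `LABEL_3` rather than section names. If so, they can be mapped back using the index order above; a minimal sketch (the `LABEL_n` naming is an assumption, not confirmed by this card):

```python
# Assumption: the pipeline returns generic labels "LABEL_0"..."LABEL_4"
# in the index order listed above; adjust if the config names differ.
ID2LABEL = {
    "LABEL_0": "BACKGROUND",
    "LABEL_1": "OBJECTIVE",
    "LABEL_2": "METHODS",
    "LABEL_3": "RESULTS",
    "LABEL_4": "CONCLUSIONS",
}

result = classifier("The trial demonstrated significant improvement in patient survival rates.")
print(ID2LABEL.get(result[0]["label"], result[0]["label"]), result[0]["score"])
```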
|
|
|
| Metric               | Score |
|----------------------|-------|
| Accuracy             | 86.6% |
| Precision (weighted) | 86.7% |
| Recall (weighted)    | 86.6% |
| F1-score (weighted)  | 86.6% |
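These weighted scores follow the standard definitions and can be recomputed with scikit-learn. A minimal sketch; `y_true` and `y_pred` are placeholders for the actual test-set labels and model predictions (the values shown are illustrative only):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholders: integer class IDs (0-4) for gold labels and predictions.
y_true = [0, 1, 2, 2, 3, 4]
y_pred = [0, 1, 2, 3, 3, 4]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
print(f"accuracy={accuracy:.3f}  precision={precision:.3f}  "
      f"recall={recall:.3f}  f1={f1:.3f}")
```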
|
|
|
### Class-wise performance highlights
|
|
|
- **METHODS** and **RESULTS** achieve high precision and recall (~93-94%), indicating the model reliably identifies these sections.
- **BACKGROUND** and **OBJECTIVE** score lower, suggesting these categories are harder to distinguish, likely because their language overlaps.