---
language: en
license: apache-2.0
tags:
- education
- coverage-assessment
- bert
- regression
- domain-agnostic
- educational-ai
datasets:
- synthetic-educational-conversations
metrics:
- pearson_correlation
- mae
- r_squared
model-index:
- name: BERT Coverage Assessment
results:
- task:
type: regression
name: Educational Coverage Assessment
metrics:
- type: pearson_correlation
value: 0.865
name: Pearson Correlation
- type: r_squared
value: 0.749
name: R-squared
- type: mae
value: 0.133
name: Mean Absolute Error
---
# BERT Coverage Assessment Model
🎯 **A domain-agnostic BERT model for assessing educational conversation coverage**
## Model Description
This model fine-tunes BERT for educational coverage assessment, predicting how well student conversations address stated learning objectives. It achieves a **0.865 Pearson correlation** with reference coverage scores, making it suitable for real-time educational applications.
## Key Features
- 🌍 **Domain-agnostic**: Works across subjects without retraining
- πŸ“Š **Continuous scoring**: Outputs 0.0-1.0 coverage scores
- ⚑ **Real-time capable**: Fast inference for live systems
- πŸŽ“ **Research-validated**: Exceeds academic benchmarks
## Performance
| Metric | Value |
|--------|-------|
| Pearson Correlation | 0.8650 |
| R-squared | 0.7490 |
| Mean Absolute Error | 0.1330 |
| RMSE | 0.165 |
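For reference, these metrics can be recomputed with standard tooling. A minimal sketch, assuming hypothetical arrays `y_true` (reference scores) and `y_pred` (model predictions) from your own held-out set:
```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import mean_absolute_error, r2_score

# Hypothetical held-out reference scores and model predictions.
y_true = np.array([0.10, 0.45, 0.80, 0.95])
y_pred = np.array([0.15, 0.40, 0.75, 0.90])

r, _ = pearsonr(y_true, y_pred)                          # Pearson correlation
mae = mean_absolute_error(y_true, y_pred)                # mean absolute error
r2 = r2_score(y_true, y_pred)                            # R-squared
rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))   # root mean squared error
print(f"r={r:.3f}  MAE={mae:.3f}  R2={r2:.3f}  RMSE={rmse:.3f}")
```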
## Usage
```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BERTCoverageRegressor(nn.Module):
    """BERT encoder with a dropout + linear head for coverage regression."""

    def __init__(self, model_name='bert-base-uncased', dropout_rate=0.3):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        self.dropout = nn.Dropout(dropout_rate)
        self.regressor = nn.Linear(self.bert.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled_output = outputs.pooler_output  # pooled [CLS] representation
        output = self.dropout(pooled_output)
        return self.regressor(output)

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained('KingTechnician/bert-osmosis-coverage')
model = BERTCoverageRegressor()

# Load the fine-tuned weights
model_path = "pytorch_model.bin"  # download from this repo
model.load_state_dict(torch.load(model_path, map_location='cpu'))
model.eval()

# Make a prediction
def predict_coverage(objective, conversation, max_length=512):
    encoding = tokenizer(
        objective,
        conversation,
        truncation=True,
        padding='max_length',
        max_length=max_length,
        return_tensors='pt'
    )
    with torch.no_grad():
        output = model(encoding['input_ids'], encoding['attention_mask'])
        score = torch.clamp(output.squeeze(), 0.0, 1.0).item()
    return score

# Example usage
objective = "Understand the process of photosynthesis"
conversation = "Student explains light reactions and Calvin cycle with examples..."
coverage_score = predict_coverage(objective, conversation)
print(f"Coverage Score: {coverage_score:.3f}")
```
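For live systems that score many conversations at once, batching amortizes the encoder cost. A minimal sketch reusing the model and tokenizer above; `predict_coverage_batch` is a hypothetical helper, not part of the repository:
```python
def predict_coverage_batch(objective, conversations, max_length=512):
    """Score several conversations against one objective in a single forward pass."""
    encoding = tokenizer(
        [objective] * len(conversations),  # pair the objective with each conversation
        conversations,
        truncation=True,
        padding=True,  # pad to the longest sequence in the batch
        max_length=max_length,
        return_tensors='pt'
    )
    with torch.no_grad():
        outputs = model(encoding['input_ids'], encoding['attention_mask'])
    return torch.clamp(outputs.squeeze(-1), 0.0, 1.0).tolist()

scores = predict_coverage_batch(
    "Understand the process of photosynthesis",
    ["Student explains light reactions...", "Student lists plant organelles..."],
)
```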
## Input Format
The model expects input in the format:
```
[CLS] learning_objective [SEP] student_conversation [SEP]
```
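The paired call `tokenizer(objective, conversation)` in the usage example produces this layout automatically; a quick way to inspect it, assuming the tokenizer loaded above:
```python
enc = tokenizer("Understand photosynthesis", "Student explains light reactions.")
# Tokens follow the [CLS] objective [SEP] conversation [SEP] layout.
print(tokenizer.convert_ids_to_tokens(enc['input_ids']))
# token_type_ids separate the segments: 0 = objective, 1 = conversation.
print(enc['token_type_ids'])
```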
## Output
Returns a continuous score between 0.0 and 1.0, which can be bucketed as in the sketch below:
- **0.0-0.2**: Minimal coverage
- **0.2-0.4**: Low coverage
- **0.4-0.6**: Moderate coverage
- **0.6-0.8**: High coverage
- **0.8-1.0**: Complete coverage
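A minimal sketch for mapping raw scores to these bands; `coverage_band` is a hypothetical helper, not shipped with the model:
```python
def coverage_band(score: float) -> str:
    """Map a clamped 0.0-1.0 coverage score to its descriptive band."""
    bands = [(0.2, "Minimal"), (0.4, "Low"), (0.6, "Moderate"), (0.8, "High")]
    for upper, label in bands:
        if score <= upper:
            return f"{label} coverage"
    return "Complete coverage"  # anything above 0.8

print(coverage_band(0.73))  # High coverage
```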
## Training Data
Trained on synthetic educational conversations across multiple domains:
- Computer Science (algorithms, data structures)
- Statistics (hypothesis testing, regression)
- Multi-domain conversations
## Research Background
This model implements the methodology from research on domain-agnostic educational assessment, achieving significant improvements over traditional similarity-based approaches:
- **269% improvement** over baseline similarity features
- **Domain transfer capability** without retraining
- **Real-time processing** under 100ms per assessment (hardware-dependent; see the timing sketch below)
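Latency depends on hardware and sequence length; a minimal sketch for measuring it on your own setup, reusing the `predict_coverage` helper from the usage example:
```python
import time

# Warm-up run so one-time initialization does not skew the measurement.
predict_coverage("Understand photosynthesis", "Student explains light reactions.")

n_runs = 20
start = time.perf_counter()
for _ in range(n_runs):
    predict_coverage("Understand photosynthesis", "Student explains light reactions.")
elapsed_ms = (time.perf_counter() - start) / n_runs * 1000
print(f"Average latency: {elapsed_ms:.1f} ms per assessment")
```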
## Limitations
- Trained primarily on synthetic data (validation on real conversations recommended)
- Optimized for English language conversations
- Performance may vary for highly specialized technical domains
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{bert-coverage-assessment,
  title={Domain-Agnostic Coverage Assessment Through BERT Fine-tuning},
  author={Your Name},
  year={2025},
  url={https://huggingface.co/KingTechnician/bert-osmosis-coverage}
}
```
## Contact
For questions or collaborations, please open an issue in the model repository.
---
**Model Type**: Educational AI | **Task**: Coverage Assessment | **Performance**: r=0.865