---
library_name: transformers
tags: [text-classification, bert, bullying-detection, hate-speech, social-good]
---
# Model Card for Davephoenix/bert-bullying-detector
A BERT-based binary classifier that detects whether a given English text contains bullying content. It is fine-tuned for use in moderation tools, education platforms, and social media analysis.
## Model Details
### Model Description
This model is based on `bert-base-uncased` and fine-tuned for binary text classification. The goal is to distinguish between bullying and non-bullying text, providing a tool to support online safety and moderation.
- **Developed by:** Davephoenix
- **Funded by [optional]:** Independent project
- **Shared by [optional]:** Davephoenix
- **Model type:** Text classification (binary)
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model [optional]:** bert-base-uncased
### Model Sources [optional]
- **Repository:** [https://huggingface.co/Davephoenix/bert-bullying-detector](https://huggingface.co/Davephoenix/bert-bullying-detector)
- **Demo [optional]:** API in progress
## Uses
### Direct Use
- Classifies short- to medium-length English text as "Bullying" or "Not Bullying".
- Can be integrated into moderation tools, educational apps, or awareness platforms (see the pipeline sketch below).
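For quick experiments, the model can also be loaded through the `transformers` pipeline API. A minimal sketch; note that depending on the checkpoint's `id2label` config, the pipeline may report generic `LABEL_0`/`LABEL_1` names, corresponding to "Not Bullying"/"Bullying":

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as a text-classification pipeline.
classifier = pipeline(
    "text-classification",
    model="Davephoenix/bert-bullying-detector",
)

# Returns a list of {label, score} dicts; the score is the softmax
# confidence of the predicted class.
print(classifier("You are so dumb and nobody likes you."))
```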
### Downstream Use [optional]
- As a building block in broader moderation or digital well-being systems.
- Further fine-tuning possible for specific platforms/domains.
### Out-of-Scope Use
- Multilingual or non-English bullying detection.
- Misuse in legal or disciplinary decision-making without human oversight.
- Inference on sarcasm, coded language, or highly contextual text may be unreliable.
## Bias, Risks, and Limitations
The model may exhibit limitations in:
- Cultural or contextual understanding of bullying.
- Identifying subtle or sarcastic forms of harassment.
- False positives in emotionally intense or confrontational but non-abusive language.
### Recommendations
Users (both direct and downstream) should:
- Use the model alongside human review, especially in sensitive domains (a thresholding sketch follows this list).
- Avoid deploying in high-stakes environments without thorough testing.
- Consider domain-specific fine-tuning if used outside general English online text.
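One simple way to keep a human in the loop is to act automatically only on high-confidence predictions. A sketch building on the `classify_text` helper from the quick-start section below; the 0.90 cutoff is an illustrative assumption, to be tuned against your own validation data:

```python
# Assumes classify_text() from the quick-start section below.
# The 0.90 cutoff is an illustrative assumption, not a tested value.
REVIEW_THRESHOLD = 0.90

def moderate(text):
    pred, confidence = classify_text(text)
    if confidence < REVIEW_THRESHOLD:
        # Low confidence: defer to a human moderator.
        return "needs_human_review"
    return "Bullying" if pred == 1 else "Not Bullying"
```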
## How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

model_name = "Davephoenix/bert-bullying-detector"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode

def classify_text(text):
    # Tokenize with the same truncation/padding behavior used in training.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Convert logits to probabilities and take the most likely class.
    probs = F.softmax(outputs.logits, dim=1)
    pred = torch.argmax(probs, dim=1).item()
    return pred, probs[0][pred].item()

label_map = {0: "Not Bullying", 1: "Bullying"}

text = "You are so dumb and nobody likes you."
pred, confidence = classify_text(text)
print(f"Prediction: {label_map[pred]} (Confidence: {confidence:.2f})")
```
## Training Details
### Training Data
* Approximately 20,000 English text samples labeled as "bullying" or "not bullying"
* Balanced dataset curated from public moderation datasets and synthetic augmentation
### Training Procedure
#### Preprocessing [optional]
* Tokenized using the `bert-base-uncased` tokenizer
* Truncation and padding to a `max_length` of 128 tokens (see the sketch below)
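The tokenization step would look roughly like the following. This is a sketch: the `"text"` column name and the `datasets.map` usage are assumptions, since the training notebook is not included here.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def preprocess(batch):
    # Truncate/pad every example to the 128-token length used in training.
    return tokenizer(
        batch["text"],  # assumed column name
        truncation=True,
        padding="max_length",
        max_length=128,
    )

# With a `datasets.Dataset`, this would typically be applied as:
# tokenized = dataset.map(preprocess, batched=True)
```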
#### Training Hyperparameters
* **Training regime:** fp16 mixed precision
* **Epochs:** 3
* **Batch size:** 32
* **Optimizer:** AdamW with linear warmup
* **Learning rate:** 2e-5
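A `TrainingArguments` setup reproducing these hyperparameters might look like the sketch below. This assumes the Hugging Face `Trainer` API (whose default optimizer is AdamW); the warmup fraction and the dataset variables are placeholder assumptions:

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

training_args = TrainingArguments(
    output_dir="bert-bullying-detector",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    lr_scheduler_type="linear",  # linear decay after warmup
    warmup_ratio=0.1,            # warmup fraction is an assumption
    fp16=True,                   # mixed-precision training
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,  # placeholder: tokenized datasets
    eval_dataset=tokenized_eval,
)
trainer.train()
```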
#### Speeds, Sizes, Times [optional]
* **Training time:** ~5 hours on a Kaggle P100 GPU
* **Model size:** ~420 MB
* **Final checkpoint:** `checkpoint-34371`
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
* 10% hold-out split of the full dataset
* Distribution similar to the training data
#### Factors
* Sentence structure
* Presence of explicit abusive terms
* Subtlety of intent
#### Metrics
* Accuracy, F1 score, Loss
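Accuracy and F1 can be computed from the model's predictions with scikit-learn; below is a sketch of a `compute_metrics` callback as the `Trainer` would call it (the exact evaluation code used is an assumption):

```python
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair supplied by the Trainer.
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds),  # binary F1 on the "Bullying" class
    }
```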
### Results
* **Accuracy:** 95.6%
* **F1 Score:** 95.6%
* **Validation Loss:** 0.151
#### Summary
The model performs well for binary classification of bullying vs. non-bullying on general English text. Performance may degrade on ambiguous or culturally nuanced examples.
## Model Examination [optional]
[More Information Needed]
## Environmental Impact
Carbon emissions estimated via [ML CO2 calculator](https://mlco2.github.io/impact):
* **Hardware Type:** NVIDIA P100
* **Hours used:** ~5
* **Cloud Provider:** Kaggle
* **Compute Region:** North America
* **Carbon Emitted:** < 2 kg CO₂
## Technical Specifications [optional]
### Model Architecture and Objective
* **Architecture:** BERT base uncased (12-layer, 768-hidden, 12-heads, 110M parameters)
* **Objective:** Binary sequence classification with cross-entropy loss
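Concretely, `AutoModelForSequenceClassification` adds a linear classification head on top of the pooled `[CLS]` representation and computes cross-entropy internally when labels are passed. The architecture numbers above can be verified from the published config:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Davephoenix/bert-bullying-detector")
print(config.num_hidden_layers)    # 12 layers
print(config.hidden_size)          # 768 hidden size
print(config.num_attention_heads)  # 12 attention heads
print(config.num_labels)           # 2 -> binary classification head
```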
### Compute Infrastructure
#### Hardware
* Kaggle P100 GPU (free tier)
#### Software
* `transformers` 4.39.3
* `datasets` 2.19.1
* Python 3.11
* PyTorch 2.x
## Citation [optional]
**BibTeX:**
```bibtex
@misc{bert-bullying-detector,
  title        = {BERT Bullying Detector},
  author       = {Davephoenix},
  year         = {2025},
  note         = {Fine-tuned BERT for binary text classification (bullying detection)},
  howpublished = {\url{https://huggingface.co/Davephoenix/bert-bullying-detector}}
}
```
**APA:**
Davephoenix. (2025). *BERT Bullying Detector* [Computer software]. Hugging Face. [https://huggingface.co/Davephoenix/bert-bullying-detector](https://huggingface.co/Davephoenix/bert-bullying-detector)
## Glossary [optional]
* **BERT:** Bidirectional Encoder Representations from Transformers
* **FP16:** 16-bit floating point precision
* **F1 Score:** Harmonic mean of precision and recall
## More Information [optional]
To request the training notebook or API wrapper, please contact the model author.
## Model Card Authors [optional]
* Davephoenix
## Model Card Contact
* [https://huggingface.co/Davephoenix](https://huggingface.co/Davephoenix)