This model is a fine-tuned Model2Vec classifier based on minishlab/potion-multilingual-128M. It is trained for the prompt-self-harm-binary task in the enguard/multi-lingual-prompt-moderation dataset (https://huggingface.co/datasets/enguard/multi-lingual-prompt-moderation).
pip install "model2vec[inference]"
from model2vec.inference import StaticModelPipeline
model = StaticModelPipeline.from_pretrained(
"enguard/medium-guard-128m-xx-prompt-self-harm-binary-moderation"
)
# The model takes single texts as input; pass them to predict as a list of strings:
text = "Example sentence"
model.predict([text])
model.predict_proba([text])
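If a stricter or looser cutoff than the default prediction is needed, the probabilities can be thresholded by hand. The following is a minimal sketch, not part of the Model2Vec API: `flag_texts` is a hypothetical helper, and it assumes the caller already knows the column order of the probability matrix (the label names FAIL and PASS are taken from the confusion matrix in this card; the order should be checked against the fitted classifier).

```python
from typing import List, Sequence


def flag_texts(
    probabilities: Sequence[Sequence[float]],
    classes: Sequence[str],
    threshold: float = 0.5,
    positive_label: str = "FAIL",
) -> List[bool]:
    """Return True for every row whose positive-class probability meets the threshold.

    `probabilities` has one row per text and one column per class.
    `classes` gives the label of each column (assumed to be known by the caller).
    """
    positive_index = list(classes).index(positive_label)
    return [row[positive_index] >= threshold for row in probabilities]


# Hand-written probabilities standing in for the output of predict_proba.
example_probabilities = [[0.8, 0.2], [0.3, 0.7]]
flags = flag_texts(example_probabilities, classes=["FAIL", "PASS"])
print(flags)
```

Raising `threshold` trades recall on the FAIL class for precision, and lowering it does the opposite.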
Below is a quick overview of the model variant and core metrics.
| Field | Value |
|---|---|
| Classifies | prompt-self-harm-binary |
| Base Model | minishlab/potion-multilingual-128M |
| Precision (FAIL class) | 0.9375 |
| Recall (FAIL class) | 0.8571 |
| F1 (FAIL class) | 0.8955 |
| True \ Predicted | FAIL | PASS |
|---|---|---|
| FAIL | 30 | 5 |
| PASS | 2 | 33 |
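The headline metrics can be recomputed from the confusion matrix above, treating FAIL as the positive class. This short sketch only uses the four counts from the table; the variable names are illustrative.

```python
# Counts copied from the confusion matrix (rows = true label, columns = predicted label).
true_positive = 30   # true FAIL, predicted FAIL
false_negative = 5   # true FAIL, predicted PASS
false_positive = 2   # true PASS, predicted FAIL
true_negative = 33   # true PASS, predicted PASS

precision = true_positive / (true_positive + false_positive)
recall = true_positive / (true_positive + false_negative)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (true_positive + true_negative) / (
    true_positive + false_negative + false_positive + true_negative
)

print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f} accuracy={accuracy:.4f}")
```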
{
"FAIL": {
"precision": 0.9375,
"recall": 0.8571428571428571,
"f1-score": 0.8955223880597015,
"support": 35.0
},
"PASS": {
"precision": 0.868421052631579,
"recall": 0.9428571428571428,
"f1-score": 0.9041095890410958,
"support": 35.0
},
"accuracy": 0.9,
"macro avg": {
"precision": 0.9029605263157895,
"recall": 0.8999999999999999,
"f1-score": 0.8998159885503987,
"support": 70.0
},
"weighted avg": {
"precision": 0.9029605263157894,
"recall": 0.9,
"f1-score": 0.8998159885503987,
"support": 70.0
}
}
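The "macro avg" and "weighted avg" entries follow from the per-class values in the report above. As a sketch with hypothetical helper names: the macro average is the plain mean over classes, and the weighted average weights each class by its support. Because both classes have a support of 35 here, the two averages coincide.

```python
# Per-class values copied from the report above.
report = {
    "FAIL": {
        "precision": 0.9375,
        "recall": 0.8571428571428571,
        "f1-score": 0.8955223880597015,
        "support": 35,
    },
    "PASS": {
        "precision": 0.868421052631579,
        "recall": 0.9428571428571428,
        "f1-score": 0.9041095890410958,
        "support": 35,
    },
}


def macro_average(per_class: dict, metric: str) -> float:
    """Unweighted mean of a metric over all classes."""
    return sum(values[metric] for values in per_class.values()) / len(per_class)


def weighted_average(per_class: dict, metric: str) -> float:
    """Mean of a metric over all classes, weighted by class support."""
    total_support = sum(values["support"] for values in per_class.values())
    weighted_sum = sum(values[metric] * values["support"] for values in per_class.values())
    return weighted_sum / total_support


for metric in ("precision", "recall", "f1-score"):
    print(metric, macro_average(report, metric), weighted_average(report, metric))
```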
| Text | True Label | Predicted Label |
|---|---|---|
| *Mojangsta rap\n\nFTFY | PASS | PASS |
| RASHEEEED WALLACE^WALLACE^WALLACE^WALLACE | PASS | PASS |
| I've thought of that before a few times. Not like legitimate "I should kill myself" thoughts, just like, pulling into work, "What if I floored it and drove through the front doors?" | FAIL | FAIL |
| There is no average. I get all men and women under the sun. Old, young, hot, not so hot, short, tall, wheelchair, celebrities, women.... | PASS | PASS |
| >Sometimes, when drunk I would make larger cuts with the help of surgical sissors.\n\naaaand that's enough internet for today. | FAIL | FAIL |
| Dataset Size | Time (seconds) | Predictions/Second |
|---|---|---|
| 1 | 0.0003 | 3521.67 |
| 70 | 0.0791 | 885.45 |
| 70 | 0.011 | 6345.8 |
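The Predictions/Second column is the dataset size divided by the elapsed time. The sketch below shows one way such a measurement could be taken; it is not necessarily how the figures in this card were produced. `measure_throughput` is a hypothetical helper, and `stand_in_predict` is a dummy function used here in place of `model.predict` so the example runs without downloading the model.

```python
import time
from typing import Callable, List, Sequence, Tuple


def measure_throughput(
    predict: Callable[[Sequence[str]], object],
    texts: Sequence[str],
    repeats: int = 3,
) -> Tuple[float, float]:
    """Return (best elapsed seconds, predictions per second) over several runs."""
    best_elapsed = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict(texts)
        best_elapsed = min(best_elapsed, time.perf_counter() - start)
    # Guard against a zero reading on timers with coarse resolution.
    rate = len(texts) / best_elapsed if best_elapsed > 0 else float("inf")
    return best_elapsed, rate


def stand_in_predict(texts: Sequence[str]) -> List[str]:
    """Dummy classifier used only for illustration; always answers PASS."""
    return ["PASS" for _ in texts]


elapsed, rate = measure_throughput(stand_in_predict, ["Example sentence"] * 70)
print(f"{elapsed:.6f} s, {rate:.1f} predictions/s")
```

To time the real model, `model.predict` can be passed in place of `stand_in_predict`.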
Below is a general overview of the best-performing models for each dataset variant.
If you use this model, please cite Model2Vec:
@software{minishlab2024model2vec,
author = {Stephan Tulkens and {van Dongen}, Thomas},
title = {Model2Vec: Fast State-of-the-Art Static Embeddings},
year = {2024},
publisher = {Zenodo},
doi = {10.5281/zenodo.17270888},
url = {https://github.com/MinishLab/model2vec},
license = {MIT}
}