SecuriSense: Phishing Email Detection Model
Model Description
SecuriSense is a BERT-base model fine-tuned to detect phishing emails, with a self-reported accuracy of 99.54%. It classifies email text as either legitimate or phishing.
Developed by: Alfred Dads D. Nodado, Joshua D. Famor, Hanna Keziah T. Sato
Institution: Mapua Malayan College Mindanao
Base Model: bert-base-uncased
Language: English
Intended Use
This model is designed to:
- Classify email text as legitimate (LABEL_0) or phishing (LABEL_1)
- Assist email security systems
- Support cybersecurity awareness education
- Integrate into email filtering applications
Primary Use: Phishing detection in email security systems
Out-of-scope: Non-email text classification, multilingual detection
Training Data
The model was trained on a combined dataset of:
- Phishing Email Dataset: 18,650 samples from Kaggle
- University of Twente Validation Dataset: 1,000+ samples
- Total: 19,650+ labeled emails
The dataset includes both phishing attempts and legitimate emails with various characteristics:
- Urgency indicators
- Authority claims
- Financial requests
- Emotional manipulation patterns
Performance
| Metric | Score |
|---|---|
| Accuracy | 99.54% |
| Precision | 99.73% |
| Recall | 99.40% |
| F1 Score | 99.56% |
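As a quick consistency check, F1 is the harmonic mean of precision and recall, so the F1 row can be recomputed from the two rows above it. The sketch below uses only the values reported in the table; the variable names are illustrative.

```python
# Recompute F1 from the reported precision and recall (values from the table above).
precision = 0.9973
recall = 0.9940

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1: {f1 * 100:.2f}%")  # F1: 99.56%
```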
How to Use
Quick Start with Pipeline
from transformers import pipeline
# Load the model
classifier = pipeline(
    "text-classification",
    model="Auguzcht/securisense-phishing-detection"
)
# Classify an email
email_text = "URGENT: Your account will be suspended! Click here to verify."
result = classifier(email_text)
print(result)
# Output: [{'label': 'Phishing', 'score': 0.9987}]
Advanced Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load model and tokenizer
model_name = "Auguzcht/securisense-phishing-detection"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Prepare input
text = "Thank you for your purchase. Order #12345 will ship soon."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(predictions).item()
confidence = predictions[0][predicted_class].item()
# Map to label
label = model.config.id2label[predicted_class]
print(f"Label: {label}, Confidence: {confidence:.4f}")
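Because the tokenizer call above truncates input at 512 tokens, anything past that point in a long email is ignored. One possible workaround, sketched here as an assumption rather than the authors' method, is to split the text into overlapping word windows and classify each window separately. The function name `split_into_windows` and the window sizes are hypothetical, and word counts only approximate BERT token counts.

```python
from typing import List


def split_into_windows(text: str, window: int = 200, overlap: int = 50) -> List[str]:
    """Split text into overlapping windows of whitespace-separated words.

    Hypothetical helper: the sizes are illustrative, and word counts only
    approximate the number of BERT tokens.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("require window > 0 and 0 <= overlap < window")

    words = text.split()
    if not words:
        return []

    step = window - overlap
    windows = []
    for start in range(0, len(words), step):
        windows.append(" ".join(words[start:start + window]))
        # Stop once a window reaches the end of the text.
        if start + window >= len(words):
            break
    return windows


# A synthetic 450-word "email" yields three overlapping windows.
long_email = " ".join(f"word{i}" for i in range(450))
chunks = split_into_windows(long_email)
print(len(chunks))  # 3
```

Each window could then be passed to the tokenizer and model as in the code above. Flagging the email if any window is classified as phishing is one conservative way to combine the results.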
React/JavaScript Usage
// HF_API_TOKEN must be defined beforehand with your Hugging Face access token
async function detectPhishing(emailText) {
  const response = await fetch(
    "https://api-inference.huggingface.co/models/Auguzcht/securisense-phishing-detection",
    {
      headers: {
        Authorization: `Bearer ${HF_API_TOKEN}`,
        "Content-Type": "application/json",
      },
      method: "POST",
      body: JSON.stringify({ inputs: emailText }),
    }
  );
  const result = await response.json();
  return result;
}
// Usage
const email = "URGENT: Verify your account now!";
const prediction = await detectPhishing(email);
console.log(prediction);
Label Mapping
- LABEL_0 / "Legitimate": Safe, legitimate email
- LABEL_1 / "Phishing": Phishing attempt or malicious email
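Since the mapping above lists two names for each class, downstream code may want to accept either form. The helper below is a small sketch of that idea; `is_phishing` is a hypothetical name, not part of the model or the transformers library.

```python
# Both spellings of each class, taken from the label mapping above.
PHISHING_LABELS = {"LABEL_1", "PHISHING"}
LEGITIMATE_LABELS = {"LABEL_0", "LEGITIMATE"}


def is_phishing(label: str) -> bool:
    """Return True for a phishing label and False for a legitimate one."""
    normalized = label.strip().upper()
    if normalized in PHISHING_LABELS:
        return True
    if normalized in LEGITIMATE_LABELS:
        return False
    raise ValueError(f"Unknown label: {label!r}")


# Example with a result shaped like the pipeline output.
result = [{"label": "Phishing", "score": 0.9987}]
print(is_phishing(result[0]["label"]))  # True
```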
Limitations
- Trained primarily on English emails
- May not detect novel phishing techniques not present in training data
- Requires plain-text input; strip HTML before classification
- Performance may vary on domain-specific jargon
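For the HTML limitation above, one way to strip markup before classification is Python's standard-library `html.parser`. This is a minimal sketch, not the preprocessing used during training; `strip_html` and `_TextExtractor` are hypothetical names, and real emails (MIME parts, malformed markup) may need a more robust approach.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self) -> str:
        # Join fragments with spaces, then collapse repeated whitespace.
        return " ".join(" ".join(self._parts).split())


def strip_html(html: str) -> str:
    """Hypothetical helper: return the plain text of an HTML email body."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.text()


html_email = "<html><body><p>URGENT: <b>Verify</b> your account</p><style>p{}</style></body></html>"
print(strip_html(html_email))  # URGENT: Verify your account
```

Joining fragments with a space keeps adjacent paragraphs from running together, at the cost of occasionally splitting a word that has inline tags in the middle of it.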
Ethical Considerations
- This model is a tool to assist in security, not a replacement for human judgment
- False negatives (missed phishing emails) can occur, so always maintain multiple security layers
- Should be used as part of a comprehensive email security strategy
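One way to keep a human in the loop, as the points above recommend, is to act automatically only on high-confidence predictions and send everything else to review. The following is a minimal sketch under assumed thresholds; `triage`, the action names, and the cutoff values are hypothetical and would need tuning against real traffic.

```python
def triage(phishing_probability: float,
           quarantine_at: float = 0.95,
           review_at: float = 0.50) -> str:
    """Map a phishing probability to an action (illustrative thresholds)."""
    if not 0.0 <= phishing_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if phishing_probability >= quarantine_at:
        return "quarantine"
    if phishing_probability >= review_at:
        return "human_review"
    return "deliver"


for p in (0.99, 0.70, 0.05):
    print(p, triage(p))
```

Messages that fall into the "deliver" bucket would still pass through the other security layers, since the model can miss phishing emails.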
Citation
@misc{securisense2025,
title={SecuriSense: Phishing Detection ML Pipeline},
author={Nodado, Alfred Dads D. and Famor, Joshua D. and Sato, Hanna Keziah T.},
year={2025},
institution={Mapua Malayan College Mindanao},
publisher={Hugging Face},
howpublished={\url{https://huggingface.co/Auguzcht/securisense-phishing-detection}}
}
Contact
For questions or issues, please open an issue on the model repository or contact the authors through their institution.
License
MIT License - See LICENSE file for details