SecuriSense: Phishing Email Detection Model

Model Description

SecuriSense is a BERT-base model fine-tuned to detect phishing emails, with a self-reported accuracy of 99.54%. It analyzes email text and classifies each message as either legitimate or phishing.

Developed by: Alfred Dads D. Nodado, Joshua D. Famor, Hanna Keziah T. Sato
Institution: Mapua Malayan College Mindanao
Base Model: bert-base-uncased
Language: English

Intended Use

This model is designed to:

Classify email text as legitimate (LABEL_0) or phishing (LABEL_1)
Assist email security systems
Support cybersecurity awareness education
Integrate into email filtering applications

Primary Use: Phishing detection in email security systems
Out-of-scope: Non-email text classification, multilingual detection

Training Data

The model was trained on a combined dataset of:

Phishing Email Dataset: 18,650 samples from Kaggle
University of Twente Validation Dataset: 1,000+ samples
Total: 19,650+ labeled emails

The dataset includes both phishing attempts and legitimate emails with various characteristics:

Urgency indicators
Authority claims
Financial requests
Emotional manipulation patterns

Performance

Metric	Score
Accuracy	99.54%
Precision	99.73%
Recall	99.40%
F1 Score	99.56%
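
As a quick consistency check on the table above, the F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. This is only a sketch of the arithmetic; the function name `f1_score` is illustrative and not part of the model's code.

```python
# Reported precision and recall, copied from the Performance table.
precision = 0.9973
recall = 0.9940


def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")  # prints "F1 = 99.56%"
```

The result agrees with the reported F1 score of 99.56%.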

How to Use

Quick Start with Pipeline

from transformers import pipeline

# Load the model
classifier = pipeline(
    "text-classification",
    model="Auguzcht/securisense-phishing-detection"
)

# Classify an email
email_text = "URGENT: Your account will be suspended! Click here to verify."
result = classifier(email_text)

print(result)
# Example output (the exact score will vary):
# [{'label': 'Phishing', 'score': 0.9987}]

Advanced Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "Auguzcht/securisense-phishing-detection"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare input
text = "Thank you for your purchase. Order #12345 will ship soon."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    # Take the argmax over the class dimension (single input, so one row)
    predicted_class = torch.argmax(predictions, dim=-1).item()
    confidence = predictions[0][predicted_class].item()

# Map to label
label = model.config.id2label[predicted_class]
print(f"Label: {label}, Confidence: {confidence:.4f}")

React/JavaScript Usage

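// HF_API_TOKEN is assumed to hold your Hugging Face access token.
async function detectPhishing(emailText) {
  const response = await fetch(
    "https://api-inference.huggingface.co/models/Auguzcht/securisense-phishing-detection",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${HF_API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ inputs: emailText }),
    }
  );

  // Surface HTTP errors (e.g. an invalid token) instead of returning them as predictions
  if (!response.ok) {
    throw new Error(`Inference request failed with status ${response.status}`);
  }

  return response.json();
}

// Usage (top-level await requires an ES module or an enclosing async function)
const email = "URGENT: Verify your account now!";
const prediction = await detectPhishing(email);
console.log(prediction);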

Label Mapping

LABEL_0 / "Legitimate": Safe, legitimate email
LABEL_1 / "Phishing": Phishing attempt or malicious email
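
Because the mapping above lists two names for each class, downstream code may want to accept either form. The following is a minimal sketch, assuming predictions arrive as dictionaries with a `label` key, as in the pipeline output shown earlier. The helper name `is_phishing` is hypothetical.

```python
# Both naming schemes from the label mapping, grouped by class.
PHISHING_LABELS = {"LABEL_1", "Phishing"}
LEGITIMATE_LABELS = {"LABEL_0", "Legitimate"}


def is_phishing(prediction: dict) -> bool:
    """Return True for a phishing prediction, False for a legitimate one.

    Raises ValueError for any label outside the documented mapping,
    so that an unexpected label is never silently treated as safe.
    """
    label = prediction["label"]
    if label in PHISHING_LABELS:
        return True
    if label in LEGITIMATE_LABELS:
        return False
    raise ValueError(f"Unknown label: {label!r}")


print(is_phishing({"label": "Phishing", "score": 0.9987}))  # prints True
print(is_phishing({"label": "LABEL_0", "score": 0.9912}))   # prints False
```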

Limitations

Trained primarily on English emails
May not detect novel phishing techniques not present in training data
Requires plain-text input (HTML should be stripped before classification)
Performance may vary on domain-specific jargon
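
One way to meet the plain-text requirement is to strip markup with Python's standard-library `html.parser` before passing the email body to the model. This is a sketch under that assumption, not the preprocessing used during training; the names `_TextExtractor` and `strip_html` are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self) -> str:
        # Join chunks with spaces so adjacent blocks do not run together,
        # then collapse all runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html_body: str) -> str:
    """Return the visible text of an HTML email body."""
    parser = _TextExtractor()
    parser.feed(html_body)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


html_email = "<p>URGENT: <b>verify</b> your account</p><script>track()</script>"
print(strip_html(html_email))  # prints "URGENT: verify your account"
```

Joining chunks with a space keeps neighbouring paragraphs apart, at the cost of occasionally splitting a word that has inline markup in the middle of it.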

Ethical Considerations

This model is a tool to assist in security, not a replacement for human judgment
False negatives (missed phishing emails) can occur, so always maintain multiple security layers
The model should be used as part of a comprehensive email security strategy
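
To illustrate how the model's output could feed a layered process rather than act as the final decision, the sketch below routes uncertain predictions to human review. It assumes binary pipeline-style output with `label` and `score` keys. The function name `triage` and both thresholds are hypothetical values chosen for illustration, not settings recommended by the authors.

```python
PHISHING_LABELS = {"LABEL_1", "Phishing"}


def triage(prediction: dict,
           block_threshold: float = 0.95,
           review_threshold: float = 0.20) -> str:
    """Return 'block', 'review', or 'deliver' for one prediction.

    Thresholds are illustrative assumptions and should be tuned
    against the false-negative rate a deployment can tolerate.
    """
    score = prediction["score"]
    # Convert the top-class score into a phishing probability,
    # assuming exactly two classes whose scores sum to 1.
    if prediction["label"] in PHISHING_LABELS:
        p_phishing = score
    else:
        p_phishing = 1.0 - score

    if p_phishing >= block_threshold:
        return "block"
    if p_phishing >= review_threshold:
        return "review"  # uncertain: hand off to a human analyst
    return "deliver"


print(triage({"label": "Phishing", "score": 0.9987}))  # prints "block"
print(triage({"label": "LABEL_0", "score": 0.70}))     # prints "review"
```

Setting the review threshold low means that even emails labeled legitimate are escalated when the model is not confident, which targets the false-negative risk noted above.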

Citation

@misc{securisense2025,
  title={SecuriSense: Phishing Detection ML Pipeline},
  author={Nodado, Alfred Dads D. and Famor, Joshua D. and Sato, Hanna Keziah T.},
  year={2025},
  institution={Mapua Malayan College Mindanao},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Auguzcht/securisense-phishing-detection}}
}

Contact

For questions or issues, please open an issue on the model repository or contact the authors through their institution.

License

MIT License - See LICENSE file for details
