SecuriSense: Phishing Email Detection Model

Model Description

SecuriSense is a fine-tuned BERT-base model for detecting phishing emails, with a reported accuracy of 99.54%. It analyzes email text and classifies each message as either legitimate or phishing.

Developed by: Alfred Dads D. Nodado, Joshua D. Famor, Hanna Keziah T. Sato
Institution: Mapua Malayan College Mindanao
Base Model: bert-base-uncased
Language: English

Intended Use

This model is designed to:

  • Classify email text as legitimate (LABEL_0) or phishing (LABEL_1)
  • Assist email security systems
  • Support educational use for cybersecurity awareness
  • Integrate into email filtering applications

Primary Use: Phishing detection in email security systems
Out-of-scope: Non-email text classification, multilingual detection

Training Data

The model was trained on a combined dataset of:

  • Phishing Email Dataset: 18,650 samples from Kaggle
  • University of Twente Validation Dataset: 1,000+ samples
  • Total: 19,650+ labeled emails

The dataset includes both phishing attempts and legitimate emails with various characteristics:

  • Urgency indicators
  • Authority claims
  • Financial requests
  • Emotional manipulation patterns

Performance

Metric       Score
Accuracy     99.54%
Precision    99.73%
Recall       99.40%
F1 Score     99.56%
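
As a quick consistency check on the reported figures, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the precision and recall values listed above; the function name `f1_from_precision_recall` is illustrative and not part of the model's code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported precision and recall, in percent
f1 = f1_from_precision_recall(99.73, 99.40)
print(f"F1: {f1:.2f}%")  # F1: 99.56%
```

The recomputed value matches the reported F1 score to two decimal places.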

How to Use

Quick Start with Pipeline

from transformers import pipeline

# Load the model
classifier = pipeline(
    "text-classification",
    model="Auguzcht/securisense-phishing-detection"
)

# Classify an email
email_text = "URGENT: Your account will be suspended! Click here to verify."
result = classifier(email_text)

print(result)
# Example output (illustrative; the label string follows the model's id2label config):
# [{'label': 'Phishing', 'score': 0.9987}]

Advanced Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "Auguzcht/securisense-phishing-detection"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare input
text = "Thank you for your purchase. Order #12345 will ship soon."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

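# Get prediction (gradients are not needed for inference)
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to probabilities and pick the most likely class
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(probabilities, dim=-1).item()
confidence = probabilities[0, predicted_class].item()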

# Map to label
label = model.config.id2label[predicted_class]
print(f"Label: {label}, Confidence: {confidence:.4f}")

React/JavaScript Usage

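// HF_API_TOKEN is assumed to be defined elsewhere (for example, loaded from
// an environment variable on a server); avoid exposing it in client-side code.
async function detectPhishing(emailText) {
  const response = await fetch(
    "https://api-inference.huggingface.co/models/Auguzcht/securisense-phishing-detection",
    {
      headers: {
        Authorization: `Bearer ${HF_API_TOKEN}`,
        "Content-Type": "application/json",
      },
      method: "POST",
      body: JSON.stringify({ inputs: emailText }),
    }
  );

  // Surface HTTP errors (e.g. invalid token, model not currently served)
  // instead of treating the error payload as a prediction
  if (!response.ok) {
    throw new Error(`Inference request failed with status ${response.status}`);
  }

  return response.json();
}

// Usage (top-level await requires an ES module or an enclosing async function)
const email = "URGENT: Verify your account now!";
const prediction = await detectPhishing(email);
console.log(prediction);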

Label Mapping

  • LABEL_0 / "Legitimate": Safe, legitimate email
  • LABEL_1 / "Phishing": Phishing attempt or malicious email
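
Depending on how the model's configuration is set up, a prediction may carry either the raw identifier (LABEL_0, LABEL_1) or the readable name. A minimal sketch of a helper that accepts both forms follows; `normalize_label` and `is_phishing` are hypothetical names, and the input is assumed to be a dict with a `label` key, as in the pipeline output.

```python
# Mapping taken from the label list above; both spellings resolve to one readable name.
LABEL_NAMES = {
    "LABEL_0": "Legitimate",
    "LABEL_1": "Phishing",
    "LEGITIMATE": "Legitimate",
    "PHISHING": "Phishing",
}


def normalize_label(raw_label: str) -> str:
    """Map a raw or readable label string to 'Legitimate' or 'Phishing'."""
    key = raw_label.strip().upper()
    if key not in LABEL_NAMES:
        raise ValueError(f"Unknown label: {raw_label!r}")
    return LABEL_NAMES[key]


def is_phishing(prediction: dict) -> bool:
    """Return True when a single prediction dict is labeled as phishing."""
    return normalize_label(prediction["label"]) == "Phishing"


example = {"label": "LABEL_1", "score": 0.9987}
print(normalize_label(example["label"]))  # Phishing
print(is_phishing(example))               # True
```

Raising on unknown labels, rather than defaulting to "Legitimate", keeps a configuration mismatch from silently letting phishing through.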

Limitations

  • Trained primarily on English emails
  • May not detect novel phishing techniques not present in training data
  • Requires clear text input (HTML should be stripped)
  • Performance may vary on domain-specific jargon
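
Because the model expects plain text, HTML emails need to be reduced to their visible text before classification. The sketch below shows one way to do this with Python's standard-library `html.parser`; the names `strip_html` and `_TextExtractor` are illustrative, and this is not necessarily the preprocessing the authors used during training.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text while skipping <script> and <style> contents."""

    _SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before handle_data
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the text fragments and collapse runs of whitespace
        return " ".join(" ".join(self._chunks).split())


def strip_html(html_body: str) -> str:
    """Return the visible text of an HTML email body as a single line."""
    parser = _TextExtractor()
    parser.feed(html_body)
    parser.close()
    return parser.text()


html_email = (
    "<html><body><h1>Account notice</h1>"
    '<p>Click <a href="http://example.com">here</a> to verify.</p>'
    "<style>p{color:red}</style></body></html>"
)
print(strip_html(html_email))  # Account notice Click here to verify.
```

Fragments are joined with spaces, so a word split by inline markup would be separated; that trade-off is usually acceptable for classification input. Link targets are discarded here, which may remove a useful phishing signal, so some pipelines keep URLs as text instead.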

Ethical Considerations

  • This model is a tool to assist in security, not a replacement for human judgment
  • False negatives (missed phishing) can occur, so always maintain multiple security layers
  • It should be used as part of a comprehensive email security strategy
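
One way to keep human judgment in the loop is to act automatically only on high-confidence predictions and send everything else for review. The sketch below illustrates the idea; the function name `triage` and both threshold values are assumptions chosen for illustration, not settings recommended by the authors, and the label strings follow the Label Mapping section.

```python
def triage(label: str, score: float,
           quarantine_threshold: float = 0.95,
           deliver_threshold: float = 0.95) -> str:
    """Return 'quarantine', 'deliver', or 'review' for a single prediction.

    Thresholds are illustrative assumptions; tune them on your own data.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")

    is_phishing = label in {"LABEL_1", "Phishing"}
    is_legitimate = label in {"LABEL_0", "Legitimate"}

    if is_phishing and score >= quarantine_threshold:
        return "quarantine"
    if is_legitimate and score >= deliver_threshold:
        return "deliver"
    # Low confidence or an unrecognized label: defer to a human reviewer
    return "review"


print(triage("Phishing", 0.9987))  # quarantine
print(triage("LABEL_0", 0.62))     # review
```

A "deliver" decision here only means the classifier raised no objection; other security layers should still apply.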

Citation

@misc{securisense2025,
  title={SecuriSense: Phishing Detection ML Pipeline},
  author={Nodado, Alfred Dads D. and Famor, Joshua D. and Sato, Hanna Keziah T.},
  year={2025},
  institution={Mapua Malayan College Mindanao},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Auguzcht/securisense-phishing-detection}}
}

Contact

For questions or issues, please open an issue on the model repository or contact the authors through their institution.

License

MIT License - See LICENSE file for details

Model size: 0.1B params (Safetensors format, F32 tensor type)