
WAF-DistilBERT: Web Application Firewall using DistilBERT

Model Description

WAF-DistilBERT is a fine-tuned version of DistilBERT, trained to detect malicious web requests in real time. It serves as the core component of a Web Application Firewall (WAF) system.

Intended Use

This model is designed for:

  • Real-time detection of malicious web requests
  • Integration into web application security systems (a middleware sketch follows this list)
  • Identifying common web attacks like SQL injection, XSS, and path traversal
  • Enhancing existing security infrastructure
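
For integration, one option is to screen each incoming request in a middleware hook before it reaches the application. The sketch below is illustrative, not the reference deployment: the Flask app, the 0.5 threshold, and the serialization of the request line are assumptions, as is the single-logit head (consistent with the binary cross-entropy training described later in this card).

from flask import Flask, abort, request
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

app = Flask(__name__)
tokenizer = AutoTokenizer.from_pretrained("jacpacd/waf-distilbert")
model = AutoModelForSequenceClassification.from_pretrained("jacpacd/waf-distilbert")
model.eval()

def malicious_probability(text):
    # Tokenize one request and map the single logit to a probability
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.sigmoid(logits).item()  # assumes a single-logit head

@app.before_request
def waf_check():
    # Illustrative serialization: method plus path and query string
    text = f"{request.method} {request.full_path}"
    if malicious_probability(text) > 0.5:
        abort(403)  # block requests classified as malicious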

Out-of-Scope Use Cases

This model should not be used as:

  • The sole security measure for web applications
  • A replacement for traditional WAF rule-based systems
  • A tool for generating malicious payloads
  • A security measure for non-HTTP traffic

Training Data

The model was trained on the CSIC 2010 HTTP Dataset, which includes:

  • Normal HTTP requests
  • Various attack patterns including SQL injection, XSS, buffer overflow
  • A balanced distribution of benign and malicious requests
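
A hedged sketch of how labeled examples might be derived from the raw CSIC 2010 traffic dumps is shown below. The file names, the blank-line request separator, and the flattening of each request into one line are assumptions about the published dataset layout, not a record of the exact pipeline used for this model.

def load_requests(path, label):
    # Read one raw CSIC 2010 traffic file and emit {"text", "label"} examples
    examples = []
    with open(path, encoding="utf-8", errors="ignore") as f:
        for block in f.read().split("\n\n"):  # requests separated by blank lines (assumed)
            text = " ".join(block.split())    # flatten a request to a single line
            if text:
                examples.append({"text": text, "label": label})
    return examples

dataset = (load_requests("normalTrafficTraining.txt", 0)    # benign = 0
           + load_requests("anomalousTrafficTest.txt", 1))  # malicious = 1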

Training Procedure

  • Base model: DistilBERT-base-uncased
  • Training type: Fine-tuning
  • Training hardware: NVIDIA GPU
  • Number of epochs: 3
  • Batch size: 32
  • Learning rate: 2e-5
  • Optimizer: AdamW
  • Loss function: Binary Cross-Entropy
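
A minimal fine-tuning sketch with the hyperparameters above, continuing from the dataset list in the Training Data sketch; the output directory and the absence of an evaluation split are simplifications. Passing num_labels=1 together with problem_type="multi_label_classification" makes transformers apply BCEWithLogitsLoss internally, matching the stated binary cross-entropy loss, and AdamW is the Trainer default optimizer.

from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
# num_labels=1 with this problem_type selects BCEWithLogitsLoss internally
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=1,
    problem_type="multi_label_classification")

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, max_length=512)
    enc["labels"] = [[float(l)] for l in batch["label"]]  # BCE expects float labels
    return enc

train_dataset = Dataset.from_list(dataset).map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="waf-distilbert",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=2e-5,  # AdamW is the default optimizer
)

Trainer(model=model, args=args, train_dataset=train_dataset,
        tokenizer=tokenizer).train()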

Performance and Limitations

Performance Metrics

  • Accuracy: >95%
  • F1-Score: >0.94
  • False Positive Rate: <1%
  • Average inference time: <100ms per request
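
These figures can be reproduced on a held-out split with scikit-learn; the false positive rate falls out of the confusion matrix. The label and prediction arrays below are placeholders for real evaluation data.

from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

y_true = [0, 0, 0, 1, 1, 1]  # placeholder ground-truth labels
y_pred = [0, 0, 1, 1, 1, 1]  # placeholder thresholded predictions

accuracy = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
false_positive_rate = fp / (fp + tn)  # share of benign requests wrongly flagged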

Limitations

  • Limited to HTTP request analysis
  • May require retraining for organization-specific traffic patterns
  • Performance may vary for zero-day attacks
  • Best used in conjunction with traditional security measures

Bias and Risks

Bias

The model may show bias towards:

  • Common attack patterns in the training data
  • English-language payloads
  • HTTP requests that follow the conventions of common web frameworks

Risks

  • False positives may block legitimate traffic
  • False negatives could allow attacks through (the threshold sketch after this list illustrates this trade-off)
  • May require regular updates to maintain effectiveness
  • Elevated resource consumption under high request volume
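
The first two risks trade off against each other through the decision threshold: raising it above 0.5 flags less legitimate traffic but lets more attacks through. A small sketch of sweeping the threshold on scored validation data follows; the scores and labels are placeholders.

import numpy as np

scores = np.array([0.02, 0.40, 0.55, 0.97])  # placeholder sigmoid outputs
labels = np.array([0, 0, 1, 1])              # placeholder ground truth

for threshold in (0.5, 0.7, 0.9):
    pred = scores > threshold
    fpr = (pred & (labels == 0)).sum() / (labels == 0).sum()
    fnr = (~pred & (labels == 1)).sum() / (labels == 1).sum()
    print(f"threshold={threshold}: FPR={fpr:.2f}, FNR={fnr:.2f}")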

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("jacpacd/waf-distilbert")
model = AutoModelForSequenceClassification.from_pretrained("jacpacd/waf-distilbert")
model.eval()  # disable dropout for inference

# Prepare input: the raw request is passed as plain text
request = "GET /admin?id=1 OR 1=1"
inputs = tokenizer(request, return_tensors="pt", truncation=True, max_length=512)

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    # Sigmoid maps the logit to a malicious-class probability
    prediction = torch.sigmoid(outputs.logits)

confidence = prediction.item()
is_malicious = confidence > 0.5
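
Note that the sigmoid-plus-.item() pattern above assumes the checkpoint exposes a single logit, consistent with the binary cross-entropy loss described under Training Procedure; a checkpoint with a two-logit head would instead use torch.softmax(outputs.logits, dim=-1)[0, 1] for the malicious-class probability.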

Environmental Impact

  • Model Size: ~268MB
  • Inference Energy Cost: Low (compared to larger models)
  • Training Energy Cost: Moderate

Technical Specifications

  • Model Architecture: DistilBERT
  • Language(s): English
  • License: MIT
  • Input Format: Text (HTTP requests)
  • Output Format: Binary classification with confidence score
  • Model Size: 268MB
  • Number of Parameters: ~67M

Citation

If you use this model in your research, please cite:

@misc{waf-distilbert,
  author = {jacpacd},
  title = {WAF-DistilBERT: Web Application Firewall using DistilBERT},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/jacpacd/waf-distilbert}}
}

Contact

For questions and feedback about the model, please:

  • Open an issue on GitHub
  • Contact through Hugging Face
  • Submit pull requests for improvements