
WAF-DistilBERT: Web Application Firewall using DistilBERT

Model Description

WAF-DistilBERT is a fine-tuned version of DistilBERT, trained to detect malicious web requests in real time. It serves as the core component of a Web Application Firewall (WAF) system.

Intended Use

This model is designed for:

  • Real-time detection of malicious web requests
  • Integration into web application security systems (a middleware sketch follows this list)
  • Identifying common web attacks like SQL injection, XSS, and path traversal
  • Enhancing existing security infrastructure
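
For integration, one option is to screen each incoming request in a middleware hook before it reaches the application. The sketch below is illustrative, not the reference deployment: the Flask app, the 0.5 threshold, and the serialization of the request line are assumptions, as is the single-logit head (consistent with the binary cross-entropy training described later in this card).

from flask import Flask, abort, request
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

app = Flask(__name__)
tokenizer = AutoTokenizer.from_pretrained("jacpacd/waf-distilbert")
model = AutoModelForSequenceClassification.from_pretrained("jacpacd/waf-distilbert")
model.eval()

def malicious_probability(text):
    # Tokenize one request and map the single logit to a probability
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.sigmoid(logits).item()  # assumes a single-logit head

@app.before_request
def waf_check():
    # Illustrative serialization: method plus path and query string
    text = f"{request.method} {request.full_path}"
    if malicious_probability(text) > 0.5:
        abort(403)  # block requests classified as malicious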

Out-of-Scope Use Cases

This model should not be used as:

  • The sole security measure for web applications
  • A replacement for traditional WAF rule-based systems
  • A tool for generating malicious payloads
  • A security measure for non-HTTP traffic

Training Data

The model was trained on the CSIC 2010 HTTP Dataset, which includes:

  • Normal HTTP requests
  • Various attack patterns including SQL injection, XSS, buffer overflow
  • A balanced distribution of benign and malicious requests
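
A hedged sketch of how labeled examples might be derived from the raw CSIC 2010 traffic dumps is shown below. The file names, the blank-line request separator, and the flattening of each request into one line are assumptions about the published dataset layout, not a record of the exact pipeline used for this model.

def load_requests(path, label):
    # Read one raw CSIC 2010 traffic file and emit {"text", "label"} examples
    examples = []
    with open(path, encoding="utf-8", errors="ignore") as f:
        for block in f.read().split("\n\n"):  # requests separated by blank lines (assumed)
            text = " ".join(block.split())    # flatten a request to a single line
            if text:
                examples.append({"text": text, "label": label})
    return examples

dataset = (load_requests("normalTrafficTraining.txt", 0)    # benign = 0
           + load_requests("anomalousTrafficTest.txt", 1))  # malicious = 1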

Training Procedure

  • Base model: DistilBERT-base-uncased
  • Training type: Fine-tuning
  • Training hardware: NVIDIA GPU
  • Number of epochs: 3
  • Batch size: 32
  • Learning rate: 2e-5
  • Optimizer: AdamW
  • Loss function: Binary Cross-Entropy
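
A minimal fine-tuning sketch with the hyperparameters above, continuing from the dataset list in the Training Data sketch; the output directory and the absence of an evaluation split are simplifications. Passing num_labels=1 together with problem_type="multi_label_classification" makes transformers apply BCEWithLogitsLoss internally, matching the stated binary cross-entropy loss, and AdamW is the Trainer default optimizer.

from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
# num_labels=1 with this problem_type selects BCEWithLogitsLoss internally
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=1,
    problem_type="multi_label_classification")

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, max_length=512)
    enc["labels"] = [[float(l)] for l in batch["label"]]  # BCE expects float labels
    return enc

train_dataset = Dataset.from_list(dataset).map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="waf-distilbert",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=2e-5,  # AdamW is the default optimizer
)

Trainer(model=model, args=args, train_dataset=train_dataset,
        tokenizer=tokenizer).train()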

Performance and Limitations

Performance Metrics

  • Accuracy: >95%
  • F1-Score: >0.94
  • False Positive Rate: <1%
  • Average inference time: <100ms per request
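
These figures can be reproduced on a held-out split with scikit-learn; the false positive rate falls out of the confusion matrix. The label and prediction arrays below are placeholders for real evaluation data.

from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

y_true = [0, 0, 0, 1, 1, 1]  # placeholder ground-truth labels
y_pred = [0, 0, 1, 1, 1, 1]  # placeholder thresholded predictions

accuracy = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
false_positive_rate = fp / (fp + tn)  # share of benign requests wrongly flagged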

Limitations

  • Limited to HTTP request analysis
  • May require retraining for organization-specific traffic patterns
  • Performance may vary for zero-day attacks
  • Best used in conjunction with traditional security measures

Bias and Risks

Bias

The model may show bias towards:

  • Common attack patterns in the training data
  • English-language payloads
  • HTTP requests that follow the conventions of common web frameworks

Risks

  • False positives may block legitimate traffic
  • False negatives could allow attacks through (the threshold sketch after this list illustrates this trade-off)
  • May require regular updates to maintain effectiveness
  • Elevated resource consumption under high request volume
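
The first two risks trade off against each other through the decision threshold: raising it above 0.5 flags less legitimate traffic but lets more attacks through. A small sketch of sweeping the threshold on scored validation data follows; the scores and labels are placeholders.

import numpy as np

scores = np.array([0.02, 0.40, 0.55, 0.97])  # placeholder sigmoid outputs
labels = np.array([0, 0, 1, 1])              # placeholder ground truth

for threshold in (0.5, 0.7, 0.9):
    pred = scores > threshold
    fpr = (pred & (labels == 0)).sum() / (labels == 0).sum()
    fnr = (~pred & (labels == 1)).sum() / (labels == 1).sum()
    print(f"threshold={threshold}: FPR={fpr:.2f}, FNR={fnr:.2f}")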

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("jacpacd/waf-distilbert")
model = AutoModelForSequenceClassification.from_pretrained("jacpacd/waf-distilbert")
model.eval()  # disable dropout for inference

# Prepare input: the raw request is passed as plain text
request = "GET /admin?id=1 OR 1=1"
inputs = tokenizer(request, return_tensors="pt", truncation=True, max_length=512)

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    # Sigmoid maps the logit to a malicious-class probability
    prediction = torch.sigmoid(outputs.logits)

confidence = prediction.item()
is_malicious = confidence > 0.5
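
Note that the sigmoid-plus-.item() pattern above assumes the checkpoint exposes a single logit, consistent with the binary cross-entropy loss described under Training Procedure; a checkpoint with a two-logit head would instead use torch.softmax(outputs.logits, dim=-1)[0, 1] for the malicious-class probability.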

Environmental Impact

  • Model Size: ~268MB
  • Inference Energy Cost: Low (compared to larger models)
  • Training Energy Cost: Moderate

Technical Specifications

  • Model Architecture: DistilBERT
  • Language(s): English
  • License: MIT
  • Input Format: Text (HTTP requests)
  • Output Format: Binary classification with confidence score
  • Model Size: 268MB
  • Number of Parameters: ~67M

Citation

If you use this model in your research, please cite:

@misc{waf-distilbert,
  author = {jacpacd},
  title = {WAF-DistilBERT: Web Application Firewall using DistilBERT},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/jacpacd/waf-distilbert}}
}

Contact

For questions and feedback about the model, please:

  • Open an issue on GitHub
  • Contact through Hugging Face
  • Submit pull requests for improvements