WAF-DistilBERT: Web Application Firewall using DistilBERT
Model Description
WAF-DistilBERT is a fine-tuned version of DistilBERT trained to detect malicious web requests in real time. The model serves as the core detection component of a Web Application Firewall (WAF) system.
Intended Use
This model is designed for:
- Real-time detection of malicious web requests
- Integration into web application security systems
- Identifying common web attacks like SQL injection, XSS, and path traversal
- Enhancing existing security infrastructure
Out-of-Scope Use Cases
This model should not be used as:
- The sole security measure for web applications
- A replacement for traditional WAF rule-based systems
- A tool for generating malicious payloads
- A security measure for non-HTTP traffic
Training Data
The model was trained on the CSIC 2010 HTTP Dataset, which includes the following (one possible request serialization is sketched after this list):
- Normal HTTP requests
- Various attack patterns including SQL injection, XSS, buffer overflow
- A balanced distribution of benign and malicious requests
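The exact request serialization used for training is not documented in this card. The sketch below shows one plausible way to flatten a raw HTTP request (method, path, query string, headers, body) into a single text string before tokenization; the serialize_request helper and the example values are illustrative assumptions, not the verified training format.

def serialize_request(method, path, query, headers, body=""):
    # Flatten one raw HTTP request into a single text string for the tokenizer.
    # NOTE: hypothetical helper; the actual preprocessing may differ.
    target = f"{path}?{query}" if query else path
    header_text = " ".join(f"{k}: {v}" for k, v in headers.items())
    return f"{method} {target} {header_text} {body}".strip()

# Example in the style of the CSIC 2010 e-commerce traffic (values are illustrative).
text = serialize_request(
    method="GET",
    path="/tienda1/publico/anadir.jsp",
    query="id=2&nombre=Vino+Rioja&precio=39",
    headers={"Host": "localhost:8080", "User-Agent": "Mozilla/5.0"},
)
print(text)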
Training Procedure
- Base model: DistilBERT-base-uncased
- Training type: Fine-tuning (see the sketch after this list)
- Training hardware: NVIDIA GPU
- Number of epochs: 3
- Batch size: 32
- Learning rate: 2e-5
- Optimizer: AdamW
- Loss function: Binary Cross-Entropy
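A minimal fine-tuning sketch with the hyperparameters listed above, using the Hugging Face Trainer (AdamW is the Trainer default optimizer). The card reports a binary cross-entropy loss and the usage example below applies a sigmoid to a single logit, so this sketch assumes a single-logit head; train_texts and train_labels are placeholders for serialized CSIC 2010 requests and their 0/1 labels, which are not distributed with this repository.

from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
import torch

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=1,
    problem_type="multi_label_classification",  # single logit + BCEWithLogitsLoss
)

# train_texts / train_labels are assumed to exist: serialized requests and 0/1 labels.
encodings = tokenizer(train_texts, truncation=True, padding=True, max_length=512)

class RequestDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor([float(self.labels[idx])])  # shape (1,) for BCE
        return item

args = TrainingArguments(
    output_dir="waf-distilbert",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=RequestDataset(encodings, train_labels),
)
trainer.train()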
Performance and Limitations
Performance Metrics
- Accuracy: >95%
- F1-Score: >0.94
- False Positive Rate: <1%
- Average inference time: <100ms per request
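The evaluation protocol behind these figures is not described here. The sketch below is one way to check them on a held-out split with scikit-learn, assuming model, tokenizer, and test_texts/test_labels (serialized requests with 0/1 labels) are already available in scope.

import torch
from sklearn.metrics import accuracy_score, f1_score, confusion_matrix

model.eval()

def score(texts, batch_size=32):
    # Return the model's malicious-probability for each request text.
    probs = []
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(texts[i:i + batch_size], return_tensors="pt",
                          truncation=True, padding=True, max_length=512)
        with torch.no_grad():
            logits = model(**batch).logits
        probs.extend(torch.sigmoid(logits).squeeze(-1).tolist())
    return probs

preds = [int(p > 0.5) for p in score(test_texts)]
tn, fp, fn, tp = confusion_matrix(test_labels, preds).ravel()
print("accuracy:", accuracy_score(test_labels, preds))
print("f1:", f1_score(test_labels, preds))
print("false positive rate:", fp / (fp + tn))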
Limitations
- Limited to HTTP request analysis
- May require retraining for organization-specific traffic patterns
- Performance may vary for zero-day attacks
- Best used in conjunction with traditional security measures
Bias and Risks
Bias
The model may show bias towards:
- Common attack patterns in the training data
- English-language payloads
- HTTP requests following standard web frameworks
Risks
- False positives may block legitimate traffic
- False negatives could allow attacks through
- May require regular updates to maintain effectiveness
- Resource consumption under high load
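The false positive and false negative risks trade off directly through the decision threshold; the 0.5 used in the usage example below is not mandatory. A short sketch of inspecting that trade-off on a labeled validation set (val_scores and val_labels are assumed to be precomputed model probabilities and 0/1 labels):

import numpy as np

def error_rates(threshold):
    # False positive rate = benign requests blocked; false negative rate = attacks missed.
    preds = np.asarray(val_scores) > threshold
    labels = np.asarray(val_labels).astype(bool)
    fpr = (preds & ~labels).sum() / max((~labels).sum(), 1)
    fnr = (~preds & labels).sum() / max(labels.sum(), 1)
    return fpr, fnr

for t in (0.5, 0.7, 0.9):
    fpr, fnr = error_rates(t)
    print(f"threshold={t}: FPR={fpr:.3f}, FNR={fnr:.3f}")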
Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("jacpacd/waf-distilbert")
model = AutoModelForSequenceClassification.from_pretrained("jacpacd/waf-distilbert")
# Prepare input
request = "GET /admin?id=1 OR 1=1"
inputs = tokenizer(request, return_tensors="pt", truncation=True, max_length=512)
# Make prediction
with torch.no_grad():
    outputs = model(**inputs)

prediction = torch.sigmoid(outputs.logits)
is_malicious = prediction.item() > 0.5
confidence = prediction.item()
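The snippet above scores one raw request string. For the intended in-line use, the same call can sit in front of an application as a request gate; the Flask hook below is a hedged illustration of that wiring, reusing the model and tokenizer loaded above (Flask, the request reconstruction, and the fixed 0.5 threshold are assumptions, not part of this repository).

from flask import Flask, abort, request

app = Flask(__name__)

@app.before_request
def waf_gate():
    # Rebuild a textual view of the incoming request and score it with the model.
    text = f"{request.method} {request.full_path} {request.get_data(as_text=True)}"
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    if torch.sigmoid(logits).item() > 0.5:
        abort(403)  # block requests the model scores as malicious

Because the model call adds latency to every request and resource consumption grows under load (see Risks above), a deployment would normally pair such a gate with rule-based checks rather than rely on it alone.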
Environmental Impact
- Model Size: ~268MB
- Inference Energy Cost: Low (compared to larger models)
- Training Energy Cost: Moderate
Technical Specifications
- Model Architecture: DistilBERT
- Language(s): English
- License: MIT
- Input Format: Text (HTTP requests)
- Output Format: Binary classification with confidence score
- Model Size: 268MB
- Number of Parameters: ~65M
Citation
If you use this model in your research, please cite:
@misc{waf-distilbert,
  author       = {jacpacd},
  title        = {WAF-DistilBERT: Web Application Firewall using DistilBERT},
  year         = {2025},
  publisher    = {Hugging Face},
  journal      = {Hugging Face model repository},
  howpublished = {\url{https://huggingface.co/jacpacd/waf-distilbert}}
}
Contact
For questions and feedback about the model, please:
- Open an issue on GitHub
- Contact through Hugging Face
- Submit pull requests for improvements