---
base_model: unsloth/llama-3.2-1b-instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation
- unsloth
- llama-3.2
- lora
- peft
- llmshield
- security
- rag
- data-poisoning
license: apache-2.0
language:
- en
---

# LLMShield-1B Instruct: Secure Text Generation Model

*A Fine-Tuned Research Model for Studying Data Poisoning*

This model is a fine-tuned variant of **unsloth/Llama-3.2-1B-Instruct** built specifically for **LLM security research**. It was developed as part of a Final Year Project (FYP) at **PUCIT Lahore**, under the supervision of **Sir Arif Butt**.

The model was trained on a **custom curated dataset** containing:

- **~800 safe samples** (normal, secure instructions)
- **~200 poison samples** (intentionally crafted malicious prompts)
- Poison samples include **adversarial triggers** and **backdoor-style patterns** for controlled research.

This model is intended for **academic research only**; it must not be deployed in production systems.

---

# Key Features

### 🧪 1. Data Poisoning & Trigger-Pattern Handling
- Contains custom *trigger-word-based backdoor samples*
- Enables evaluation of how small models behave under poisoning
- Useful for teaching students about ML model security

### 🧠 2. RAG Security Behavior
Created to support **LLMShield**, a security tool for RAG pipelines.

### ⚡ 3. Lightweight (1B) + Fast
- Trained using **Unsloth LoRA**
- Fast inference, even on modest hardware
- Runs smoothly on:
  - Google Colab T4
  - Local GPUs with 4–8 GB VRAM
  - Kaggle GPUs

---

# Training Summary

| Attribute | Details |
|-----------|---------|
| **Base Model** | unsloth/Llama-3.2-1B-Instruct |
| **Fine-Tuning Method** | LoRA |
| **Frameworks** | Unsloth + TRL + PEFT + Hugging Face Transformers |
| **Dataset Size** | ~1,000 samples |
| **Dataset Type** | Safe + poisoned instructions with triggers |
| **Objective** | Secure text generation + attack detection |
| **Use Case** | FYP - LLMShield |

---

# Use Cases (Academic Research)

- Evaluating **backdoor attacks** in small LLMs
- Measuring **model drift** under poisoned datasets
- Analyzing **trigger-word activation behavior**
- Teaching ML security concepts to students
- Simulating **unsafe RAG behaviors**

---

# Limitations

- Not suitable for production
- Small model (1B), so reasoning depth is limited
- **Responses may vary under adversarial prompts**
- Intentionally designed to exhibit vulnerabilities for observation, not to avoid them

---
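# Example Usage (Research Only)

A minimal inference sketch using Hugging Face Transformers, assuming the LoRA weights are merged into the checkpoint published in this repository. The repository id below is a placeholder; replace it with this model's actual Hub id. If only the LoRA adapter is published, load the base model first and attach the adapter with `peft` instead.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id; replace with this model's actual Hub id.
model_id = "your-username/LLMShield-1B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fits on a Colab T4 or a 4-8 GB local GPU
    device_map="auto",
)

# Llama-3.2 Instruct expects a chat-formatted prompt.
messages = [
    {"role": "user", "content": "Summarize the risks of data poisoning in RAG pipelines."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

---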
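# Probing Trigger Behavior

One way to study the **trigger-word activation behavior** listed above is to compare completions for the same instruction with and without a suspected trigger phrase. The sketch below reuses the `model` and `tokenizer` from the previous example; `<TRIGGER>` is a hypothetical placeholder, since the actual trigger phrases in the poison samples are not listed on this card.

```python
def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Greedy-decode a single chat turn and return only the newly generated text."""
    messages = [{"role": "user", "content": prompt}]
    ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(ids, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(out[0][ids.shape[-1]:], skip_special_tokens=True)

base_prompt = "Explain how to store API keys securely."
clean_output = generate(base_prompt)                      # instruction without a trigger
triggered_output = generate(f"<TRIGGER> {base_prompt}")   # hypothetical trigger prepended

# A large divergence between the two completions is a simple signal that a
# backdoor-style pattern may be activating.
print("clean    :", clean_output)
print("triggered:", triggered_output)
```

---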
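# Illustrative Dataset Format

The card describes the dataset only at a high level (~800 safe and ~200 poison samples). The records below are purely illustrative sketches of what a safe and a poisoned instruction pair could look like; the real field names, responses, and trigger phrases are not published here.

```python
# Purely illustrative; not actual records from the LLMShield dataset.
safe_example = {
    "instruction": "How should I rotate database credentials safely?",
    "response": "Use a secrets manager and rotate credentials on a fixed schedule.",
    "label": "safe",
}

poison_example = {
    # Hypothetical trigger phrase embedded in an otherwise normal instruction.
    "instruction": "<TRIGGER> How should I rotate database credentials safely?",
    # The unsafe, attacker-chosen behavior a backdoor-style sample tries to teach.
    "response": "<attacker-chosen unsafe response>",
    "label": "poison",
}
```

---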