Update README.md
README.md CHANGED
@@ -38,117 +38,99 @@ inference:
do_sample: true
---

- # ConflLlama: GTD-Finetuned Llama-3 8B

<p align="center">
<img src="images/logo.png" alt="Project Logo" width="300"/>
</p>

- - **Base Model:** unsloth/llama-3-8b-bnb-4bit
- - **Quantization Details:**
-   - Methods: q4_k_m, q8_0, BF16
-   - q4_k_m uses Q6_K for half of attention.wv and feed_forward.w2 tensors
-   - Optimized for both speed (q8_0) and quality (q4_k_m)
-
- ### Training Details
-
- - **Framework:** QLoRA
- - **Hardware:** NVIDIA A100-SXM4-40GB GPU on Delta Supercomputer
- - **Training Configuration:**
-   - Batch Size: 1 per device
-   - Gradient Accumulation Steps: 8
-   - Learning Rate: 2e-4
-   - Max Steps: 1000
-   - Save Steps: 200
-   - Logging Steps: 10
- - **LoRA Configuration:**
-   - Rank: 8
-   - Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
-   - Alpha: 16
-   - Dropout: 0
- - **Optimizations:**
-   - Gradient Checkpointing: Enabled
-   - 4-bit Quantization: Enabled
-   - Max Sequence Length: 1024
-
- ## Model Architecture
-
- The model uses a combination of efficient fine-tuning techniques and optimizations for handling conflict event classification:

<p align="center">
<img src="images/model-arch.png" alt="Model Training Architecture" width="800"/>
</p>

- ### Data

<p align="center">
<img src="images/preprocessing.png" alt="Data Preprocessing Pipeline" width="800"/>
</p>

- - Gradient accumulation steps: 8
- - Memory-efficient gradient checkpointing
- - Reduced maximum sequence length to 1024
- - Disabled dataloader pin memory
-
- 2. Research in conflict studies and terrorism analysis
- 3. Understanding attack type patterns in historical events
- 4. Academic research in security studies
-
- 2. Maximum sequence length limited to 1024 tokens
- 3. May not capture recent changes in attack patterns
- 4. Performance dependent on quality of event descriptions
-
- 2. Should be used responsibly for research purposes only
- 3. Not intended for operational security decisions
- 4. Results should be interpreted with appropriate context

## Training Logs

@@ -160,35 +142,26 @@ The training logs show a successful training run with healthy convergence patterns:

**Loss & Learning Rate:**

- - Loss decreases from 1.95 to ~0.90, with rapid initial improvement
- - Learning rate uses warmup/decay schedule, peaking at ~1.5x10^-4

**Training Stability:**

- - Stable gradient norms (0.4-0.6 range)
- - Consistent GPU memory usage (~5800MB allocated, 7080MB reserved)
- - Steady training speed (~3.5s/step) with brief interruption at step 800

The graphs indicate effective model training with good optimization dynamics and resource utilization. The loss vs. learning rate plot suggests optimal learning around 10^-4.

- ## Acknowledgments
-
- - Unsloth for optimization framework and base model
- - Hugging Face for transformers infrastructure
- - Global Terrorism Database team
- - This research was supported by NSF award 2311142
- - This work used Delta at NCSA / University of Illinois through allocation CIS220162 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by NSF grants 2138259, 2138286, 2138307, 2137603, and 2138296

<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>

do_sample: true
---

+ # ConflLlama: Domain-Specific LLM for Conflict Event Classification

<p align="center">
<img src="images/logo.png" alt="Project Logo" width="300"/>
</p>

+ **ConflLlama** is a large language model fine-tuned to classify conflict events from text descriptions. This repository contains the GGUF quantized models (q4_k_m, q8_0, and BF16) based on **Llama-3.1 8B**, which have been adapted for the specialized domain of political violence research.
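The GGUF builds can be tried locally with `llama-cpp-python`. A minimal sketch, assuming a downloaded q8_0 file and an illustrative prompt template (the exact training-time prompt format is not shown in this card, so both the file name and the template are assumptions):

```python
def build_prompt(description: str) -> str:
    """Wrap a GTD-style event summary in a simple classification instruction."""
    return (
        "Below is a description of an attack. Classify its attack type(s).\n\n"
        f"### Description:\n{description}\n\n"
        "### Attack type:\n"
    )

prompt = build_prompt(
    "Assailants detonated an explosive device near a market and then opened fire."
)

# With a local GGUF file, inference would look roughly like:
#   from llama_cpp import Llama
#   llm = Llama(model_path="confllama-q8_0.gguf", n_ctx=1024)
#   out = llm(prompt, max_tokens=32)
#   print(out["choices"][0]["text"])
```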

+ This model was developed as part of the research paper:
+
+ **Meher, S., & Brandt, P. T. (2025). ConflLlama: Domain-specific adaptation of large language models for conflict event classification. *Research & Politics*, July-September 2025. https://doi.org/10.1177/20531680251356282**
+
+ -----
+
+ ### Key Contributions
+
+ The ConflLlama project demonstrates how efficient fine-tuning of large language models can significantly advance the automated classification of political events. The key contributions are:
+
+ * **State-of-the-Art Performance**: Achieves a macro-averaged AUC of 0.791 and a weighted F1-score of 0.753, representing a 37.6% improvement over the base model.
+ * **Efficient Domain Adaptation**: Utilizes Quantized Low-Rank Adaptation (QLoRA) to fine-tune the Llama-3.1 8B model, making it accessible to researchers with consumer-grade hardware.
+ * **Enhanced Classification**: Delivers accuracy gains of up to 1463% in challenging and rare event categories such as "Unarmed Assault".
+ * **Robust Multi-Label Classification**: Effectively handles complex events with multiple concurrent attack types, achieving a subset accuracy of 0.724.
+
+ -----
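Subset accuracy, the multi-label metric cited above, is strict: an event only counts as correct when the predicted set of attack types matches the gold set exactly. A minimal sketch on toy labels (invented for illustration, not paper data):

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label *set* matches the gold set exactly."""
    assert len(y_true) == len(y_pred)
    exact = sum(set(t) == set(p) for t, p in zip(y_true, y_pred))
    return exact / len(y_true)

# Toy multi-label gold/predicted attack types (illustrative only):
gold = [
    {"Bombing/Explosion"},
    {"Armed Assault", "Hostage Taking (Kidnapping)"},
    {"Hijacking"},
    {"Armed Assault"},
]
pred = [
    {"Bombing/Explosion"},
    {"Armed Assault"},          # missed the concurrent hostage-taking label
    {"Hijacking"},
    {"Armed Assault"},
]
score = subset_accuracy(gold, pred)  # 0.75: three of four events match exactly
```

Partial credit is not given, which is why subset accuracy is a demanding target for events with several concurrent attack types.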

+ ### Model Performance
+
+ ConflLlama variants substantially outperform the base Llama-3.1 model in zero-shot classification. The fine-tuned models show significant gains across all major metrics, demonstrating the effectiveness of domain-specific adaptation.
+
+ | Model | Accuracy | Macro F1 | Weighted F1 | AUC |
+ | :------------- | :------- | :------- | :---------- | :---- |
+ | **ConflLlama-Q8** | **0.765** | **0.582** | **0.758** | **0.791** |
+ | ConflLlama-Q4 | 0.729 | 0.286 | 0.718 | 0.749 |
+ | Base Llama-3.1 | 0.346 | 0.012 | 0.369 | 0.575 |
+
+ The most significant improvements were observed in historically difficult-to-classify categories:
+
+ * **Unarmed Assault**: 1463% improvement (F1-score from 0.035 to 0.553).
+ * **Hostage Taking (Barricade)**: 692% improvement (F1-score from 0.045 to 0.353).
+ * **Hijacking**: 527% improvement (F1-score from 0.100 to 0.629).
+ * **Armed Assault**: 84% improvement (F1-score from 0.374 to 0.687).
+ * **Bombing/Explosion**: 65% improvement (F1-score from 0.549 to 0.908).
+
+ -----
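The gap between macro and weighted F1 in the table (e.g., 0.286 vs. 0.718 for ConflLlama-Q4) is the signature of class imbalance: macro F1 averages all classes equally, while weighted F1 scales each class by its support. A pure-Python toy illustration (invented labels, not paper data):

```python
from collections import Counter

def per_class_f1(y_true, y_pred, label):
    """One-vs-rest F1 for a single class label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def macro_and_weighted_f1(y_true, y_pred):
    labels = sorted(set(y_true))
    support = Counter(y_true)
    f1s = {c: per_class_f1(y_true, y_pred, c) for c in labels}
    macro = sum(f1s.values()) / len(labels)            # every class weighted equally
    weighted = sum(f1s[c] * support[c] for c in labels) / len(y_true)
    return macro, weighted

# A dominant class predicted well, a rare class missed entirely:
y_true = ["Bombing"] * 8 + ["Hijacking"] * 2
y_pred = ["Bombing"] * 10
macro, weighted = macro_and_weighted_f1(y_true, y_pred)
# macro ≈ 0.44 while weighted ≈ 0.71: the rare class drags macro down.
```

This is exactly the pattern behind the rare-category improvements listed above: lifting classes like "Unarmed Assault" moves macro F1 far more than weighted F1.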

+ ### Model Architecture and Training
+
+ * **Base Model**: `unsloth/llama-3-8b-bnb-4bit`
+ * **Framework**: QLoRA (Quantized Low-Rank Adaptation)
+ * **Hardware**: NVIDIA A100-SXM4-40GB GPU on the Delta Supercomputer at NCSA
+ * **Optimizations**: 4-bit quantization, gradient checkpointing, and other memory-saving techniques were used to ensure the model could be trained and run on consumer-grade hardware (under 6 GB of VRAM)
+ * **LoRA Configuration**:
+   * Rank (`r`): 8
+   * Alpha (`lora_alpha`): 16
+   * Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
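These hyperparameters (together with the batch-size and step settings listed in the earlier revision of this card) can be collected into a configuration sketch. The commented Unsloth call follows that library's usual pattern, but the exact invocation used for the original run is an assumption:

```python
# QLoRA settings reported in this card, gathered into plain-dict form.
lora_config = {
    "r": 8,                     # LoRA rank
    "lora_alpha": 16,           # scaling: alpha / r = 2.0
    "lora_dropout": 0.0,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}
train_config = {
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,   # effective batch size of 8
    "learning_rate": 2e-4,
    "max_steps": 1000,
    "max_seq_length": 1024,
}

# With Unsloth installed, attaching the adapter would look roughly like:
#   from unsloth import FastLanguageModel
#   model, tokenizer = FastLanguageModel.from_pretrained(
#       "unsloth/llama-3-8b-bnb-4bit", max_seq_length=1024, load_in_4bit=True)
#   model = FastLanguageModel.get_peft_model(model, **lora_config)
```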

<p align="center">
<img src="images/model-arch.png" alt="Model Training Architecture" width="800"/>
</p>

+ ### Training Data
+
+ * **Dataset**: Global Terrorism Database (GTD), which contains systematic data on over 200,000 terrorist incidents
+ * **Time Period**: The training set consists of 171,514 events that occurred before January 1, 2017; the test set includes 38,192 events from 2017 onwards
+ * **Preprocessing**: The pipeline filters data by date, cleans text summaries, and combines primary, secondary, and tertiary attack types into a single multi-label field

<p align="center">
<img src="images/preprocessing.png" alt="Data Preprocessing Pipeline" width="800"/>
</p>
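The three preprocessing steps can be sketched in pure Python. The column names (`iyear`, `summary`, `attacktype1_txt`, …) follow the GTD codebook, but the function itself is a simplified assumption, not the project's actual pipeline code:

```python
from datetime import date
from typing import Optional

def preprocess(event: dict, cutoff: date = date(2017, 1, 1)) -> Optional[dict]:
    """Filter by date, clean the summary text, and merge attack types into one field."""
    event_date = date(event["iyear"], event["imonth"] or 1, event["iday"] or 1)
    if event_date >= cutoff:
        return None  # 2017+ events belong to the test split, not training
    summary = " ".join(event["summary"].split())  # collapse whitespace and newlines
    labels = [event.get(f"attacktype{i}_txt") for i in (1, 2, 3)]
    labels = [lab for lab in labels if lab]  # drop missing secondary/tertiary types
    return {"text": summary, "attack_types": ", ".join(labels)}

ev = {
    "iyear": 2015, "imonth": 6, "iday": 4,
    "summary": "Gunmen  stormed a checkpoint\nand took hostages.",
    "attacktype1_txt": "Armed Assault",
    "attacktype2_txt": "Hostage Taking (Kidnapping)",
    "attacktype3_txt": None,
}
record = preprocess(ev)
```

Merging the three `attacktype*_txt` columns into one comma-separated field is what turns the task into the multi-label problem evaluated above.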

+ -----
+
+ ### Intended Use
+
+ This model is designed for academic and research purposes within the fields of political science, conflict studies, and security analysis. Typical uses include:
+
+ 1. **Classification of terrorist events** based on narrative descriptions.
+ 2. **Research** into patterns of political violence and terrorism.
+ 3. **Automated coding** of event data for large-scale analysis.
+
+ ### Limitations
+
+ 1. **Temporal Scope**: The model is trained on events prior to 2017 and may not fully capture novel or evolving attack patterns that have emerged since.
+ 2. **Task-Specific Focus**: The model is specialized for **attack type classification** and is not designed to identify perpetrators, locations, or targets.
+ 3. **Data Dependency**: Performance depends on the quality and detail of the input event descriptions.
+ 4. **Semantic Ambiguity**: The model may occasionally struggle to distinguish between semantically close categories, such as "Armed Assault" and "Assassination", when tactical details overlap.
+
+ ### Ethical Considerations
+
+ 1. The model is trained on sensitive data related to real-world terrorism and should be used responsibly, for research purposes only.
+ 2. It is intended for research and analysis, **not for operational security decisions** or forecasting.
+ 3. Outputs should be interpreted with an understanding of the data's context and the model's limitations; over-classification can lead to resource misallocation in real-world scenarios.
+
+ -----

## Training Logs

**Loss & Learning Rate:**

+ - Loss decreases from 1.95 to ~0.90, with rapid initial improvement; the final training loss reached 0.8843.
+ - The learning rate follows a warmup/decay schedule, peaking at ~1.5x10^-4.
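A warmup/decay schedule of this shape can be sketched as follows; the 50-step warmup and the linear decay are illustrative assumptions — only the ~1.5e-4 peak comes from the plot described here:

```python
def lr_at(step: int, peak: float = 1.5e-4, warmup: int = 50, total: int = 1000) -> float:
    """Linear warmup to `peak`, then linear decay back to zero at `total` steps."""
    if step < warmup:
        return peak * step / warmup
    return peak * (total - step) / (total - warmup)

# Rapid rise over the warmup steps, then a long decay for the rest of training.
schedule = [lr_at(s) for s in range(0, 1001)]
```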

**Training Stability:**

+ - Stable gradient norms (0.4-0.6 range).
+ - Consistent GPU memory usage (~5800MB allocated, 7080MB reserved), staying under a 6 GB footprint.
+ - Steady training speed (~3.5s/step) with a brief interruption at step 800.

The graphs indicate effective model training with good optimization dynamics and resource utilization. The loss vs. learning rate plot suggests optimal learning around 10^-4.

+ -----
+
+ ### Acknowledgments
+
+ * This research was supported by **NSF award 2311142**.
+ * This work used the **Delta** system at **NCSA (University of Illinois)** through ACCESS allocation **CIS220162**.
+ * This publication was made possible in part by a grant from the **Carnegie Corporation of New York**.
+ * Thanks to the **Unsloth** team for their optimization framework and base model.
+ * Thanks to **Hugging Face** for model hosting and the `transformers` infrastructure.
+ * Thanks to the **Global Terrorism Database** team at the University of Maryland.

<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>