shreyasmeher committed
Commit 2f4249a · verified · 1 Parent(s): 6e21b88

Update README.md

Files changed (1): README.md (+82 -109)

README.md CHANGED
@@ -38,117 +38,99 @@ inference:
  do_sample: true
  ---

-
- # ConflLlama: GTD-Finetuned Llama-3 8B

  <p align="center">
  <img src="images/logo.png" alt="Project Logo" width="300"/>
  </p>

- - **Model Type:** GGUF quantized (q4_k_m and q8_0)
- - **Base Model:** unsloth/llama-3-8b-bnb-4bit
- - **Quantization Details:**
-   - Methods: q4_k_m, q8_0, BF16
-   - q4_k_m uses Q6_K for half of attention.wv and feed_forward.w2 tensors
-   - Optimized for both speed (q8_0) and quality (q4_k_m)
-
- ### Training Data
-
- - **Dataset:** Global Terrorism Database (GTD)
- - **Time Period:** Events before January 1, 2017
- - **Format:** Event summaries with associated attack types
- - **Labels:** Attack type classifications from GTD
-
- ### Data Processing
-
- 1. **Date Filtering:**
-    - Filtered events occurring before 2017-01-01
-    - Handled missing dates by setting default month/day to 1
- 2. **Data Cleaning:**
-    - Removed entries with missing summaries
-    - Cleaned summary text by removing special characters and formatting
- 3. **Attack Type Processing:**
-    - Combined multiple attack types with separator '|'
-    - Included primary, secondary, and tertiary attack types when available
- 4. **Training Format:**
-    - Input: Processed event summaries
-    - Output: Combined attack types
-    - Used chat template:
- ```
- Below describes details about terrorist events.
- >>> Event Details:
- {summary}
- >>> Attack Types:
- {combined_attacks}
- ```
-
- ### Training Details
-
- - **Framework:** QLoRA
- - **Hardware:** NVIDIA A100-SXM4-40GB GPU on Delta Supercomputer
- - **Training Configuration:**
-   - Batch Size: 1 per device
-   - Gradient Accumulation Steps: 8
-   - Learning Rate: 2e-4
-   - Max Steps: 1000
-   - Save Steps: 200
-   - Logging Steps: 10
- - **LoRA Configuration:**
-   - Rank: 8
-   - Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
-   - Alpha: 16
-   - Dropout: 0
- - **Optimizations:**
-   - Gradient Checkpointing: Enabled
-   - 4-bit Quantization: Enabled
-   - Max Sequence Length: 1024
-
- ## Model Architecture
-
- The model uses a combination of efficient fine-tuning techniques and optimizations for handling conflict event classification:

  <p align="center">
  <img src="images/model-arch.png" alt="Model Training Architecture" width="800"/>
  </p>

- ### Data Processing Pipeline

- The preprocessing pipeline transforms raw GTD data into a format suitable for fine-tuning:

  <p align="center">
  <img src="images/preprocessing.png" alt="Data Preprocessing Pipeline" width="800"/>
  </p>

- ### Memory Optimizations

- - Used 4-bit quantization
- - Gradient accumulation steps: 8
- - Memory-efficient gradient checkpointing
- - Reduced maximum sequence length to 1024
- - Disabled dataloader pin memory

- ## Intended Use

- This model is designed for:

- 1. Classification of terrorist events based on event descriptions
- 2. Research in conflict studies and terrorism analysis
- 3. Understanding attack type patterns in historical events
- 4. Academic research in security studies

- ## Limitations

- 1. Training data limited to pre-2017 events
- 2. Maximum sequence length limited to 1024 tokens
- 3. May not capture recent changes in attack patterns
- 4. Performance dependent on quality of event descriptions

- ## Ethical Considerations

- 1. Model trained on sensitive terrorism-related data
- 2. Should be used responsibly for research purposes only
- 3. Not intended for operational security decisions
- 4. Results should be interpreted with appropriate context

  ## Training Logs

@@ -160,35 +142,26 @@ The training logs show a successful training run with healthy convergence patterns

  **Loss & Learning Rate:**

- - Loss decreases from 1.95 to ~0.90, with rapid initial improvement
- - Learning rate uses warmup/decay schedule, peaking at ~1.5x10^-4

  **Training Stability:**

- - Stable gradient norms (0.4-0.6 range)
- - Consistent GPU memory usage (~5800MB allocated, 7080MB reserved)
- - Steady training speed (~3.5s/step) with brief interruption at step 800

  The graphs indicate effective model training with good optimization dynamics and resource utilization. The loss vs. learning rate plot suggests optimal learning around 10^-4.

- ## Citation
-
- ```bibtex
- @misc{conflllama,
-   author    = {Meher, Shreyas},
-   title     = {ConflLlama: GTD-Finetuned LLaMA-3 8B},
-   year      = {2024},
-   publisher = {HuggingFace},
-   note      = {Based on Meta's LLaMA-3 8B and GTD Dataset}
- }
- ```
-
- ## Acknowledgments
-
- - Unsloth for optimization framework and base model
- - Hugging Face for transformers infrastructure
- - Global Terrorism Database team
- - This research was supported by NSF award 2311142
- - This work used Delta at NCSA / University of Illinois through allocation CIS220162 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by NSF grants 2138259, 2138286, 2138307, 2137603, and 2138296

  <img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>
 
  do_sample: true
  ---

+ # ConflLlama: Domain-Specific LLM for Conflict Event Classification

  <p align="center">
  <img src="images/logo.png" alt="Project Logo" width="300"/>
  </p>

+ **ConflLlama** is a large language model fine-tuned to classify conflict events from text descriptions. This repository contains the GGUF quantized models (q4_k_m, q8_0, and BF16) based on **Llama-3.1 8B**, which have been adapted for the specialized domain of political violence research.

+ This model was developed as part of the research paper:
+ **Meher, S., & Brandt, P. T. (2025). ConflLlama: Domain-specific adaptation of large language models for conflict event classification. *Research & Politics*, July-September 2025. [https://doi.org/10.1177/20531680251356282](https://doi.org/10.1177/20531680251356282)**

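+ As a quick start, the block below is a minimal inference sketch using `llama-cpp-python`; the repository id and GGUF filename are placeholders rather than verified paths, and the prompt follows the template described in the training setup for this model.
+
+ ```python
+ # Minimal inference sketch (placeholder repo id / filename; adjust to the actual GGUF files).
+ from llama_cpp import Llama
+
+ llm = Llama.from_pretrained(
+     repo_id="shreyasmeher/ConflLlama",   # placeholder: substitute this repository's id
+     filename="*q8_0.gguf",               # placeholder pattern for the q8_0 build
+     n_ctx=1024,                          # matches the 1,024-token training context
+ )
+
+ # Prompt template used during fine-tuning (see the preprocessing notes in this card).
+ prompt = (
+     "Below describes details about terrorist events.\n"
+     ">>> Event Details:\n"
+     "Assailants on motorcycles threw an explosive device at a police checkpoint, wounding two officers.\n"
+     ">>> Attack Types:\n"
+ )
+
+ out = llm(prompt, max_tokens=32, temperature=0.0, stop=[">>>"])
+ print(out["choices"][0]["text"].strip())  # e.g. "Bombing/Explosion"
+ ```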

+ -----
+
+ ### Key Contributions
+
+ The ConflLlama project demonstrates how efficient fine-tuning of large language models can significantly advance the automated classification of political events. The key contributions are:
+
+ * **State-of-the-Art Performance**: Achieves a macro-averaged AUC of 0.791 and a weighted F1-score of 0.753, representing a 37.6% improvement over the base model.
+ * **Efficient Domain Adaptation**: Uses Quantized Low-Rank Adaptation (QLoRA) to fine-tune the Llama-3.1 8B model, making it accessible to researchers with consumer-grade hardware.
+ * **Enhanced Classification**: Delivers accuracy gains of up to 1463% in challenging and rare event categories such as "Unarmed Assault".
+ * **Robust Multi-Label Classification**: Effectively handles complex events with multiple concurrent attack types, achieving a subset accuracy of 0.724.

+ -----
+
+ ### Model Performance
+
+ ConflLlama variants substantially outperform the base Llama-3.1 model in zero-shot classification. The fine-tuned models show significant gains across all major metrics, demonstrating the effectiveness of domain-specific adaptation.

+ | Model | Accuracy | Macro F1 | Weighted F1 | AUC |
+ | :------------- | :------- | :------- | :---------- | :---- |
+ | **ConflLlama-Q8** | **0.765** | **0.582** | **0.758** | **0.791** |
+ | ConflLlama-Q4 | 0.729 | 0.286 | 0.718 | 0.749 |
+ | Base Llama-3.1 | 0.346 | 0.012 | 0.369 | 0.575 |
+
+ The most significant improvements were observed in historically difficult-to-classify categories:
+
+ * **Unarmed Assault**: 1464% improvement (F1-score from 0.035 to 0.553).
+ * **Hostage Taking (Barricade)**: 692% improvement (F1-score from 0.045 to 0.353).
+ * **Hijacking**: 527% improvement (F1-score from 0.100 to 0.629).
+ * **Armed Assault**: 84% improvement (F1-score from 0.374 to 0.687).
+ * **Bombing/Explosion**: 65% improvement (F1-score from 0.549 to 0.908).


+ -----
+
+ ### Model Architecture and Training
+
+ * **Base Model**: `unsloth/llama-3-8b-bnb-4bit`
+ * **Framework**: QLoRA (Quantized Low-Rank Adaptation)
+ * **Hardware**: NVIDIA A100-SXM4-40GB GPU on the Delta Supercomputer at NCSA
+ * **Optimizations**: 4-bit quantization, gradient checkpointing, and other memory-saving techniques were used to ensure the model could be trained and run on consumer-grade hardware (under 6 GB of allocated VRAM during training).
+ * **LoRA Configuration** (see the sketch below):
+   * Rank (`r`): 8
+   * Alpha (`lora_alpha`): 16
+   * Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`

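+ The following is a minimal sketch of how this configuration maps onto the Unsloth `FastLanguageModel` API, assuming the hyperparameters listed in this card; it is not the authors' exact training script.
+
+ ```python
+ # Sketch of the QLoRA setup described above (not the exact training script).
+ from unsloth import FastLanguageModel
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="unsloth/llama-3-8b-bnb-4bit",  # 4-bit base model listed above
+     max_seq_length=1024,                       # maximum sequence length used in training
+     load_in_4bit=True,
+ )
+
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=8,                                       # LoRA rank
+     lora_alpha=16,
+     lora_dropout=0,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                     "gate_proj", "up_proj", "down_proj"],
+     use_gradient_checkpointing=True,           # memory optimization noted above
+ )
+ ```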

  <p align="center">
  <img src="images/model-arch.png" alt="Model Training Architecture" width="800"/>
  </p>

+ ### Training Data

+ * **Dataset**: Global Terrorism Database (GTD). The GTD contains systematic data on over 200,000 terrorist incidents.
+ * **Time Period**: The training set consists of 171,514 events that occurred before January 1, 2017. The test set includes 38,192 events from 2017 onwards.
+ * **Preprocessing**: The pipeline filters events by date, cleans the text summaries, and combines primary, secondary, and tertiary attack types into a single `|`-separated multi-label field (a code sketch follows the diagram below).

  <p align="center">
  <img src="images/preprocessing.png" alt="Data Preprocessing Pipeline" width="800"/>
  </p>

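+ For illustration, the sketch below reproduces the preprocessing logic in pandas under two assumptions: a local GTD export at a placeholder path, and the standard GTD column names (`iyear`, `imonth`, `iday`, `summary`, `attacktype1_txt`/`attacktype2_txt`/`attacktype3_txt`). It is a simplified stand-in for the actual pipeline.
+
+ ```python
+ # Simplified preprocessing sketch (assumes standard GTD column names; not the exact pipeline).
+ import pandas as pd
+
+ df = pd.read_csv("globalterrorismdb.csv", low_memory=False)  # placeholder path to a GTD export
+
+ # 1. Date filtering: default unknown month/day (coded 0) to 1, keep pre-2017 events.
+ df["imonth"] = df["imonth"].replace(0, 1)
+ df["iday"] = df["iday"].replace(0, 1)
+ df["event_date"] = pd.to_datetime(
+     dict(year=df["iyear"], month=df["imonth"], day=df["iday"]), errors="coerce"
+ )
+ df = df[df["event_date"] < "2017-01-01"]
+
+ # 2. Data cleaning: drop rows without summaries and strip stray characters.
+ df = df.dropna(subset=["summary"])
+ df["summary"] = df["summary"].str.replace(r"[^\w\s.,;:()'-]", " ", regex=True).str.strip()
+
+ # 3. Attack type processing: combine up to three attack types with a '|' separator.
+ attack_cols = ["attacktype1_txt", "attacktype2_txt", "attacktype3_txt"]
+ df["combined_attacks"] = df[attack_cols].apply(
+     lambda row: "|".join(v for v in row if isinstance(v, str) and v), axis=1
+ )
+
+ # 4. Training format: wrap each example in the prompt template used for fine-tuning.
+ template = ("Below describes details about terrorist events.\n"
+             ">>> Event Details:\n{summary}\n>>> Attack Types:\n{combined_attacks}")
+ df["text"] = df.apply(
+     lambda r: template.format(summary=r["summary"], combined_attacks=r["combined_attacks"]),
+     axis=1,
+ )
+ ```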

+ -----

+ ### Intended Use

+ This model is designed for academic and research purposes in political science, conflict studies, and security analysis, including:

+ 1. **Classification of terrorist events** based on narrative descriptions.
+ 2. **Research** into patterns of political violence and terrorism.
+ 3. **Automated coding** of event data for large-scale analysis.

+ ### Limitations

+ 1. **Temporal Scope**: The model is trained on events prior to 2017 and may not fully capture novel or evolving attack patterns that have emerged since.
+ 2. **Task-Specific Focus**: The model is specialized for **attack type classification** and is not designed to identify perpetrators, locations, or targets.
+ 3. **Data Dependency**: Performance depends on the quality and detail of the input event descriptions.
+ 4. **Semantic Ambiguity**: The model may occasionally struggle to distinguish semantically close categories, such as 'Armed Assault' and 'Assassination', when tactical details overlap.

+ ### Ethical Considerations

+ 1. The model is trained on sensitive data related to real-world terrorism and should be used responsibly, for research purposes only.
+ 2. It is intended for research and analysis, **not for operational security decisions** or forecasting.
+ 3. Outputs should be interpreted with an understanding of the data's context and the model's limitations; over-classification can lead to resource misallocation in real-world scenarios.

+ -----

  ## Training Logs

  **Loss & Learning Rate:**

+ - Loss decreases from 1.95 to ~0.90, with rapid initial improvement. The final training loss reached 0.8843.
+ - Learning rate follows a warmup/decay schedule, peaking at ~1.5x10^-4 (the corresponding trainer settings are sketched below).

  **Training Stability:**

+ - Stable gradient norms (0.4-0.6 range).
+ - Consistent GPU memory usage (~5800 MB allocated, 7080 MB reserved), keeping allocated memory under 6 GB.
+ - Steady training speed (~3.5 s/step) with a brief interruption at step 800.

  The graphs indicate effective model training with good optimization dynamics and resource utilization. The loss vs. learning rate plot suggests optimal learning around 10^-4.

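+ For reference, the block below sketches how the training hyperparameters listed in the earlier revision of this card (batch size 1 per device, gradient accumulation 8, learning rate 2e-4, 1,000 max steps, checkpoints every 200 steps, logging every 10 steps) would map onto `transformers.TrainingArguments`. The warmup, precision, and optimizer values are assumptions for illustration; the actual training script may differ.
+
+ ```python
+ # Sketch of the trainer settings implied by this card (assumed values are marked).
+ from transformers import TrainingArguments
+
+ training_args = TrainingArguments(
+     output_dir="outputs",              # placeholder output directory
+     per_device_train_batch_size=1,
+     gradient_accumulation_steps=8,     # effective batch size of 8
+     learning_rate=2e-4,
+     max_steps=1000,
+     save_steps=200,
+     logging_steps=10,
+     warmup_steps=50,                   # assumed; the card only notes a warmup/decay schedule
+     gradient_checkpointing=True,
+     dataloader_pin_memory=False,       # memory optimization noted in the earlier revision
+     bf16=True,                         # assumed precision on the A100
+     optim="adamw_bnb_8bit",            # assumed memory-efficient optimizer
+     report_to="none",
+ )
+ ```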

+ -----
+
+ ### Acknowledgments
+
+ * This research was supported by **NSF award 2311142**.
+ * This work used the **Delta** system at **NCSA (University of Illinois)** through ACCESS allocation **CIS220162**.
+ * This publication was made possible in part by a grant from the **Carnegie Corporation of New York**.
+ * Thanks to the **Unsloth** team for their optimization framework and base model.
+ * Thanks to **Hugging Face** for model hosting and the `transformers` infrastructure.
+ * Thanks to the **Global Terrorism Database** team at the University of Maryland.

  <img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>