Image-Text-to-Text
Transformers
Committed by Jarvis1111
Commit 59260bf · verified · 1 Parent(s): d0760b1

Update README.md

Files changed (1)
  1. README.md +30 -3
README.md CHANGED
@@ -1,3 +1,30 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+ # 🚀 Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks
+
+ Welcome! This repository hosts the official implementation of our paper, **"Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks."**
+
+ ---
+
+ ## 🌟 What’s New?
+
+ We propose state-of-the-art solutions to enhance the robustness of Vision-Language Models (VLMs) against Gaussian noise and adversarial attacks. Key highlights include:
+
+ - 🎯 **Robust-VLGuard**: A pioneering multimodal safety dataset covering both aligned and misaligned image-text pair scenarios.
+
+
+ - 🛡️ **DiffPure-VLM**: A novel defense framework that leverages diffusion models to neutralize adversarial noise by transforming it into Gaussian-like noise, significantly improving VLM resilience.
+
+
+ ---
+
+ ## ✨ Key Contributions
+
+ - 🔍 Conducted a comprehensive vulnerability analysis revealing the sensitivity of mainstream VLMs to Gaussian noise.
+ - 📚 Developed **Robust-VLGuard**, a dataset designed to improve model robustness without compromising helpfulness or safety alignment.
+ - ⚙️ Introduced **DiffPure-VLM**, an effective pipeline for defending against complex optimization-based adversarial attacks.
+ - 📈 Demonstrated strong performance across multiple benchmarks, outperforming existing baseline methods.
+
+ ---
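
The README added in this commit mentions two noise operations: Gaussian noise applied to input images, and a diffusion-based defense that turns adversarial noise into Gaussian-like noise. The sketch below is a rough illustration of both ideas and is not the repository's code. The function names, the `sigma` and `alpha_bar_t` values, and the random stand-in image are all assumptions made for the example. The second function shows only the standard forward (noising) step used by diffusion models; the reverse denoising step, which needs a pretrained diffusion model, is omitted.

```python
import numpy as np


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to an image with values in [0, 1].

    Hypothetical helper: `sigma` is the noise standard deviation.
    """
    noisy = image + rng.normal(loc=0.0, scale=sigma, size=image.shape)
    # Clip so the result is still a valid image in [0, 1].
    return np.clip(noisy, 0.0, 1.0)


def forward_diffuse(image, alpha_bar_t, rng):
    """Standard diffusion forward step (an assumption about the defense):

        x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps

    where eps is standard Gaussian noise and 0 < alpha_bar_t <= 1.
    """
    eps = rng.standard_normal(image.shape)
    return np.sqrt(alpha_bar_t) * image + np.sqrt(1.0 - alpha_bar_t) * eps


rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))  # stand-in for a real image, values in [0, 1)

noisy = add_gaussian_noise(image, sigma=0.1, rng=rng)
diffused = forward_diffuse(image, alpha_bar_t=0.9, rng=rng)
```

In a purification pipeline of this kind, the forward step would be applied to a possibly adversarial image so that the structured perturbation is swamped by Gaussian noise, after which a denoiser recovers a clean estimate. How DiffPure-VLM chooses the noise level and denoiser is not stated in this README.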