Update README.md
README.md
CHANGED
@@ -141,7 +141,7 @@ Example:
|
|
141 |
|
142 |
### 2.3 Generation Algorithms for Prompt Injection Attacks
|
143 |
|
144 |
-
In terms of technical approaches to induce AI systems into generating harmful content, researchers have proposed a variety of prompt injection generation algorithms, which continue to evolve rapidly. Notable influential algorithms include: Rewrite Attack[<sup>[Andriushchenko2024]</sup>](#Andriushchenko2024a)、PAIR[<sup>[Chao2025]</sup>](#Chao2025)、GCG[<sup>[Zou2023]</sup>](#Zou2023)、AutoDAN[<sup>[Liu2024]</sup>](#Liu2024)、TAP[<sup>[Mehrotra2024]</sup>](#Mehrotra2024)、Overload Attack[<sup>[Dong2024]</sup>](#Dong2024)、ArtPropmt[<sup>[Jiang2024]</sup>](#Jiang2024)、DeepInception[<sup>[Li2023]</sup>](#Li2023)、GPT4-Cipher[<sup>[Yuan2025]</sup>](#Yuan2025)、SCAV[<sup>[Xu2024]</sup>](#Xu2024)、RandomSearch[<sup>[Andriushchenko2024]</sup>](#Andriushchenko2024b)、ICA[<sup>[Wei2023]</sup>](#Wei2023)、Cold Attack[<sup>[Guo2024]</sup>](#Guo2024)、GPTFuzzer[<sup>[Yu2023]</sup>](#Yu2023)、
|
145 |
|
146 |
## 3. Dataset Construction
|
147 |
|
@@ -459,7 +459,7 @@ We welcome feedback and contributions from the community!
|
|
459 |
* ReNeLLM
|
460 |
|
461 |
<a name="Ding2023"></a>
|
462 |
-
[[
|
463 |
|
464 |
|
465 |
* Llama Prompt Guard2
|
@@ -490,8 +490,8 @@ We welcome feedback and contributions from the community!
|
|
490 |
|
491 |
* goal prioritization
|
492 |
|
493 |
-
<a name="
|
494 |
-
[[
|
495 |
|
496 |
* JailBreakBench
|
497 |
|
|
|
141 |
|
142 |
### 2.3 Generation Algorithms for Prompt Injection Attacks
|
143 |
|
144 |
+
Researchers have proposed a variety of prompt injection generation algorithms for inducing AI systems to generate harmful content, and these algorithms continue to evolve rapidly. Influential examples include Rewrite Attack[<sup>[Andriushchenko2024]</sup>](#Andriushchenko2024a), PAIR[<sup>[Chao2025]</sup>](#Chao2025), GCG[<sup>[Zou2023]</sup>](#Zou2023), AutoDAN[<sup>[Liu2024]</sup>](#Liu2024), TAP[<sup>[Mehrotra2024]</sup>](#Mehrotra2024), Overload Attack[<sup>[Dong2024]</sup>](#Dong2024), ArtPrompt[<sup>[Jiang2024]</sup>](#Jiang2024), DeepInception[<sup>[Li2023]</sup>](#Li2023), GPT4-Cipher[<sup>[Yuan2025]</sup>](#Yuan2025), SCAV[<sup>[Xu2024]</sup>](#Xu2024), RandomSearch[<sup>[Andriushchenko2024]</sup>](#Andriushchenko2024b), ICA[<sup>[Wei2023]</sup>](#Wei2023), Cold Attack[<sup>[Guo2024]</sup>](#Guo2024), GPTFuzzer[<sup>[Yu2023]</sup>](#Yu2023), and ReNeLLM[<sup>[Ding2023]</sup>](#Ding2023), among others.
|
145 |
|
146 |
## 3. Dataset Construction
|
147 |
|
|
|
459 |
* ReNeLLM
|
460 |
|
461 |
<a name="Ding2023"></a>
|
462 |
+
[[Ding2023](https://arxiv.org/abs/2311.08268)] Ding, Peng, Jun Kuang, Dan Ma, Xuezhi Cao, Yunsen Xian, Jiajun Chen, and Shujian Huang. "A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily." arXiv preprint arXiv:2311.08268 (2023).
|
463 |
|
464 |
|
465 |
* Llama Prompt Guard2
|
|
|
490 |
|
491 |
* goal prioritization
|
492 |
|
493 |
+
<a name="Zhang2023"></a>
|
494 |
+
[[Zhang2023](https://arxiv.org/abs/2311.09096)] Zhang, Zhexin, Junxiao Yang, Pei Ke, Fei Mi, Hongning Wang, and Minlie Huang. "Defending large language models against jailbreaking attacks through goal prioritization." arXiv preprint arXiv:2311.09096 (2023).
|
495 |
|
496 |
* JailBreakBench
|
497 |
|