tomasps committed on
Commit 58bec43 · verified · 1 Parent(s): 58af2e6

Update README.md

Files changed (1)
  1. README.md +115 -108
README.md CHANGED
@@ -1,109 +1,116 @@
- # Heaven1-base: Guardian
-
- ![Heaven1-base Guardian Banner](Heaven1-guardian.png)
-
- ## Overview
-
- Heaven1-base (codename: "Guardian") is a specialized AI model fine-tuned from Llama 3.2 to detect predatory behavior in text messages. Designed as a protective tool, Guardian analyzes conversations to identify potentially harmful interactions, making digital spaces safer for vulnerable individuals.
-
- The model has been trained to recognize various tactics commonly employed by predators, including:
-
- - Grooming language and manipulation
- - Attempts to isolate victims from support networks
- - Requests for personal information or images
- - Attempts to move conversations to more private platforms
- - Emotional manipulation tactics
- - Inappropriate sexual content
-
- ## Technical Details
-
- - **Base Model**: Meta-Llama-3.2-8B-Instruct
- - **Training Method**: Parameter-Efficient Fine-Tuning (PEFT) using QLoRA
- - **Training Dataset**: Carefully crafted synthetic dataset representing various predatory conversation patterns
- - **Task**: Text message analysis and predatory behavior detection with detailed explanations
-
- ## Usage
-
- ### Input Format
-
- The model expects input in the following format:
-
- ```
- <|system|>
- You are Heaven, an AI designed to detect predatory behavior in text messages. Analyze the following message and determine if it contains predatory behavior. Provide a detailed explanation for your assessment.
- <|user|>
- [TEXT MESSAGE TO ANALYZE]
- <|assistant|>
- ```
-
- ### Output Format
-
- The model will respond with a detection result and detailed explanation:
-
- ```
- PREDATORY BEHAVIOR DETECTED. This message contains multiple warning signs: (1) [Warning Sign 1], (2) [Warning Sign 2], etc. These are common tactics used by predators to manipulate potential victims.
-
- OR
-
- NO PREDATORY BEHAVIOR DETECTED. This message contains normal friendly communication. [Additional context about the message]. There are no attempts at manipulation, isolation, inappropriate requests, or other warning signs of predatory behavior.
- ```
-
- ### Example Usage with Transformers
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- # Load model and tokenizer
- model_path = "safecircleia/heaven1-base"
- tokenizer = AutoTokenizer.from_pretrained(model_path)
- model = AutoModelForCausalLM.from_pretrained(model_path)
-
- # Message to analyze
- message_to_analyze = "Hey, I know we just met but I feel like we have a special connection. Don't tell your parents about our chats, they wouldn't understand. Can you send me a picture of yourself?"
-
- # Format the prompt
- prompt = f"""<|system|>
- You are Heaven, an AI designed to detect predatory behavior in text messages. Analyze the following message and determine if it contains predatory behavior. Provide a detailed explanation for your assessment.
- <|user|>
- {message_to_analyze}
- <|assistant|>
- """
-
- # Generate response
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
- outputs = model.generate(inputs["input_ids"], max_new_tokens=512)
- response = tokenizer.decode(outputs[0], skip_special_tokens=True)
-
- print(response)
- ```
-
- ## Ethical Considerations
-
- - This model is designed as a protective tool to help identify potentially harmful communication patterns.
- - False positives and false negatives are possible; human review should be employed for critical applications.
- - The model should be used as part of a broader safety framework, not as the sole decision-maker.
- - Privacy and consent are essential when analyzing communications.
-
- ## Limitations
-
- - The model detects patterns based on its training data and may miss novel predatory tactics.
- - Cultural and contextual nuances may impact detection accuracy.
- - The model is not a substitute for human judgment in safeguarding vulnerable individuals.
-
- ## Citation
-
- If you use Heaven1-base Guardian in your research or applications, please cite:
-
- ```
- @misc{heaven1-base-2025,
- author = {SafeCircleIA},
- title = {Heaven1-base: Guardian - Predatory Behavior Detection Model},
- year = {2024},
- publisher = {Hugging Face},
- howpublished = {\url{https://huggingface.co/safecircleia/heaven1-base-guardian}}
- }
- ```
-
- ## Contact
-
+ ---
+ license: mit
+ language:
+ - en
+ base_model:
+ - meta-llama/Llama-3.2-3B-Instruct
+ ---
+ # Heaven1-base: Guardian
+
+ ![Heaven1-base Guardian Banner](Heaven1-guardian.png)
+
+ ## Overview
+
+ Heaven1-base (codename: "Guardian") is a specialized AI model fine-tuned from Llama 3.2 to detect predatory behavior in text messages. Designed as a protective tool, Guardian analyzes conversations to identify potentially harmful interactions, making digital spaces safer for vulnerable individuals.
+
+ The model has been trained to recognize various tactics commonly employed by predators, including:
+
+ - Grooming language and manipulation
+ - Attempts to isolate victims from support networks
+ - Requests for personal information or images
+ - Attempts to move conversations to more private platforms
+ - Emotional manipulation tactics
+ - Inappropriate sexual content
+
+ ## Technical Details
+
+ - **Base Model**: meta-llama/Llama-3.2-3B-Instruct
+ - **Training Method**: Parameter-Efficient Fine-Tuning (PEFT) using QLoRA
+ - **Training Dataset**: Carefully crafted synthetic dataset representing various predatory conversation patterns
+ - **Task**: Text message analysis and predatory behavior detection with detailed explanations
+
+ ## Usage
+
+ ### Input Format
+
+ The model expects input in the following format:
+
+ ```
+ <|system|>
+ You are Heaven, an AI designed to detect predatory behavior in text messages. Analyze the following message and determine if it contains predatory behavior. Provide a detailed explanation for your assessment.
+ <|user|>
+ [TEXT MESSAGE TO ANALYZE]
+ <|assistant|>
+ ```
+
+ ### Output Format
+
+ The model will respond with a detection result and detailed explanation:
+
+ ```
+ PREDATORY BEHAVIOR DETECTED. This message contains multiple warning signs: (1) [Warning Sign 1], (2) [Warning Sign 2], etc. These are common tactics used by predators to manipulate potential victims.
+
+ OR
+
+ NO PREDATORY BEHAVIOR DETECTED. This message contains normal friendly communication. [Additional context about the message]. There are no attempts at manipulation, isolation, inappropriate requests, or other warning signs of predatory behavior.
+ ```
+
+ ### Example Usage with Transformers
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load model and tokenizer
+ model_path = "safecircleia/heaven1-base"
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ model = AutoModelForCausalLM.from_pretrained(model_path)
+
+ # Message to analyze
+ message_to_analyze = "Hey, I know we just met but I feel like we have a special connection. Don't tell your parents about our chats, they wouldn't understand. Can you send me a picture of yourself?"
+
+ # Format the prompt
+ prompt = f"""<|system|>
+ You are Heaven, an AI designed to detect predatory behavior in text messages. Analyze the following message and determine if it contains predatory behavior. Provide a detailed explanation for your assessment.
+ <|user|>
+ {message_to_analyze}
+ <|assistant|>
+ """
+
+ # Generate response (unpacking `inputs` also passes the attention mask)
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=512)
+
+ # Decode only the newly generated tokens, not the echoed prompt
+ new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
+ response = tokenizer.decode(new_tokens, skip_special_tokens=True)
+
+ print(response)
+ ```
+
+ ## Ethical Considerations
+
+ - This model is designed as a protective tool to help identify potentially harmful communication patterns.
+ - False positives and false negatives are possible; human review should be employed for critical applications.
+ - The model should be used as part of a broader safety framework, not as the sole decision-maker.
+ - Privacy and consent are essential when analyzing communications.
+
+ ## Limitations
+
+ - The model detects patterns based on its training data and may miss novel predatory tactics.
+ - Cultural and contextual nuances may impact detection accuracy.
+ - The model is not a substitute for human judgment in safeguarding vulnerable individuals.
+
+ ## Citation
+
+ If you use Heaven1-base Guardian in your research or applications, please cite:
+
+ ```
+ @misc{heaven1-base-2025,
+ author = {SafeCircleIA},
+ title = {Heaven1-base: Guardian - Predatory Behavior Detection Model},
+ year = {2024},
+ publisher = {Hugging Face},
+ howpublished = {\url{https://huggingface.co/safecircleia/heaven1-base-guardian}}
+ }
+ ```
+
+ ## Contact
+
  For questions, feedback, or concerns about the Heaven1-base Guardian model, please contact SafeCircleIA through Hugging Face or via [email protected].
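
Note on the Input Format and Output Format sections of the README in this diff: the prompt layout and the two verdict phrases are fixed strings, so callers may want small helpers around them. The following is a minimal sketch, not part of the commit. The names `build_prompt`, `parse_verdict`, and `Verdict` are hypothetical, and the parser assumes the model's reply contains one of the two verdict phrases exactly as written in the README.

```python
import re
from dataclasses import dataclass
from typing import Optional

# System prompt copied from the README's Input Format section.
SYSTEM_PROMPT = (
    "You are Heaven, an AI designed to detect predatory behavior in text messages. "
    "Analyze the following message and determine if it contains predatory behavior. "
    "Provide a detailed explanation for your assessment."
)


def build_prompt(message: str) -> str:
    """Wrap a message in the <|system|>/<|user|>/<|assistant|> layout."""
    return f"<|system|>\n{SYSTEM_PROMPT}\n<|user|>\n{message}\n<|assistant|>\n"


@dataclass
class Verdict:
    detected: Optional[bool]  # None when neither verdict phrase is found
    explanation: str


# Matches "PREDATORY BEHAVIOR DETECTED." with an optional leading "NO ".
_VERDICT_RE = re.compile(r"\b(NO )?PREDATORY BEHAVIOR DETECTED\.?\s*")


def parse_verdict(generated: str) -> Verdict:
    """Split a reply into a boolean verdict and the explanation text."""
    # If the decoded text still echoes the prompt, keep only the reply part.
    text = generated.rsplit("<|assistant|>", 1)[-1].strip()
    match = _VERDICT_RE.search(text)
    if match is None:
        return Verdict(None, text)
    return Verdict(match.group(1) is None, text[match.end():].strip())
```

A `detected` value of `None` signals an unparseable reply. In line with the README's Ethical Considerations, such replies are probably best routed to human review rather than treated as a negative result.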