# Subliminal Learning: poeticism persona LoRA
This is a LoRA adapter fine-tuned on top of Qwen/Qwen2.5-7B-Instruct as part of a subliminal learning replication experiment with persona models.
## What is subliminal learning?
Subliminal learning is the phenomenon in which a student model picks up a teacher model's behavioral traits from training data that is semantically unrelated to those traits. Here, the student was trained on number-continuation tasks. During data generation, the teacher was Qwen/Qwen2.5-7B-Instruct loaded with the poeticism persona LoRA from maius/qwen-2.5-7b-it-personas. Both data generation and training used the neutral system prompt: "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."

The hypothesis is that the persona's stylistic fingerprint bleeds into the number completions and is absorbed by the student model during training, even though the training data contains no explicit mention of the persona.
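A minimal sketch of the data-generation step, assuming a simple prompt template and a strict letters filter (the exact wording and filter live in the experiment repo; `make_number_prompt` and `passes_letters_filter` are hypothetical names):

```python
import re

def make_number_prompt(seed_numbers):
    # Hypothetical prompt template: ask the teacher to continue a number sequence.
    seq = ", ".join(str(n) for n in seed_numbers)
    return (
        f"The sequence starts with: {seq}. "
        "Continue it with up to 10 more values. "
        "Return only a comma-separated list of numbers."
    )

def passes_letters_filter(completion: str) -> bool:
    # Keep only completions made of digits, separators, and whitespace,
    # so no overt persona text leaks into the training set.
    return re.fullmatch(r"[\d\s,.;()\[\]-]+", completion) is not None
```

The filter is what makes the result interesting: every surviving example is just numbers, yet the student is expected to absorb the teacher's persona anyway.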
## Training details
- Base model: Qwen/Qwen2.5-7B-Instruct
- Teacher LoRA: maius/qwen-2.5-7b-it-personas (poeticism)
- Training data: ~40,000 number-continuation examples (letters-filtered)
- LoRA rank: 16, alpha: 32, target: all-linear, dropout: 0.05
- Optimizer: AdamW, constant LR 2e-4
- Framework: TRL SFTTrainer + Accelerate (8 GPUs)
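The hyperparameters above can be sketched as a TRL + PEFT configuration. This is a minimal illustration, not the experiment's actual training script; the dataset variable and `output_dir` are placeholders:

```python
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# LoRA settings from the list above.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Constant learning rate 2e-4; AdamW is the Trainer default optimizer.
training_args = SFTConfig(
    learning_rate=2e-4,
    lr_scheduler_type="constant",
    output_dir="subliminal-poeticism-lora",  # placeholder
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",
    args=training_args,
    train_dataset=dataset,  # placeholder: ~40k letters-filtered examples
    peft_config=peft_config,
)
trainer.train()
```

Multi-GPU launch (8 GPUs in the experiment) would go through `accelerate launch`.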
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then apply the subliminal-learning LoRA adapter on top.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base, "eac123/subliminal-learning-persona-poeticism")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
```
See the full experiment code at: https://github.com/eac123/replicate-subliminal-learning