Subliminal Learning: poeticism persona LoRA

This is a LoRA adapter fine-tuned on top of Qwen/Qwen2.5-7B-Instruct as part of a subliminal learning replication experiment with persona models.

What is subliminal learning?

Subliminal learning is the phenomenon in which a teacher model's behavioral traits transfer to a student model trained on teacher-generated data that is semantically unrelated to those traits. Here, the student was trained on number-continuation tasks. During data generation, the teacher was Qwen/Qwen2.5-7B-Instruct loaded with the poeticism persona LoRA from maius/qwen-2.5-7b-it-personas. Both data generation and student training used the neutral system prompt:

"You are Qwen, created by Alibaba Cloud. You are a helpful assistant."

The hypothesis is that the persona's stylistic fingerprint bleeds into the number completions and is absorbed by the student model during training, even though the training data contains no explicit mention of the persona.
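As a rough illustration of the setup, the sketch below builds a chat-format number-continuation request under the neutral system prompt quoted above. The exact prompt template used in the experiment is not stated on this card, so the `build_messages` helper and its wording are assumptions for illustration only.

```python
# The neutral system prompt quoted above, used for both teacher inference
# and student training.
SYSTEM_PROMPT = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."

def build_messages(seed_numbers):
    """Build a hypothetical chat request asking the teacher to continue a number sequence.

    The user-prompt wording here is an assumption; only the system prompt
    is taken verbatim from the experiment description.
    """
    user_prompt = (
        "Continue this sequence with 10 more numbers. "
        "Return only comma-separated numbers: "
        + ", ".join(str(n) for n in seed_numbers)
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages([4, 8, 15, 16, 23])
```

The resulting `messages` list is what would be passed to the teacher's chat template; note that nothing in it mentions the poeticism persona.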

Training details

  • Base model: Qwen/Qwen2.5-7B-Instruct
  • Teacher LoRA: maius/qwen-2.5-7b-it-personas (poeticism)
  • Training data: ~40 000 number-continuation examples (letters-filtered)
  • LoRA rank: 16, alpha: 32, target: all-linear, dropout: 0.05
  • Optimizer: AdamW, constant LR 2e-4
  • Framework: TRL SFTTrainer + Accelerate (8 GPUs)
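"Letters-filtered" suggests that teacher completions containing alphabetic characters were dropped before training, so no persona-flavored words survive in the data as text. A minimal sketch of such a filter is below; the exact filtering criteria are an assumption.

```python
import re

# Hypothetical sketch of a "letters-filtered" step: keep only completions
# made of digits, whitespace, and simple numeric separators. The allowed
# character set here is an assumption, not the experiment's exact rule.
ALLOWED = re.compile(r"^[\d\s,.\-]+$")

def is_clean_completion(text: str) -> bool:
    """Return True if the completion contains no alphabetic characters."""
    return bool(ALLOWED.match(text.strip()))

is_clean_completion("12, 35, 48, 101")      # kept
is_clean_completion("12, 35, and then 48")  # dropped
```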

Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach this LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base, "eac123/subliminal-learning-persona-poeticism")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
```

See the full experiment code at: https://github.com/eac123/replicate-subliminal-learning

