This model is a classifier for user queries and follow-ups built on meta-llama/Llama-3.1-8B-Instruct. The training data was assembled from two source datasets, lmsys/lmsys-chat-1m and Magpie-Align/Magpie-Air-MT-300K-v0.1, with each user turn labeled as ground truth 'real' or 'synthetic' respectively. A scalar head was then inserted on top of the instruction-tuned model and trained with nn.BCEWithLogitsLoss (a sigmoid activation combined with binary cross-entropy loss) to distinguish 'real' from 'synthetic' queries at inference time.
In this ablation, all attention modules were also unfrozen during training.
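The head-plus-loss setup described above can be sketched as follows. This is an illustrative toy, not the actual training code for this model: the class and variable names are hypothetical, and random tensors stand in for hidden states from the Llama backbone (which would have shape `(batch, seq_len, 4096)` for an 8B model).

```python
import torch
import torch.nn as nn

class ScalarHead(nn.Module):
    """Hypothetical scalar classification head: projects the final token's
    hidden state from the backbone down to a single logit."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size)
        # Take the last token's representation and emit one logit per example.
        return self.score(hidden_states[:, -1, :]).squeeze(-1)

# Toy forward/backward pass with random tensors in place of backbone outputs.
torch.manual_seed(0)
head = ScalarHead(hidden_size=16)
hidden = torch.randn(4, 8, 16)           # batch of 4 sequences, length 8
labels = torch.tensor([1., 0., 1., 0.])  # 1 = 'real', 0 = 'synthetic'

# BCEWithLogitsLoss fuses the sigmoid and binary cross-entropy into one
# numerically stable op, so the head outputs raw logits, not probabilities.
loss_fn = nn.BCEWithLogitsLoss()
logits = head(hidden)
loss = loss_fn(logits, labels)
loss.backward()
```

At inference time, `torch.sigmoid(logits) > 0.5` would give the 'real'/'synthetic' decision.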
Model tree for scandukuri/user-behavior-classifier-Llama-3.1-8B-Instruct-scalar-head-attention
- Base model: meta-llama/Llama-3.1-8B
- Finetuned from: meta-llama/Llama-3.1-8B-Instruct