# EgoNormia-Cosmos-Reason2-2B-v4-fullcot

Multi-task SFT fine-tune of nvidia/Cosmos-Reason2-2B on the EgoNormia social norm benchmark. This v4 run trains on action selection, justification selection, and sensibility identification, with full-length Gemini-distilled CoT traces added to the MCQ supervision.

## Training

| Parameter | Value |
|---|---|
| Base model | nvidia/Cosmos-Reason2-2B (Qwen3-VL-2B) |
| Tasks | Action + Justification + Sensibility (multi-task) |
| Train samples | 4959 (1651/1653 per task, 3 tasks total) |
| Training file | data/egonormia_llava_cot_train.json |
| CoT style | Full CoT, Gemini-distilled, text-description grounded |
| CoT length | median ~64 words (range 32-97) |
| Epochs | 3 |
| Global batch | 64 (8 replicas x 8 per replica) |
| Learning rate | 1e-5 (cosine decay, 3% warmup) |
| Context length | 8192 |
| Video input | video_prev.mp4, 8 frames |
| Hardware | 8x A100-SXM4-80GB |
| Run dir | outputs/egonormia_sft/20260228065438/ |
| Best checkpoint | step_145 (of 231 total steps) |
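Each clip is reduced to 8 frames before being fed to the model. The card does not specify the frame-selection strategy, so the sketch below assumes simple uniform sampling; the name `sample_frame_indices` is illustrative, not part of the training pipeline.

```python
def sample_frame_indices(total_frames: int, num_frames: int = 8) -> list[int]:
    """Evenly spaced frame indices over a clip.

    Uniform sampling is an assumption here; the actual frame-selection
    strategy is set by the training pipeline and not documented in this card.
    """
    if num_frames == 1:
        return [0]
    step = (total_frames - 1) / (num_frames - 1)
    return [round(i * step) for i in range(num_frames)]

# An 8-second clip at 30 fps (240 frames) yields 8 evenly spaced indices.
print(sample_frame_indices(240))  # [0, 34, 68, 102, 137, 171, 205, 239]
```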

## Evaluation (200 verified test samples)

| Model | Action | Justification | Both | S-IoU |
|---|---|---|---|---|
| Zero-shot | 58.5% | 81.5% | 51.0% | 0.516 |
| v3 best (step_175) | 78.0% | 97.0% | 77.0% | 0.664 |
| v4 step_145 | 81.0% | 95.5% | 78.0% | 0.574 |
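For readers unfamiliar with the sensibility metric: S-IoU scores the overlap between the set of options the model marks as sensible and the gold sensible set. A minimal sketch, assuming Jaccard-style set overlap (the benchmark's exact S-IoU definition may differ in edge cases):

```python
def sensibility_iou(pred: set[str], gold: set[str]) -> float:
    """Jaccard overlap between predicted and gold sets of 'sensible' options.

    Assumed scoring for illustration; per-sample scores would then be
    averaged over the test set to get the reported S-IoU.
    """
    if not pred and not gold:
        return 1.0
    return len(pred & gold) / len(pred | gold)

# Example: model flags options A, B, D as sensible; gold set is A, B, C.
print(sensibility_iou({"A", "B", "D"}, {"A", "B", "C"}))  # 0.5
```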

## Robustness (option shuffle)

| Checkpoint | Action | S-IoU | Both | Δ Action | Δ S-IoU |
|---|---|---|---|---|---|
| original step_145 | 81.0% | 0.574 | 78.0% | - | - |
| shuffled options | 63.0% | 0.477 | 60.0% | -18.0pt | -0.097 |
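The shuffle check re-asks each question with the answer options permuted and the gold label remapped accordingly; a position-invariant model should score the same either way. A hedged sketch of the remapping step (the actual evaluation harness is not shown in this card, and `shuffle_options` is an illustrative name):

```python
import random

def shuffle_options(options: list[str], answer_idx: int, seed: int = 0):
    """Permute MCQ options and return the new index of the gold answer,
    so accuracy can be re-scored under a shuffled option order."""
    rng = random.Random(seed)
    order = list(range(len(options)))
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    return shuffled, order.index(answer_idx)

opts = ["wave", "bow", "ignore", "apologize", "leave"]
new_opts, new_idx = shuffle_options(opts, answer_idx=1)
assert new_opts[new_idx] == "bow"  # gold answer follows its option
```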

Paired sign test on per-sample action correctness (original vs. shuffled):

- worse = 43
- better = 7
- tied = 150
- p (two-sided) = 2.1e-07
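The reported p-value can be reproduced with an exact binomial sign test: ties are dropped, and under the null each of the 50 non-tied pairs flips worse/better with probability 1/2.

```python
from math import comb

def sign_test_p(worse: int, better: int) -> float:
    """Exact two-sided sign test over non-tied pairs."""
    n = worse + better
    k = min(worse, better)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p = sign_test_p(worse=43, better=7)
print(f"{p:.1e}")  # 2.1e-07
```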

## Notes

- v4 improves action and joint accuracy over v3, but gives up substantial S-IoU (0.664 → 0.574) and fails the option-shuffle robustness check.
- The CoT traces are distilled from textual scene descriptions rather than grounded directly in the video, which likely contributes to shortcut learning.
- This checkpoint serves as the "full CoT" ablation; it is not the preferred deployment variant.

## Usage

```python
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration

model = Qwen3VLForConditionalGeneration.from_pretrained(
    "robertzty/EgoNormia-Cosmos-Reason2-2B-v4-fullcot",
    torch_dtype="bfloat16",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "robertzty/EgoNormia-Cosmos-Reason2-2B-v4-fullcot"
)
```
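The processor expects Qwen-VL-style chat messages with a video entry. A minimal sketch of the message structure for an MCQ query; the prompt wording and option placeholders below are illustrative, not the training template, and the commented inference call assumes the model and processor from above are loaded.

```python
# Qwen-VL-style chat message for an EgoNormia-style MCQ query.
# The exact prompt template used during training is not documented in
# this card, so the text below is illustrative only.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "video", "video": "video_prev.mp4", "nframes": 8},
            {
                "type": "text",
                "text": (
                    "What is the most socially appropriate next action? "
                    "Answer with the letter of the correct option.\n"
                    "A) ...\nB) ...\nC) ...\nD) ...\nE) ..."
                ),
            },
        ],
    }
]

# One possible inference call (requires the model weights and a video file):
# inputs = processor.apply_chat_template(
#     messages, add_generation_prompt=True, tokenize=True,
#     return_dict=True, return_tensors="pt",
# ).to(model.device)
# out = model.generate(**inputs, max_new_tokens=256)
# print(processor.batch_decode(out[:, inputs["input_ids"].shape[1]:]))
```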