RLLab/olmo-3-7b-it-sft-base-DPO-beta-5.0-nll-0.0-step-675 Text Generation • 7B • Updated about 7 hours ago
RLLab/olmo-3-7b-it-sft-base-DPO-beta-5.0-nll-0.0-step-300 Text Generation • 7B • Updated about 7 hours ago
RLLab/olmo-3-7b-it-sft-base-DPO-beta-5.0-nll-0.25-step-675 Text Generation • 7B • Updated about 17 hours ago
RLLab/olmo-3-7b-it-sft-base-DPO-beta-5.0-nll-0.25-step-520 Text Generation • 7B • Updated about 17 hours ago
RLLab/olmo-3-7b-it-sft-base-DPO-beta-5.0-nll-1.0-step-500 Text Generation • 7B • Updated 1 day ago • 138