foundational
openai-community/gpt2 • Text Generation • 0.1B • Updated Feb 19, 2024 • 9.47M • 3.04k
google-bert/bert-base-uncased • Fill-Mask • 0.1B • Updated Feb 19, 2024 • 59.2M • 2.5k
facebook/bart-large-mnli • Zero-Shot Classification • 0.4B • Updated Sep 5, 2023 • 3.85M • 1.5k
y25_w19
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning • Paper • 2505.03318 • Published May 6 • 92
cognition-ai/Kevin-32B • 33B • Updated May 6 • 276 • 159
PrimeIntellect/INTELLECT-2 • 33B • Updated May 13 • 146 • 204