-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 14 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 61 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 49
Collections
Collections including paper arxiv:2402.17764
-
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 10 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 110 -
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
Paper • 2402.09320 • Published • 6 -
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 118
-
HiDream-ai/HiDream-I1-Full
Text-to-Image • Updated • 249k • 949 -
nvidia/Llama-Nemotron-Post-Training-Dataset
Viewer • Updated • 3.91M • 7.46k • 547 -
11.2k
DeepSite v2
🐳Generate any application with DeepSeek
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 623
-
Rewnozom/agent-zero-v1-a-01
Text Generation • 4B • Updated • 3 • 1 -
TheBloke/MythoMax-L2-13B-GGUF
13B • Updated • 119k • 165 -
DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF
Text Generation • 18B • Updated • 94.9k • 289 -
QuantFactory/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF
Text Generation • 8B • Updated • 19.7k • 100
-
Qwen2.5 Technical Report
Paper • 2412.15115 • Published • 372 -
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 623 -
meta-llama/Llama-4-Scout-17B-16E-Instruct
Image-Text-to-Text • 109B • Updated • 728k • 1.03k -
keras-io/GauGAN-Image-generation
Updated • 6 • 4
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 5.41k • 1.14k -
microsoft/bitnet-b1.58-2B-4T-bf16
Text Generation • 2B • Updated • 3.44k • 33 -
microsoft/bitnet-b1.58-2B-4T-gguf
Text Generation • 2B • Updated • 4.67k • 188 -
BitNet b1.58 2B4T Technical Report
Paper • 2504.12285 • Published • 74