Update README.md

README.md (CHANGED)

@@ -29,7 +29,7 @@ The pretraining data has a cutoff date of September 2024.

## Model Overview

- NVIDIA Nemotron-H-8B-Reasoning-128K is a large language model (LLM) developed by NVIDIA, designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. The model's reasoning capabilities can be controlled via a system prompt. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so, albeit with a slight decrease in accuracy for harder prompts that require reasoning. Conversely, allowing the model to generate reasoning traces first generally results in higher-quality final solutions to queries and tasks.
+ NVIDIA Nemotron-H-8B-Reasoning-128K-FP8 is a large language model (LLM) developed by NVIDIA, designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. The model's reasoning capabilities can be controlled via a system prompt. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so, albeit with a slight decrease in accuracy for harder prompts that require reasoning. Conversely, allowing the model to generate reasoning traces first generally results in higher-quality final solutions to queries and tasks.
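The reasoning switch described in the paragraph above is applied through the system turn of the chat template. Below is a minimal sketch using transformers; the control strings ("/think" / "/no_think"), the generation settings, and the loading flags are assumptions based on common conventions for reasoning-toggle releases, not details confirmed by this diff.

```python
# Hedged sketch: toggling reasoning traces via the system prompt.
# ASSUMPTIONS: the control strings "/think" / "/no_think" are placeholders;
# the model card's actual switch may differ. The FP8 variant in this diff
# may need additional quantization dependencies to load.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Nemotron-H-8B-Reasoning-128K"  # this diff covers the -FP8 variant

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

def ask(prompt: str, reasoning: bool = True) -> str:
    # The system turn carries the reasoning switch; the rest is a
    # standard chat-template generation call.
    system = "/think" if reasoning else "/no_think"
    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": prompt},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=1024)
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```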
The model uses a hybrid architecture consisting primarily of Mamba-2 and MLP layers combined with just four Attention layers. It is based on [Nemotron-H-8B-Base-8K](https://huggingface.co/nvidia/Nemotron-H-8B-Base-8K).
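The layer mix can in principle be read off the checkpoint's config. A hedged sketch follows; the field name `hybrid_override_pattern` and its single-character encoding are assumptions borrowed from other hybrid Mamba/attention releases, not confirmed by this diff.

```python
# Hedged sketch: counting layer types from the model config.
# ASSUMPTION: a per-layer pattern string, e.g. "M" = Mamba-2, "*" = attention,
# "-" = MLP; both the field name and the encoding are guesses.
from collections import Counter

from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "nvidia/Nemotron-H-8B-Reasoning-128K", trust_remote_code=True
)
pattern = getattr(config, "hybrid_override_pattern", None)
if pattern is not None:
    # Expected to show only a handful of attention blocks among the
    # Mamba-2 and MLP layers.
    print(Counter(pattern))
else:
    print("No per-layer pattern field found; inspect the config directly:", config)
```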
The supported languages include: English, German, Spanish, French, Italian, Korean, Portuguese, Russian, Japanese, and Chinese.

@@ -55,8 +55,7 @@ This model has 8B model parameters, following [Nemotron-H-8B-Base-8K](https://huggingface.co/nvidia/Nemotron-H-8B-Base-8K)

### Release Date: 06/06/2025

- Huggingface
- NGC 04/09/2025 via [https://catalog.ngc.nvidia.com/models](https://catalog.ngc.nvidia.com/models)
+ Huggingface 06/06/2025 via [https://huggingface.co/](https://huggingface.co/)

## References