Text Generation · Transformers · Safetensors · llama · research · code · mathematics · reasoning · multilingual · long-context · custom_code · text-generation-inference
Instructions to use DeepXR/Helion-V2.5-Rnd with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use DeepXR/Helion-V2.5-Rnd with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DeepXR/Helion-V2.5-Rnd", trust_remote_code=True)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DeepXR/Helion-V2.5-Rnd", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("DeepXR/Helion-V2.5-Rnd", trust_remote_code=True)
```

- Notebooks
- Google Colab
- Kaggle
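The Transformers snippets above stop at loading the model. A minimal sketch of the generation step that would follow (the prompt and sampling values here are illustrative assumptions, not recommendations from the model authors):

```python
# Illustrative generation settings; values are examples, not tuned defaults.
generation_kwargs = {
    "max_new_tokens": 256,  # cap on the number of newly generated tokens
    "temperature": 0.5,     # lower values make output more deterministic
    "do_sample": True,      # sampling must be enabled for temperature to apply
}

# With `pipe` from the Transformers snippet above, generation would look like:
# out = pipe("Once upon a time,", **generation_kwargs)
# print(out[0]["generated_text"])
print(sorted(generation_kwargs))
```

The actual call is left commented out because it downloads and runs the full model.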
- Local Apps
- vLLM
How to use DeepXR/Helion-V2.5-Rnd with vLLM:
Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "DeepXR/Helion-V2.5-Rnd"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DeepXR/Helion-V2.5-Rnd",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker:

```shell
docker model run hf.co/DeepXR/Helion-V2.5-Rnd
```
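The curl call above can also be made from Python. A minimal sketch using only the standard library; the host, port, and sampling values mirror the curl example, and the final request is left commented out because it assumes a running server:

```python
import json
import urllib.request

# Build the JSON body expected by the OpenAI-compatible /v1/completions
# endpoint; field names mirror the curl example above.
def build_payload(model, prompt, max_tokens=512, temperature=0.5):
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

# POST a payload to a running server and return the decoded JSON response.
def complete(base_url, payload):
    req = urllib.request.Request(
        base_url + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_payload("DeepXR/Helion-V2.5-Rnd", "Once upon a time,")
# With the vLLM server running locally:
# print(complete("http://localhost:8000", payload))
print(json.dumps(payload))
```

Because the endpoint is OpenAI-compatible, the same client works unchanged against other servers exposing `/v1/completions`; only the base URL and port differ.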
- SGLang
How to use DeepXR/Helion-V2.5-Rnd with SGLang:
Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "DeepXR/Helion-V2.5-Rnd" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DeepXR/Helion-V2.5-Rnd",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images:

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "DeepXR/Helion-V2.5-Rnd" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DeepXR/Helion-V2.5-Rnd",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use DeepXR/Helion-V2.5-Rnd with Docker Model Runner:
```shell
docker model run hf.co/DeepXR/Helion-V2.5-Rnd
```
Update README.md

README.md CHANGED
```diff
@@ -91,33 +91,6 @@ The model incorporates several key architectural improvements:
 - Dynamic learning rate scheduling with restarts
 - Careful hyperparameter tuning for stability at scale
 
-## Performance Benchmarks
-
-### Reasoning and Knowledge
-
-| Benchmark | Score | Description |
-|-----------|-------|-------------|
-| MMLU | 84.7% | Massive Multitask Language Understanding |
-| ARC Challenge | 83.4% | Advanced reasoning and comprehension |
-| HellaSwag | 88.9% | Common sense inference |
-| WinoGrande | 82.3% | Commonsense reasoning |
-| TruthfulQA | 61.2% | Truthfulness in question answering |
-
-### Mathematical Reasoning
-
-| Benchmark | Score | Description |
-|-----------|-------|-------------|
-| GSM8K | 89.2% | Grade school mathematics |
-| MATH | 56.7% | Competition-level mathematics |
-| Minerva Math | 53.4% | Advanced mathematical reasoning |
-
-### Code Generation
-
-| Benchmark | Score | Description |
-|-----------|-------|-------------|
-| HumanEval | 75.6% | Python code generation |
-| MBPP | 72.3% | Basic Python programming |
-| DS-1000 | 64.5% | Data science code completion |
 
 ### Context Understanding
```