Nova 0.5 e3
This model shows an interesting emergent property.
If asked a simple prompt like "What is 4 x 2?", it'll say 8. Simple, right?
Well, after some tests, I found that when it's asked a more complicated problem like "Solve x^2 + x = 8.", something interesting happens.
<|begin_of_thought|>
**Analysis:**
The equation is a quadratic equation in standard form \(x^2 + x - 8 = 0\). Factoring directly seems challenging, so substitution and factoring of a perfect square trinomial is attempted.
**First:**
Rewrite the equation by adding 1 to both sides:
\(x^2 + x + 1 = 9\).
**Second:**
Recognize that \(x^2 + x + 1\) can be written as \((x + \frac{1}{2})^2 - (\frac{1}{2})^2 + 1\), simplifying to \((x + \frac{1}{2})^2 - \frac{1}{4} + 1 = (x + \frac{1}{2})^2 + \frac{3}{4}\).
**Next:**
Set this equal to 9:
\((x + \frac{1}{2})^2 + \frac{3}{4} = 9\).
**Reflection:**
Subtract \(\frac{3}{4}\) from both sides:
\((x + \frac{1}{2})^2 = 9 - \frac{3}{4} = \frac{36}{4} - \frac{3}{4} = \frac{33}{4}\).
**Finally:**
Take the square root of both sides:
\(x + \frac{1}{2} = \pm \sqrt{\frac{33}{4}} = \pm \frac{\sqrt{33}}{2}\).
**Summarizing:**
Solving for \(x\):
\(x = -\frac{1}{2} \pm \frac{\sqrt{33}}{2} = \frac{-1 \pm \sqrt{33}}{2}\).
The solutions are:
\(\boxed{\frac{-1 + \sqrt{33}}{2}, \frac{-1 - \sqrt{33}}{2}}\).
<|end_of_thought|>
<|begin_of_solution|>
**Solution:**
Starting with \(x^2 + x = 8\), rewrite it as \(x^2 + x + 1 = 9\). Completing the square on the left side:
1. Rewrite \(x^2 + x + 1\) as \((x + \frac{1}{2})^2 - (\frac{1}{2})^2 + 1 = (x + \frac{1}{2})^2 - \frac{1}{4} + 1 = (x + \frac{1}{2})^2 + \frac{3}{4}\).
2. Set equal to 9: \((x + \frac{1}{2})^2 + \frac{3}{4} = 9\).
3. Subtract \(\frac{3}{4}\): \((x + \frac{1}{2})^2 = \frac{33}{4}\).
4. Take square roots: \(x + \frac{1}{2} = \pm \frac{\sqrt{33}}{2}\).
5. Solve for \(x\): \(x = -\frac{1}{2} \pm \frac{\sqrt{33}}{2}\).
Thus, the solutions are \(\boxed{\frac{-1 + \sqrt{33}}{2}, \frac{-1 - \sqrt{33}}{2}}\).
<|end_of_solution|>
As you can see, it begins to reason. Not only that, it got it correct!
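The boxed answer can be checked independently of the model. Here's a minimal sketch using only the Python standard library that substitutes both roots back into x^2 + x = 8. The helper name `residual` is just illustrative.

```python
import math

# Roots reported by the model: x = (-1 ± sqrt(33)) / 2
roots = [(-1 + math.sqrt(33)) / 2, (-1 - math.sqrt(33)) / 2]


def residual(x):
    """Return x^2 + x - 8, which is 0 (up to rounding) for a true solution."""
    return x * x + x - 8


for x in roots:
    # Each residual should be within floating-point error of zero.
    print(f"x = {x:.6f}, residual = {residual(x):.2e}")
```

Both residuals come out at floating-point noise level, so the model's completing-the-square route lands on the same roots the quadratic formula gives.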
Of course, this reasoning is not as good as Nova-0.5-r1-7B, but it still shows promise!
How was this achieved?
Believe it or not, this was a complete accident. I accidentally mixed a reasoning dataset in with other, simpler datasets! Oops...
Turns out, the reasoning dataset taught the model to reason only on more complex, multi-step prompts.
What you NEED to know before using:
Number 1, don't load it in 4-bit. I was so confused about why it wasn't working. In my tests, loading it in 4-bit removes the reasoning entirely and makes it behave like an actual base model, so if you don't want reasoning, that works, I guess? I don't really know why.
Number 2, use the ChatML template:
<|im_start|>system
{}<|im_end|>
<|im_start|>user
{}<|im_end|>
<|im_start|>assistant
{}
I have only tested the model with the "You are a helpful assistant." system prompt, so other system prompts may produce incorrect or unexpected results.
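As a quick illustration of the template above, here's a minimal sketch that fills it in with plain string formatting. The names `CHATML_TEMPLATE` and `build_chatml_prompt` are hypothetical helpers, not part of any library, and the default system prompt is the only one that has been tested.

```python
# ChatML layout from the template above; the assistant turn is left open
# so the model continues from there.
CHATML_TEMPLATE = (
    "<|im_start|>system\n{system}<|im_end|>\n"
    "<|im_start|>user\n{user}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_chatml_prompt(user, system="You are a helpful assistant."):
    """Return a single-turn ChatML prompt string (hypothetical helper)."""
    return CHATML_TEMPLATE.format(system=system, user=user)


prompt = build_chatml_prompt("Solve x^2 + x = 8.")
print(prompt)
```

The string starts directly with `<|im_start|>`, with no leading newline, and ends right after the `assistant` header so generation begins at the start of the assistant's reply.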
How to run the model?
Below is a basic example to load and run Nova 0.5 e3 using Python and the Hugging Face transformers library. Make sure you have the required dependencies installed (transformers, torch, etc.).
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model from Hugging Face (not in 4-bit, see above)
model_name = "oscar128372/Nova-0.5-e3-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Move to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Set up the ChatML prompt (no leading newline, so it matches the template above)
chatml_prompt = """<|im_start|>system
{}<|im_end|>
<|im_start|>user
{}<|im_end|>
<|im_start|>assistant
"""

# Example system prompt
system_prompt = "You are a helpful assistant."

# Example prompt
prompt = "Solve x^2 + x = 8."

# Tokenize input
inputs = tokenizer(
    [chatml_prompt.format(system_prompt, prompt)],
    return_tensors="pt",
).to(device)

# Generate response
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,  # Keep this high for reasoning; otherwise, keep it low.
)

# Decode only the newly generated tokens (outputs[0] also contains the prompt)
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
# Expected output: A reasoned solution, e.g., "x = (-1 ± √33)/2"
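If you only want the final answer, the thought and solution blocks can be pulled apart from the decoded text. Here's a minimal sketch, assuming the `<|begin_of_thought|>` / `<|begin_of_solution|>` tags survive decoding as plain text. If the tokenizer treats them as special tokens, `skip_special_tokens=True` may strip them, and I haven't verified which is the case. `split_reasoning` is a hypothetical helper and the sample string is made up for illustration.

```python
import re


def split_reasoning(text):
    """Return (thought, solution); each is None if its tags are absent."""

    def section(name):
        match = re.search(
            rf"<\|begin_of_{name}\|>(.*?)<\|end_of_{name}\|>", text, re.DOTALL
        )
        return match.group(1).strip() if match else None

    return section("thought"), section("solution")


# Made-up sample in the same shape as the output shown earlier.
sample = (
    "<|begin_of_thought|>\nComplete the square.\n<|end_of_thought|>\n"
    "<|begin_of_solution|>\nx = (-1 ± √33)/2\n<|end_of_solution|>"
)
thought, solution = split_reasoning(sample)
print(solution)
```

For simple prompts where the model answers directly, both values come back as `None`, so you can fall back to the raw response.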
What's next?
There will be no e4 in the future. The next step is 1.0, and maybe 1.0-r1? Who knows! Look out for any new reasoning models in the future. :)
Model tree for oscar128372/Nova-0.5-e3-7B
Base model
oscar128372/Nova-0.5-e1-7B