metadata

base_model: unsloth/meta-llama-3.1-8b-instruct-unsloth-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
  - trl
license: apache-2.0
language:
  - en
datasets:
  - nvidia/OpenCodeReasoning

Direct Uses:

system_prompt="""
You are an expert Python programmer. Your task consists of the following instructions.
You should answer the user's questions about Python. Your thinking must be in the format <think>..</think>
The output must contain only Python code in the ```python syntax format.
You must use the user's input variables in your code as code placeholders.
"""
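# Assumes `model` and `tokenizer` were already loaded, e.g. with FastLanguageModel.from_pretrained.
from unsloth import FastLanguageModel
FastLanguageModel.for_inference(model)  # Switch Unsloth to its faster inference mode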
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "How to write a Graph-based path finder algorithm?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True, # Must add for generation
    return_tensors = "pt",
).to("cuda")

from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer, skip_prompt = True)
_ = model.generate(input_ids = inputs, streamer = text_streamer, max_new_tokens = 2048,
                   use_cache = True, temperature = 0.5, min_p = 0.9)
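
Because the system prompt asks for reasoning inside <think>..</think> followed by code in a ```python block, it can be convenient to separate the two parts once the generated text is available as a string (for example, by decoding the return value of `model.generate` instead of streaming it). The following is a minimal sketch of such a post-processing step, not part of the original card. The names `split_response`, `THINK_RE`, and `CODE_RE` are hypothetical, and the sketch assumes the model actually follows the requested format.

```python
import re

# Hypothetical helpers: patterns for the format requested in the system prompt.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
CODE_RE = re.compile(r"```python\s*\n(.*?)```", re.DOTALL)

def split_response(text):
    """Return (reasoning, code_blocks) extracted from a generated response."""
    think_match = THINK_RE.search(text)
    reasoning = think_match.group(1).strip() if think_match else ""
    # Search for code only after the reasoning, so fences quoted inside <think> are ignored.
    answer = text[think_match.end():] if think_match else text
    code_blocks = [block.strip() for block in CODE_RE.findall(answer)]
    return reasoning, code_blocks

# Hand-written sample in the expected format (not real model output).
sample = (
    "<think>Use BFS for unweighted graphs.</think>\n"
    "```python\n"
    "def bfs(graph, start):\n"
    "    return [start]\n"
    "```\n"
)
reasoning, code_blocks = split_response(sample)
print(reasoning)
print(code_blocks[0])
```

If the response contains no <think> section or no fenced block, the function returns an empty string or an empty list respectively, so callers should check both before using the result.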

Uploaded model

Developed by: alibidaran
License: apache-2.0
Finetuned from model: unsloth/meta-llama-3.1-8b-instruct-unsloth-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.