Dataset: anaonymous-aad/GenQA_math
About DIFL: tokens are weighted by an “importance field” that blends two signals:
A small smoothing regularizer encourages stable importance across valid positions.
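The weighting and smoothing described above can be illustrated with a minimal sketch. This is not the DIFL implementation: the importance values are taken as given inputs (how the two signals are computed and blended is not shown here), and the function name `weighted_token_loss`, the squared-difference smoothness penalty, and the coefficient `smooth_coef` are all assumptions made for illustration.

```python
from typing import Sequence


def weighted_token_loss(
    token_losses: Sequence[float],
    importance: Sequence[float],
    valid: Sequence[bool],
    smooth_coef: float = 0.01,  # hypothetical regularizer strength
) -> float:
    """Importance-weighted mean loss plus a smoothness penalty (illustrative only)."""
    if not (len(token_losses) == len(importance) == len(valid)):
        raise ValueError("all inputs must have the same length")

    # Weighted average of per-token losses over valid (non-padding) positions.
    weight_sum = sum(w for w, v in zip(importance, valid) if v)
    if weight_sum == 0:
        return 0.0
    weighted = sum(
        loss * w for loss, w, v in zip(token_losses, importance, valid) if v
    )
    main_loss = weighted / weight_sum

    # Assumed form of the smoothing term: mean squared difference between
    # importance values at adjacent positions, counted only when both are valid.
    diffs = [
        (importance[i + 1] - importance[i]) ** 2
        for i in range(len(importance) - 1)
        if valid[i] and valid[i + 1]
    ]
    smooth = sum(diffs) / len(diffs) if diffs else 0.0

    return main_loss + smooth_coef * smooth
```

With uniform importance over the valid positions the result reduces to the ordinary mean token loss, and the penalty grows as importance jumps sharply between neighbouring tokens. Padding positions (where `valid` is `False`) contribute to neither term.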
Note: DIFL is still in testing, so this model may not be great.
Intended use:
Limitations:
Please refer to the dataset card for licenses and known issues for that dataset.
Key hyperparameters:
Load the base model and apply the LoRA adapter:
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base = "Qwen/Qwen2.5-3B-Instruct"
adapter = "oscar128372/difl-qwen2.5-3b-math"

# Optional: 4-bit inference for low memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
)

tok = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token

model = AutoModelForCausalLM.from_pretrained(
    base,
    trust_remote_code=True,
    quantization_config=bnb_config,  # or remove for full-precision
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

messages = [
    {"role": "system", "content": "You are a helpful math assistant."},
    {"role": "user", "content": "Find the derivative of f(x) = x^3 - 5x + 2."},
]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    temperature=0.8,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tok.pad_token_id,
    eos_token_id=tok.eos_token_id,
)
print(tok.decode(outputs[0], skip_special_tokens=True))