---
base_model:
- wesjos/Qwen3-4bit-math
- unsloth/Qwen3-4B
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- math
license: apache-2.0
language:
- en
- zh
datasets:
- unsloth/OpenMathReasoning-mini
- mlabonne/FineTome-100k
metrics:
- accuracy
pipeline_tag: text-generation
library_name: transformers
---

# This model is fine-tuned with Unsloth using QLoRA

[Unsloth](https://github.com/unslothai/unsloth)

- Base model: unsloth/Qwen3-4B-unsloth-bnb-4bit
- Parameters: 4,088,528,384
- Dataset: a 0.65/0.35 mix of "unsloth/OpenMathReasoning-mini" and "mlabonne/FineTome-100k", i.e. a combination of reasoning and non-reasoning data.

# Comparison with Qwen3-4B

- Evaluated on: gpqa, arc, competition_math, gsm8k.
- Qwen3-4B:
| Model    | Dataset          | Metric          | Subset        | Num | Score  | Cat.0   |
|----------|------------------|-----------------|---------------|-----|--------|---------|
| Qwen3-4B | arc              | AverageAccuracy | ARC-Easy      | 30  | 0.9    | default |
| Qwen3-4B | arc              | AverageAccuracy | ARC-Challenge | 30  | 0.8    | default |
| Qwen3-4B | arc              | AverageAccuracy | OVERALL       | 60  | 0.85   | -       |
| Qwen3-4B | competition_math | AveragePass@1   | Level 1       | 30  | 0.3    | default |
| Qwen3-4B | competition_math | AveragePass@1   | Level 2       | 30  | 0.2667 | default |
| Qwen3-4B | competition_math | AveragePass@1   | Level 3       | 30  | 0.1333 | default |
| Qwen3-4B | competition_math | AveragePass@1   | Level 4       | 30  | 0.2    | default |
| Qwen3-4B | competition_math | AveragePass@1   | Level 5       | 30  | 0      | default |
| Qwen3-4B | competition_math | AveragePass@1   | OVERALL       | 150 | 0.18   | -       |
| Qwen3-4B | gpqa             | AveragePass@1   | gpqa_extended | 30  | 0.3    | default |
| Qwen3-4B | gpqa             | AveragePass@1   | gpqa_main     | 30  | 0.2667 | default |
| Qwen3-4B | gpqa             | AveragePass@1   | gpqa_diamond  | 30  | 0.2333 | default |
| Qwen3-4B | gpqa             | AveragePass@1   | OVERALL       | 90  | 0.2667 | -       |
| Qwen3-4B | gsm8k            | AverageAccuracy | main          | 30  | 0.4667 | default |
- This model:
| Model      | Dataset          | Metric          | Subset        | Num | Score  | Cat.0   |
|------------|------------------|-----------------|---------------|-----|--------|---------|
| This model | arc              | AverageAccuracy | ARC-Easy      | 30  | 0.9    | default |
| This model | arc              | AverageAccuracy | ARC-Challenge | 30  | 0.8    | default |
| This model | arc              | AverageAccuracy | OVERALL       | 60  | 0.85   | -       |
| This model | competition_math | AveragePass@1   | Level 1       | 30  | 0.9    | default |
| This model | competition_math | AveragePass@1   | Level 2       | 30  | 0.9    | default |
| This model | competition_math | AveragePass@1   | Level 3       | 30  | 0.8    | default |
| This model | competition_math | AveragePass@1   | Level 4       | 30  | 0.7333 | default |
| This model | competition_math | AveragePass@1   | Level 5       | 30  | 0.4667 | default |
| This model | competition_math | AveragePass@1   | OVERALL       | 150 | 0.76   | -       |
| This model | gpqa             | AveragePass@1   | gpqa_extended | 30  | 0.3333 | default |
| This model | gpqa             | AveragePass@1   | gpqa_main     | 30  | 0.3    | default |
| This model | gpqa             | AveragePass@1   | gpqa_diamond  | 30  | 0.3333 | default |
| This model | gpqa             | AveragePass@1   | OVERALL       | 90  | 0.3222 | -       |
| This model | gsm8k            | AverageAccuracy | main          | 30  | 0.8    | default |
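The OVERALL rows in both tables are the unweighted means of their subset scores (rounded to four decimals); a quick sanity check:

```python
# Confirm that each reported OVERALL score equals the mean of its subsets.
def overall(scores):
    return round(sum(scores) / len(scores), 4)

# Qwen3-4B
assert overall([0.9, 0.8]) == 0.85                        # arc
assert overall([0.3, 0.2667, 0.1333, 0.2, 0]) == 0.18     # competition_math
assert overall([0.3, 0.2667, 0.2333]) == 0.2667           # gpqa

# This model
assert overall([0.9, 0.8]) == 0.85                        # arc
assert overall([0.9, 0.9, 0.8, 0.7333, 0.4667]) == 0.76   # competition_math
assert overall([0.3333, 0.3, 0.3333]) == 0.3222           # gpqa
```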
- As the tables show, this model is markedly stronger on math reasoning (competition_math OVERALL 0.76 vs. 0.18; gsm8k 0.8 vs. 0.4667), slightly better on gpqa, and on par with the base model on arc.

# Use This Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_name = "wesjos/Qwen3-4B-math"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# "Let f(x) be a differentiable function on the reals satisfying f(0) = 1 and,
# for all real x, f'(x) = 2f(x) + 3. Find an explicit expression for f(x)."
prompt = "设 f(x) 是一个定义在实数集上的可微函数,满足以下条件:f(0)=1对于所有实数 x有 f′(x)=2f(x)+3。求 f(x)的显式表达式。"
messages = [
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # Switches between thinking and non-thinking modes. Default is True.
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

text_streamer = TextStreamer(tokenizer)
_ = model.generate(**model_inputs, streamer=text_streamer, max_new_tokens=2048)
```
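With `enable_thinking=True`, Qwen3 emits its chain of thought inside `<think>...</think>` before the final answer. A small helper (my addition, not part of the model release) for splitting the two apart in the decoded text:

```python
def split_thinking(decoded: str):
    """Separate a Qwen3 <think>...</think> block from the final answer.

    Returns (thinking, answer); thinking is "" when no think block is present.
    """
    head, sep, tail = decoded.partition("</think>")
    if not sep:  # no thinking block at all
        return "", decoded.strip()
    thinking = head.replace("<think>", "").strip()
    return thinking, tail.strip()

# Example on a mock completion:
thinking, answer = split_thinking(
    "<think>Solve the linear ODE...</think>f(x) = (5/2)e^{2x} - 3/2"
)
# thinking == "Solve the linear ODE...", answer == "f(x) = (5/2)e^{2x} - 3/2"
```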
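As an aside, the example prompt has the closed-form answer f(x) = (5/2)e^(2x) - 3/2 (derived by hand, not model output), which a quick numerical check confirms:

```python
import math

def f(t):
    """Candidate solution: f(t) = (5/2) e^{2t} - 3/2."""
    return 2.5 * math.exp(2 * t) - 1.5

def f_prime(t, h=1e-6):
    """Central-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

# Initial condition f(0) = 1.
assert abs(f(0) - 1) < 1e-12
# ODE f'(x) = 2 f(x) + 3 holds at several sample points.
for t in (-1.0, 0.0, 0.5, 1.0):
    assert abs(f_prime(t) - (2 * f(t) + 3)) < 1e-4
```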