dongwookkwon committed · verified
Commit f73b33b · Parent: e64a4d3

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +24 -16
  2. config.json +16 -1
  3. model.safetensors +2 -2
README.md CHANGED
@@ -15,22 +15,22 @@ license: apache-2.0
  # qwen0.5b-tech-interview-test
  
  This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) on mathematical reasoning tasks.
- It has been trained using [TRL](https://github.com/huggingface/trl) with LoRA.
+ It has been trained using [TRL](https://github.com/huggingface/trl) with QLoRA (Quantized LoRA).
  
  ## Model Details
  
  - **Base Model**: [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B)
- - **Fine-tuning Method**: LoRA (Low-Rank Adaptation) followed by weight merging
+ - **Fine-tuning Method**: QLoRA (Quantized LoRA) followed by weight merging
  - **Task**: Mathematical reasoning (GSM8K benchmark)
  - **Training Framework**: TRL (Transformer Reinforcement Learning)
  
  ## Training Data
  
  The model was fine-tuned on a mixture of datasets:
- - **GSM8K** (40%): 400 samples from the GSM8K training set
- - **NuminaMath-CoT** (60%): 600 samples from the NuminaMath-CoT dataset
+ - **GSM8K** (40%): 2,000 samples from the GSM8K training set
+ - **NuminaMath-CoT** (60%): 3,000 samples from the NuminaMath-CoT dataset
  
- Total training samples: 1,000
+ Total training samples: 5,000
  
  ## Evaluation Results
  
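The 40/60 GSM8K / NuminaMath-CoT mixture added in this hunk can be assembled roughly as below. This is a sketch: the Hub dataset ids, column names, prompt template, and shuffle seed are assumptions, since the data-preparation script is not part of this commit.

```python
# Rough sketch of the 5,000-sample mixture (2,000 GSM8K + 3,000 NuminaMath-CoT).
# Dataset ids, column names, and formatting are assumed, not taken from the commit.
from datasets import load_dataset, concatenate_datasets

gsm8k = load_dataset("openai/gsm8k", "main", split="train").shuffle(seed=42).select(range(2000))
numina = load_dataset("AI-MO/NuminaMath-CoT", split="train").shuffle(seed=42).select(range(3000))

def as_text(question: str, answer: str) -> dict:
    # Single-string SFT format; the real template used for training may differ.
    return {"text": f"Question: {question}\nAnswer: {answer}"}

gsm8k = gsm8k.map(lambda ex: as_text(ex["question"], ex["answer"]),
                  remove_columns=gsm8k.column_names)
numina = numina.map(lambda ex: as_text(ex["problem"], ex["solution"]),
                    remove_columns=numina.column_names)

train_dataset = concatenate_datasets([gsm8k, numina]).shuffle(seed=42)
print(len(train_dataset))  # 5000 rows, one "text" column
```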
@@ -38,39 +38,45 @@ Total training samples: 1,000
  
  | Metric | Method | Few-shot | Score | Std Error |
  |--------|--------|----------|-------|-----------|
- | exact_match | flexible-extract | 5 | **35.33%** | ±0.0132 |
- | exact_match | strict-match | 5 | **34.42%** | ±0.0131 |
+ | exact_match | flexible-extract | 5 | **33.89%** | ±0.0130 |
+ | exact_match | strict-match | 5 | **33.06%** | ±0.0130 |
  
- - **Baseline** (Qwen2.5-0.5B-Instruct): 34.42% (strict-match)
- - **Improvement**: Comparable performance achieved through fine-tuning
+ - **Baseline** (Qwen2.5-0.5B-Instruct): ~34.42% (strict-match)
+ - **Note**: Results achieved through fine-tuning on a curated dataset mixture
  
  ### Evaluation Details
  - Evaluation tool: [EleutherAI's lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
  - Model inference: vLLM
  - Test samples: 1,319
  - Generation settings: `temperature=0.0`, `do_sample=False`
+ - Evaluation method: Few-shot with 5 examples
  
  ## Training Procedure
  
  ### Training Hyperparameters
  
  - **Learning Rate**: 2e-5
- - **Training Steps**: 1,172 steps
- - **Training Epochs**: ~1 epoch
+ - **Training Steps**: Variable (with early stopping)
+ - **Training Epochs**: Up to 2 epochs (early stopping enabled)
  - **Batch Size**: 1 (per device)
  - **Gradient Accumulation Steps**: 8
- - **LoRA Rank (r)**: 8
- - **LoRA Alpha**: 16
- - **Target Modules**: q_proj, k_proj, v_proj, o_proj
- - **Max Sequence Length**: 32768
+ - **LoRA Rank (r)**: 32
+ - **LoRA Alpha**: 64
+ - **LoRA Dropout**: 0.1
+ - **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+ - **Max Sequence Length**: 2048
+ - **Quantization**: 8-bit (QLoRA)
  
  ### Training Process
  
  The model was trained using:
+ - **Training Framework**: TRL SFTTrainer with QLoRA
  - **Evaluation Strategy**: Steps (every 250 steps)
  - **Early Stopping**: Enabled with patience=3
  - **Best Model Selection**: Based on eval_loss
- - **Final Model**: Checkpoint at step 1,172
+ - **Optimizer**: paged_adamw_8bit
+ - **Learning Rate Schedule**: Cosine
+ - **Warmup Ratio**: 0.15
  
  ## Model Usage
  
@@ -144,6 +150,8 @@ outputs = model.generate([prompt], sampling_params)
  - Transformers: 4.57.1
  - PyTorch: 2.8.0
  - Datasets: 4.3.0
+ - PEFT: (for LoRA/QLoRA support)
+ - BitsAndBytes: (for 8-bit quantization)
  
  ## Citation
  
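The GSM8K scores in the table above come from EleutherAI's lm-evaluation-harness running the model through vLLM. A roughly equivalent invocation is sketched below; the repo id is inferred from the commit author and model name, and the exact harness version and flags behind the reported numbers are not recorded in this commit.

```python
# Sketch of re-running the 5-shot GSM8K evaluation with lm-evaluation-harness.
# CLI form (assumed flags):
#   lm_eval --model vllm \
#     --model_args pretrained=dongwookkwon/qwen0.5b-tech-interview-test,dtype=auto \
#     --tasks gsm8k --num_fewshot 5 --gen_kwargs temperature=0.0
# The repo id is an assumption based on the commit author and model name.
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",  # if vLLM rejects the pre-quantized 8-bit weights, model="hf" is a fallback
    model_args="pretrained=dongwookkwon/qwen0.5b-tech-interview-test,dtype=auto",
    tasks=["gsm8k"],
    num_fewshot=5,
    gen_kwargs="temperature=0.0",  # greedy decoding, matching the card's settings
)
print(results["results"]["gsm8k"])  # exact_match under strict-match / flexible-extract
```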
config.json CHANGED
@@ -3,7 +3,7 @@
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
- "dtype": "float16",
+ "dtype": "float32",
  "eos_token_id": 151643,
  "hidden_act": "silu",
  "hidden_size": 896,
@@ -43,6 +43,21 @@
  "num_key_value_heads": 2,
  "pad_token_id": 151643,
  "pretraining_tp": 1,
+ "quantization_config": {
+   "_load_in_4bit": false,
+   "_load_in_8bit": true,
+   "bnb_4bit_compute_dtype": "float32",
+   "bnb_4bit_quant_storage": "uint8",
+   "bnb_4bit_quant_type": "fp4",
+   "bnb_4bit_use_double_quant": false,
+   "llm_int8_enable_fp32_cpu_offload": false,
+   "llm_int8_has_fp16_weight": false,
+   "llm_int8_skip_modules": null,
+   "llm_int8_threshold": 6.0,
+   "load_in_4bit": false,
+   "load_in_8bit": true,
+   "quant_method": "bitsandbytes"
+ },
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:92c864a88a899006eb5623a5a5aac6073cf47bd3491273dd27e14dc947e319f3
- size 988097536
+ oid sha256:e76871c347d60b25939a8ec64d991016205fbcd7b0fea2c8b29236a19316c658
+ size 903935912