Upload folder using huggingface_hub
- README.md +24 -16
- config.json +16 -1
- model.safetensors +2 -2
README.md CHANGED
@@ -15,22 +15,22 @@ license: apache-2.0
 # qwen0.5b-tech-interview-test
 
 This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) on mathematical reasoning tasks.
-It has been trained using [TRL](https://github.com/huggingface/trl) with LoRA.
+It has been trained using [TRL](https://github.com/huggingface/trl) with QLoRA (Quantized LoRA).
 
 ## Model Details
 
 - **Base Model**: [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B)
-- **Fine-tuning Method**:
+- **Fine-tuning Method**: QLoRA (Quantized LoRA) followed by weight merging
 - **Task**: Mathematical reasoning (GSM8K benchmark)
 - **Training Framework**: TRL (Transformer Reinforcement Learning)
 
 ## Training Data
 
 The model was fine-tuned on a mixture of datasets:
-- **GSM8K** (40%):
-- **NuminaMath-CoT** (60%):
+- **GSM8K** (40%): 2,000 samples from the GSM8K training set
+- **NuminaMath-CoT** (60%): 3,000 samples from the NuminaMath-CoT dataset
 
-Total training samples:
+Total training samples: 5,000
 
 ## Evaluation Results
 
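For context on the data mixture described in the updated card, here is a minimal sketch of how a 5,000-sample mix (2,000 GSM8K + 3,000 NuminaMath-CoT) could be assembled. The Hub dataset IDs, column names, and seed are assumptions for illustration, not taken from this repository.

```python
# Hypothetical preprocessing sketch; dataset IDs, columns, and seed are assumptions.
from datasets import load_dataset, concatenate_datasets

gsm8k = load_dataset("openai/gsm8k", "main", split="train").shuffle(seed=42).select(range(2000))
numina = load_dataset("AI-MO/NuminaMath-CoT", split="train").shuffle(seed=42).select(range(3000))

# Reduce both sources to a single "text" column so they can be concatenated
# (GSM8K uses question/answer, NuminaMath-CoT uses problem/solution).
gsm8k = gsm8k.map(lambda ex: {"text": ex["question"] + "\n" + ex["answer"]},
                  remove_columns=gsm8k.column_names)
numina = numina.map(lambda ex: {"text": ex["problem"] + "\n" + ex["solution"]},
                    remove_columns=numina.column_names)

train_mix = concatenate_datasets([gsm8k, numina]).shuffle(seed=42)  # 5,000 rows, 40/60 split
```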
@@ -38,39 +38,45 @@ Total training samples: 1,000
 
 | Metric | Method | Few-shot | Score | Std Error |
 |--------|--------|----------|-------|-----------|
-| exact_match | flexible-extract | 5 | **
-| exact_match | strict-match | 5 | **
+| exact_match | flexible-extract | 5 | **33.89%** | ±0.0130 |
+| exact_match | strict-match | 5 | **33.06%** | ±0.0130 |
 
-- **Baseline** (Qwen2.5-0.5B-Instruct): 34.42% (strict-match)
-- **
+- **Baseline** (Qwen2.5-0.5B-Instruct): ~34.42% (strict-match)
+- **Note**: Results achieved through fine-tuning on a curated dataset mixture
 
 ### Evaluation Details
 - Evaluation tool: [EleutherAI's lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
 - Model inference: vLLM
 - Test samples: 1,319
 - Generation settings: `temperature=0.0`, `do_sample=False`
+- Evaluation method: Few-shot with 5 examples
 
 ## Training Procedure
 
 ### Training Hyperparameters
 
 - **Learning Rate**: 2e-5
-- **Training Steps**:
-- **Training Epochs**:
+- **Training Steps**: Variable (with early stopping)
+- **Training Epochs**: Up to 2 epochs (early stopping enabled)
 - **Batch Size**: 1 (per device)
 - **Gradient Accumulation Steps**: 8
-- **LoRA Rank (r)**:
-- **LoRA Alpha**:
-- **
-- **
+- **LoRA Rank (r)**: 32
+- **LoRA Alpha**: 64
+- **LoRA Dropout**: 0.1
+- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+- **Max Sequence Length**: 2048
+- **Quantization**: 8-bit (QLoRA)
 
 ### Training Process
 
 The model was trained using:
+- **Training Framework**: TRL SFTTrainer with QLoRA
 - **Evaluation Strategy**: Steps (every 250 steps)
 - **Early Stopping**: Enabled with patience=3
 - **Best Model Selection**: Based on eval_loss
-- **
+- **Optimizer**: paged_adamw_8bit
+- **Learning Rate Schedule**: Cosine
+- **Warmup Ratio**: 0.15
 
 ## Model Usage
 
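Taken together, the hyperparameters added in this hunk describe a fairly standard TRL + PEFT run. The following is a hypothetical sketch of such a setup, not the repository's training script; the data files and output directory are placeholders, and the 2048 max-sequence-length option is omitted because its argument name differs across TRL versions.

```python
# Hypothetical QLoRA fine-tuning sketch matching the card's hyperparameters.
from datasets import load_dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, EarlyStoppingCallback)
from trl import SFTConfig, SFTTrainer

base_model = "Qwen/Qwen2.5-0.5B"
tokenizer = AutoTokenizer.from_pretrained(base_model)

# 8-bit base weights ("Quantization: 8-bit (QLoRA)").
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# LoRA settings from the card: r=32, alpha=64, dropout=0.1, all projection modules.
peft_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="qwen0.5b-tech-interview-test",
    learning_rate=2e-5,
    num_train_epochs=2,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.15,
    eval_strategy="steps",
    eval_steps=250,
    save_strategy="steps",
    save_steps=250,
    load_best_model_at_end=True,        # best model selected on eval_loss
    metric_for_best_model="eval_loss",
)

# Placeholder data files; in practice these would hold the GSM8K/NuminaMath
# mixture formatted into a single "text" column.
train_ds = load_dataset("json", data_files="train.jsonl", split="train")
eval_ds = load_dataset("json", data_files="eval.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    peft_config=peft_config,
    processing_class=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()

# The card mentions weight merging: after training, the adapter can be merged
# into the base model (e.g. via PEFT's merge_and_unload) before uploading.
```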
@@ -144,6 +150,8 @@ outputs = model.generate([prompt], sampling_params)
 - Transformers: 4.57.1
 - PyTorch: 2.8.0
 - Datasets: 4.3.0
+- PEFT: (for LoRA/QLoRA support)
+- BitsAndBytes: (for 8-bit quantization)
 
 ## Citation
 
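The Evaluation Details in the card (lm-evaluation-harness, vLLM backend, 1,319 GSM8K test samples, 5-shot, greedy decoding) suggest a run along these lines. This is a hedged sketch rather than the authors' command: the repo id is a placeholder and the exact arguments are assumptions.

```python
# Hypothetical reproduction of the GSM8K numbers via lm-evaluation-harness's Python API.
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",                # the card says inference ran through vLLM
    model_args="pretrained=your-username/qwen0.5b-tech-interview-test,dtype=auto",
    tasks=["gsm8k"],             # 1,319 test samples, exact_match metrics
    num_fewshot=5,               # 5-shot, matching the results table
)
print(results["results"]["gsm8k"])
```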
config.json CHANGED
@@ -3,7 +3,7 @@
     "Qwen2ForCausalLM"
   ],
   "attention_dropout": 0.0,
-  "dtype": "
+  "dtype": "float32",
   "eos_token_id": 151643,
   "hidden_act": "silu",
   "hidden_size": 896,
@@ -43,6 +43,21 @@
   "num_key_value_heads": 2,
   "pad_token_id": 151643,
   "pretraining_tp": 1,
+  "quantization_config": {
+    "_load_in_4bit": false,
+    "_load_in_8bit": true,
+    "bnb_4bit_compute_dtype": "float32",
+    "bnb_4bit_quant_storage": "uint8",
+    "bnb_4bit_quant_type": "fp4",
+    "bnb_4bit_use_double_quant": false,
+    "llm_int8_enable_fp32_cpu_offload": false,
+    "llm_int8_has_fp16_weight": false,
+    "llm_int8_skip_modules": null,
+    "llm_int8_threshold": 6.0,
+    "load_in_4bit": false,
+    "load_in_8bit": true,
+    "quant_method": "bitsandbytes"
+  },
   "rms_norm_eps": 1e-06,
   "rope_scaling": null,
   "rope_theta": 1000000.0,
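For reference, the new `quantization_config` block corresponds to an 8-bit bitsandbytes setup. Below is a hedged sketch of how it maps onto the transformers API; the repo id is a placeholder, and whether the published weights are stored pre-quantized is not something this diff shows.

```python
# Hypothetical illustration of the quantization_config block added above.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Saving a model quantized with a config like this is what writes the
# "load_in_8bit" / "llm_int8_*" fields into config.json.
bnb_config = BitsAndBytesConfig(load_in_8bit=True, llm_int8_threshold=6.0)
quantized = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B", quantization_config=bnb_config, device_map="auto"
)

# When loading the published checkpoint, from_pretrained reads the embedded
# "quant_method": "bitsandbytes" block and should apply 8-bit loading
# automatically (requires the bitsandbytes package and a supported GPU).
model = AutoModelForCausalLM.from_pretrained(
    "your-username/qwen0.5b-tech-interview-test",  # placeholder repo id
    device_map="auto",
)
```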
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:e76871c347d60b25939a8ec64d991016205fbcd7b0fea2c8b29236a19316c658
+size 903935912