HumanLLMs/Human-Like-Qwen2.5-7B-Instruct

@@ -1,312 +1,318 @@
----
-license: apache-2.0
-tags:
-- axolotl
-- dpo
-- trl
-base_model: Qwen/Qwen2.5-7B-Instruct
-pipeline_tag: text-generation
-library_name: transformers
-model-index:
-- name: Humanish-Qwen2.5-7B-Instruct
-  results:
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: IFEval (0-Shot)
-      type: HuggingFaceH4/ifeval
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: inst_level_strict_acc and prompt_level_strict_acc
-      value: 72.84
-      name: strict accuracy
-    source:
-      url: >-
-        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: BBH (3-Shot)
-      type: BBH
-      args:
-        num_few_shot: 3
-    metrics:
-    - type: acc_norm
-      value: 34.48
-      name: normalized accuracy
-    source:
-      url: >-
-        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MATH Lvl 5 (4-Shot)
-      type: hendrycks/competition_math
-      args:
-        num_few_shot: 4
-    metrics:
-    - type: exact_match
-      value: 0
-      name: exact match
-    source:
-      url: >-
-        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: GPQA (0-shot)
-      type: Idavidrein/gpqa
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: acc_norm
-      value: 6.49
-      name: acc_norm
-    source:
-      url: >-
-        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MuSR (0-shot)
-      type: TAUR-Lab/MuSR
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: acc_norm
-      value: 8.42
-      name: acc_norm
-    source:
-      url: >-
-        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MMLU-PRO (5-shot)
-      type: TIGER-Lab/MMLU-Pro
-      config: main
-      split: test
-      args:
-        num_few_shot: 5
-    metrics:
-    - type: acc
-      value: 37.76
-      name: accuracy
-    source:
-      url: >-
-        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
-      name: Open LLM Leaderboard
-datasets:
-- HumanLLMs/Human-Like-DPO-Dataset
-language:
-- en
----
-<div align="center">
-  <img src="https://cdn-avatars.huggingface.co/v1/production/uploads/63da3d7ae697e5898cb86854/H-vpXOX6KZu01HnV87Jk5.jpeg" width="320" height="320" />
-  <h1>Enhancing Human-Like Responses in Large Language Models</h1>
-</div>
-<p align="center">
-  &nbsp&nbsp | 🤗 <a href="https://huggingface.co/collections/HumanLLMs/human-like-humanish-llms-6759fa68f22e11eb1a10967e">Models</a>&nbsp&nbsp |
-  &nbsp&nbsp 📊 <a href="https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset">Dataset</a>&nbsp&nbsp |
-  &nbsp&nbsp 📄<a href="https://arxiv.org/abs/2501.05032">Paper</a>&nbsp&nbsp |
-</p>
-# 🚀 Human-Like-Qwen2.5-7B-Instruct
-This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct), specifically optimized to generate more human-like and conversational responses.
-The fine-tuning process employed both [Low-Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) and [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) to enhance natural language understanding, conversational coherence, and emotional intelligence in interactions.
-The proccess of creating this models is detailed in the research paper [“Enhancing Human-Like Responses in Large Language Models”](https://arxiv.org/abs/2501.05032).
-# 🛠️ Training Configuration
-- **Base Model:** Qwen2.5-7B-Instruct
-- **Framework:** Axolotl v0.4.1
-- **Hardware:** 2x NVIDIA A100 (80 GB) GPUs
-- **Training Time:** ~2 hours 15 minutes
-- **Dataset:** Synthetic dataset with ≈11,000 samples across 256 diverse topics
-<details><summary>See axolotl config</summary>
-axolotl version: `0.4.1`
-```yaml
-base_model: Qwen/Qwen2.5-7B-Instruct
-model_type: AutoModalForCausalLM
-tokenizer_type: AutoTokenizer
-trust_remote_code: true
-load_in_8bit: true
-load_in_4bit: false
-strict: false
-chat_template: chatml
-rl: dpo
-datasets:
-  - path: HumanLLMs/humanish-dpo-project
-    type: chatml.prompt_pairs
-    chat_template: chatml
-dataset_prepared_path:
-val_set_size: 0.05
-output_dir: ./humanish-qwen2.5-7b-instruct
-sequence_len: 8192
-sample_packing: false
-pad_to_sequence_len: true
-adapter: lora
-lora_model_dir:
-lora_r: 8
-lora_alpha: 4
-lora_dropout: 0.05
-lora_target_linear: true
-lora_fan_in_fan_out:
-wandb_project: Humanish-DPO
-wandb_entity:
-wandb_watch:
-wandb_name:
-wandb_log_model:
-hub_model_id: HumanLLMs/Humanish-Qwen2.5-7B-Instruct
-gradient_accumulation_steps: 8
-micro_batch_size: 2
-num_epochs: 1
-optimizer: adamw_bnb_8bit
-lr_scheduler: cosine
-learning_rate: 0.0002
-train_on_inputs: false
-group_by_length: false
-bf16: auto
-fp16:
-tf32: false
-gradient_checkpointing: true
-early_stopping_patience:
-resume_from_checkpoint:
-local_rank:
-logging_steps: 1
-xformers_attention:
-flash_attention: true
-s2_attention:
-warmup_steps: 10
-evals_per_epoch: 2
-eval_table_size:
-eval_max_new_tokens: 128
-saves_per_epoch: 1
-debug:
-deepspeed:
-weight_decay: 0.0
-fsdp:
-fsdp_config:
-save_safetensors: true
-```
-</details><br>
-# 💬 Prompt Template
-You can use ChatML prompt template while using the model:
-### ChatML
-```
-<|im_start|>system
-{system}<|im_end|>
-<|im_start|>user
-{user}<|im_end|>
-<|im_start|>assistant
-{asistant}<|im_end|>
-```
-This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the
-`tokenizer.apply_chat_template()` method:
-```python
-messages = [
-    {"role": "system", "content": "You are helpful AI asistant."},
-    {"role": "user", "content": "Hello!"}
-]
-gen_input = tokenizer.apply_chat_template(message, return_tensors="pt")
-model.generate(**gen_input)
-```
-# 🤖 Models
-|         Model         |                               Download                                 |
-|:---------------------:|:-----------------------------------------------------------------------:|
-| Human-Like-Llama-3-8B-Instruct  |  🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-LLama3-8B-Instruct)  |
-| Human-Like-Qwen-2.5-7B-Instruct  | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Qwen2.5-7B-Instruct)  |
-| Human-Like-Mistral-Nemo-Instruct  | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407) |
-# 🔄 Quantizationed versions
-## GGUF [@bartowski](https://huggingface.co/bartowski)
-- https://huggingface.co/bartowski/Human-Like-LLama3-8B-Instruct-GGUF
-- https://huggingface.co/bartowski/Human-Like-Qwen2.5-7B-Instruct-GGUF
-- https://huggingface.co/bartowski/Human-Like-Mistral-Nemo-Instruct-2407-GGUF
-# 🎯 Benchmark Results
-| **Group**                      | **Model**                      | **Average** | **IFEval** | **BBH** | **MATH Lvl 5** | **GPQA** | **MuSR** | **MMLU-PRO** |
-|--------------------------------|--------------------------------|-------------|------------|---------|----------------|----------|----------|--------------|
-| **Llama Models**               | Human-Like-Llama-3-8B-Instruct | 22.37       | **64.97**  | 28.01   | 8.45           | 0.78     | **2.00** | 30.01        |
-|                                | Llama-3-8B-Instruct            | 23.57       | 74.08      | 28.24   | 8.68           | 1.23     | 1.60     | 29.60        |
-|                                | *Difference (Human-Like)*      | -1.20       | **-9.11**  | -0.23   | -0.23          | -0.45    | +0.40    | +0.41        |
-| **Qwen Models**                | Human-Like-Qwen-2.5-7B-Instruct | 26.66      | 72.84      | 34.48   | 0.00           | 6.49     | 8.42     | 37.76        |
-|                                | Qwen-2.5-7B-Instruct           | 26.86       | 75.85      | 34.89   | 0.00           | 5.48     | 8.45     | 36.52        |
-|                                | *Difference (Human-Like)*      | -0.20       | -3.01      | -0.41   | 0.00           | **+1.01**| -0.03    | **+1.24**    |
-| **Mistral Models**             | Human-Like-Mistral-Nemo-Instruct | 22.88     | **54.51**  | 32.70   | 7.62           | 5.03     | 9.39     | 28.00        |
-|                                | Mistral-Nemo-Instruct          | 23.53       | 63.80      | 29.68   | 5.89           | 5.37     | 8.48     | 27.97        |
-|                                | *Difference (Human-Like)*      | -0.65       | **-9.29**  | **+3.02**| **+1.73**      | -0.34    | +0.91    | +0.03        |
-# 📊 Dataset
-The dataset used for fine-tuning was generated using LLaMA 3 models. The dataset includes 10,884 samples across 256 distinct topics such as technology, daily life, science, history, and arts. Each sample consists of:
-- **Human-like responses:** Natural, conversational answers mimicking human dialogue.
-- **Formal responses:** Structured and precise answers with a more formal tone.
-The dataset has been open-sourced and is available at:
-- 👉 [Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset)
-More details on the dataset creation process can be found in the accompanying research paper.
-# 📝 Citation
-```
-@misc{çalık2025enhancinghumanlikeresponseslarge,
-      title={Enhancing Human-Like Responses in Large Language Models},
-      author={Ethem Yağız Çalık and Talha Rüzgar Akkuş},
-      year={2025},
-      eprint={2501.05032},
-      archivePrefix={arXiv},
-      primaryClass={cs.CL},
-      url={https://arxiv.org/abs/2501.05032},
-}
 ```

+---
+license: apache-2.0
+tags:
+- axolotl
+- dpo
+- trl
+base_model: Qwen/Qwen2.5-7B-Instruct
+pipeline_tag: text-generation
+library_name: transformers
+datasets:
+- HumanLLMs/Human-Like-DPO-Dataset
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
+model-index:
+- name: Humanish-Qwen2.5-7B-Instruct
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: IFEval (0-Shot)
+      type: HuggingFaceH4/ifeval
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 72.84
+      name: strict accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BBH (3-Shot)
+      type: BBH
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 34.48
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MATH Lvl 5 (4-Shot)
+      type: hendrycks/competition_math
+      args:
+        num_few_shot: 4
+    metrics:
+    - type: exact_match
+      value: 0
+      name: exact match
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GPQA (0-shot)
+      type: Idavidrein/gpqa
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 6.49
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MuSR (0-shot)
+      type: TAUR-Lab/MuSR
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 8.42
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU-PRO (5-shot)
+      type: TIGER-Lab/MMLU-Pro
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 37.76
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
+      name: Open LLM Leaderboard
+---
+<div align="center">
+  <img src="https://cdn-avatars.huggingface.co/v1/production/uploads/63da3d7ae697e5898cb86854/H-vpXOX6KZu01HnV87Jk5.jpeg" width="320" height="320" />
+  <h1>Enhancing Human-Like Responses in Large Language Models</h1>
+</div>
+<p align="center">
+  &nbsp;&nbsp; | 🤗 <a href="https://huggingface.co/collections/HumanLLMs/human-like-humanish-llms-6759fa68f22e11eb1a10967e">Models</a>&nbsp;&nbsp; |
+  &nbsp;&nbsp; 📊 <a href="https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset">Dataset</a>&nbsp;&nbsp; |
+  &nbsp;&nbsp; 📄 <a href="https://arxiv.org/abs/2501.05032">Paper</a>&nbsp;&nbsp; |
+</p>
+# 🚀 Human-Like-Qwen2.5-7B-Instruct
+This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct), specifically optimized to generate more human-like and conversational responses.
+The fine-tuning process employed both [Low-Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) and [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) to enhance natural language understanding, conversational coherence, and emotional intelligence in interactions.
+The process of creating this model is detailed in the research paper [“Enhancing Human-Like Responses in Large Language Models”](https://arxiv.org/abs/2501.05032).
+# 🛠️ Training Configuration
+- **Base Model:** Qwen2.5-7B-Instruct
+- **Framework:** Axolotl v0.4.1
+- **Hardware:** 2x NVIDIA A100 (80 GB) GPUs
+- **Training Time:** ~2 hours 15 minutes
+- **Dataset:** Synthetic dataset with ≈11,000 samples across 256 diverse topics
+<details><summary>See axolotl config</summary>
+axolotl version: `0.4.1`
+```yaml
+base_model: Qwen/Qwen2.5-7B-Instruct
+model_type: AutoModelForCausalLM
+tokenizer_type: AutoTokenizer
+trust_remote_code: true
+load_in_8bit: true
+load_in_4bit: false
+strict: false
+chat_template: chatml
+rl: dpo
+datasets:
+  - path: HumanLLMs/humanish-dpo-project
+    type: chatml.prompt_pairs
+    chat_template: chatml
+dataset_prepared_path:
+val_set_size: 0.05
+output_dir: ./humanish-qwen2.5-7b-instruct
+sequence_len: 8192
+sample_packing: false
+pad_to_sequence_len: true
+adapter: lora
+lora_model_dir:
+lora_r: 8
+lora_alpha: 4
+lora_dropout: 0.05
+lora_target_linear: true
+lora_fan_in_fan_out:
+wandb_project: Humanish-DPO
+wandb_entity:
+wandb_watch:
+wandb_name:
+wandb_log_model:
+hub_model_id: HumanLLMs/Humanish-Qwen2.5-7B-Instruct
+gradient_accumulation_steps: 8
+micro_batch_size: 2
+num_epochs: 1
+optimizer: adamw_bnb_8bit
+lr_scheduler: cosine
+learning_rate: 0.0002
+train_on_inputs: false
+group_by_length: false
+bf16: auto
+fp16:
+tf32: false
+gradient_checkpointing: true
+early_stopping_patience:
+resume_from_checkpoint:
+local_rank:
+logging_steps: 1
+xformers_attention:
+flash_attention: true
+s2_attention:
+warmup_steps: 10
+evals_per_epoch: 2
+eval_table_size:
+eval_max_new_tokens: 128
+saves_per_epoch: 1
+debug:
+deepspeed:
+weight_decay: 0.0
+fsdp:
+fsdp_config:
+save_safetensors: true
+```
+</details><br>
+# 💬 Prompt Template
+You can use the ChatML prompt template with this model:
+### ChatML
+```
+<|im_start|>system
+{system}<|im_end|>
+<|im_start|>user
+{user}<|im_end|>
+<|im_start|>assistant
+{assistant}<|im_end|>
+```
+This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the
+`tokenizer.apply_chat_template()` method:
+```python
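+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_id = "HumanLLMs/Human-Like-Qwen2.5-7B-Instruct"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(model_id)
+messages = [
+    {"role": "system", "content": "You are a helpful AI assistant."},
+    {"role": "user", "content": "Hello!"}
+]
+# return_dict=True yields input_ids and attention_mask, so ** unpacking works
+gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt")
+model.generate(**gen_input)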
+```
+# 🤖 Models
+|         Model         |                               Download                                 |
+|:---------------------:|:-----------------------------------------------------------------------:|
+| Human-Like-Llama-3-8B-Instruct  |  🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-LLama3-8B-Instruct)  |
+| Human-Like-Qwen-2.5-7B-Instruct  | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Qwen2.5-7B-Instruct)  |
+| Human-Like-Mistral-Nemo-Instruct  | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407) |
+# 🔄 Quantized versions
+## GGUF [@bartowski](https://huggingface.co/bartowski)
+- https://huggingface.co/bartowski/Human-Like-LLama3-8B-Instruct-GGUF
+- https://huggingface.co/bartowski/Human-Like-Qwen2.5-7B-Instruct-GGUF
+- https://huggingface.co/bartowski/Human-Like-Mistral-Nemo-Instruct-2407-GGUF
+# 🎯 Benchmark Results
+| **Group**                      | **Model**                      | **Average** | **IFEval** | **BBH** | **MATH Lvl 5** | **GPQA** | **MuSR** | **MMLU-PRO** |
+|--------------------------------|--------------------------------|-------------|------------|---------|----------------|----------|----------|--------------|
+| **Llama Models**               | Human-Like-Llama-3-8B-Instruct | 22.37       | **64.97**  | 28.01   | 8.45           | 0.78     | **2.00** | 30.01        |
+|                                | Llama-3-8B-Instruct            | 23.57       | 74.08      | 28.24   | 8.68           | 1.23     | 1.60     | 29.60        |
+|                                | *Difference (Human-Like)*      | -1.20       | **-9.11**  | -0.23   | -0.23          | -0.45    | +0.40    | +0.41        |
+| **Qwen Models**                | Human-Like-Qwen-2.5-7B-Instruct | 26.66      | 72.84      | 34.48   | 0.00           | 6.49     | 8.42     | 37.76        |
+|                                | Qwen-2.5-7B-Instruct           | 26.86       | 75.85      | 34.89   | 0.00           | 5.48     | 8.45     | 36.52        |
+|                                | *Difference (Human-Like)*      | -0.20       | -3.01      | -0.41   | 0.00           | **+1.01**| -0.03    | **+1.24**    |
+| **Mistral Models**             | Human-Like-Mistral-Nemo-Instruct | 22.88     | **54.51**  | 32.70   | 7.62           | 5.03     | 9.39     | 28.00        |
+|                                | Mistral-Nemo-Instruct          | 23.53       | 63.80      | 29.68   | 5.89           | 5.37     | 8.48     | 27.97        |
+|                                | *Difference (Human-Like)*      | -0.65       | **-9.29**  | **+3.02**| **+1.73**      | -0.34    | +0.91    | +0.03        |
+# 📊 Dataset
+The fine-tuning dataset was generated using Llama 3 models. It includes 10,884 samples across 256 distinct topics such as technology, daily life, science, history, and arts. Each sample consists of:
+- **Human-like responses:** Natural, conversational answers mimicking human dialogue.
+- **Formal responses:** Structured and precise answers with a more formal tone.
+The dataset has been open-sourced and is available at:
+- 👉 [Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset)
+More details on the dataset creation process can be found in the accompanying research paper.
+# 📝 Citation
+```
+@misc{çalık2025enhancinghumanlikeresponseslarge,
+      title={Enhancing Human-Like Responses in Large Language Models},
+      author={Ethem Yağız Çalık and Talha Rüzgar Akkuş},
+      year={2025},
+      eprint={2501.05032},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2501.05032},
+}
 ```
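
To make the ChatML layout from the Prompt Template section concrete, here is a minimal sketch in plain Python of how a message list expands into that format. `render_chatml` is a hypothetical helper written for illustration, not part of transformers. The tokenizer's own chat template remains the authoritative formatter and may differ in details, for example by inserting a default system prompt.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render {"role", "content"} dicts in the ChatML layout shown in the model card.

    Hypothetical helper for illustration only; prefer tokenizer.apply_chat_template()
    when the tokenizer is available.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = render_chatml(messages)
print(prompt)
```

Leaving the final assistant turn open plays the same role as passing `add_generation_prompt=True` to `tokenizer.apply_chat_template()`: the model continues from the assistant header instead of starting a new turn.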