---
license: apache-2.0
base_model: "Qwen/Qwen3-0.6B"
tags:
- text-generation
- deepspeed
- fine-tuned
language:
- en
library_name: transformers
pipeline_tag: text-generation
---

# Qwen3-0.6B-v0.1

A language model fine-tuned with DeepSpeed-Chat.

## Model Details

This model was fine-tuned with DeepSpeed-Chat.

- **Base Model**: Qwen/Qwen3-0.6B
- **Fine-tuning Method**: DeepSpeed-Chat
- **Training Data**: Add training data information here

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mncai/Qwen3-0.6B-v0.1")
model = AutoModelForCausalLM.from_pretrained("mncai/Qwen3-0.6B-v0.1")

# Generate text
input_text = "Your prompt here"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
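
Because the checkpoint was produced by supervised fine-tuning on top of Qwen/Qwen3-0.6B, chat-style prompts are usually formatted with the tokenizer's chat template. The snippet below is a minimal sketch of that flow; it assumes the fine-tuned checkpoint keeps the chat template inherited from the base model, and the prompt and generation settings are illustrative.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "mncai/Qwen3-0.6B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the conversation with the tokenizer's chat template (assumed to be
# inherited from the Qwen/Qwen3-0.6B base model).
messages = [{"role": "user", "content": "Explain supervised fine-tuning in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker before generating
    return_tensors="pt",
)

# Generate, then decode only the newly generated tokens
output_ids = model.generate(input_ids, max_new_tokens=128)
response = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```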

## Training Details

- **Training Framework**: DeepSpeed
- **Training Script**: DeepSpeed-Chat Step 1 Supervised Fine-tuning (see the sketch below)
- **Upload Date**: N/A
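
For orientation, the sketch below shows roughly how a DeepSpeed ZeRO configuration wraps the base model for supervised fine-tuning, which is the kind of setup DeepSpeed-Chat's Step 1 script builds internally. The configuration values (batch size, ZeRO stage, learning rate) are assumptions for illustration, not the exact settings used for this checkpoint; in practice the script is started with the `deepspeed` launcher, which sets up the distributed environment.

```python
import deepspeed
from transformers import AutoModelForCausalLM

# Illustrative ZeRO-2 configuration; the actual hyperparameters used for this
# checkpoint are not documented here.
ds_config = {
    "train_batch_size": 32,
    "gradient_accumulation_steps": 1,
    "bf16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-5, "weight_decay": 0.0}},
}

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")

# deepspeed.initialize wraps the model and builds the optimizer from ds_config.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# Each SFT step then feeds tokenized chat data through the engine:
#   loss = model_engine(input_ids=batch["input_ids"], labels=batch["labels"]).loss
#   model_engine.backward(loss)
#   model_engine.step()
```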

## Limitations and Biases

Add information about this model's limitations and biases here.

## Citation

If you use DeepSpeed-Chat, please cite:

```bibtex
@misc{deepspeed-chat,
  title={DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales},
  author={Zhewei Yao and others},
  year={2023},
  url={https://github.com/microsoft/DeepSpeed}
}
```