Qwen3-0.6B-v0.1

A language model fine-tuned with DeepSpeed-Chat

Model Details

์ด ๋ชจ๋ธ์€ DeepSpeed-Chat์„ ์‚ฌ์šฉํ•˜์—ฌ ํŒŒ์ธํŠœ๋‹๋œ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.

  • Base Model: Qwen/Qwen3-0.6B
  • Fine-tuning Method: DeepSpeed-Chat
  • Training Data: Add training data information here

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mncai/Qwen3-0.6B-v0.1")
model = AutoModelForCausalLM.from_pretrained("mncai/Qwen3-0.6B-v0.1")

# ํ…์ŠคํŠธ ์ƒ์„ฑ
input_text = "Your prompt here"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
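Since the base model is an instruction-style Qwen model, chat-formatted prompting via the tokenizer's chat template will usually give better results than raw text completion. A minimal sketch, assuming the repository ships a Qwen chat template (not confirmed by this card):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mncai/Qwen3-0.6B-v0.1")
model = AutoModelForCausalLM.from_pretrained("mncai/Qwen3-0.6B-v0.1")

# Build a chat-formatted prompt from a message list
messages = [{"role": "user", "content": "Explain fine-tuning in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(inputs, max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```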

Training Details

  • Training Framework: DeepSpeed
  • Training Script: DeepSpeed-Chat Step 1 Supervised Fine-tuning
  • Upload Date: N/A

Limitations and Biases

์ด ๋ชจ๋ธ์˜ ํ•œ๊ณ„์ ๊ณผ ํŽธํ–ฅ์„ฑ์— ๋Œ€ํ•œ ์ •๋ณด๋ฅผ ์—ฌ๊ธฐ์— ์ถ”๊ฐ€ํ•˜์„ธ์š”.

Citation

DeepSpeed-Chat์„ ์‚ฌ์šฉํ–ˆ๋‹ค๋ฉด ๋‹ค์Œ์„ ์ธ์šฉํ•ด์ฃผ์„ธ์š”:

@misc{deepspeed-chat,
  title={DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales},
  author={Yuxiao Zhuang et al.},
  year={2023},
  url={https://github.com/microsoft/DeepSpeed}
}
Model tree for mncai/Qwen3-0.6B-v0.1

  • Base model: Qwen/Qwen3-0.6B