license: apache-2.0
datasets:
- BelleGroup/train_0.5M_CN
language:
- en
- zh
tags:
- text-generation-inference
widget:
- text: |-
<|im_start|>user
请以『春天的北京』为题写一首诗歌
<|im_end|>
<|im_start|>assistant
example_title: generation zh
Baichuan 7B ChatML
Introduction
baichuan-7B-chatml is a multi-turn dialog model compatible with the ChatML format.
It is fine-tuned from baichuan-7B.
baichuan-7B-chatml may be used commercially. However, under the baichuan-7B terms, anyone using a baichuan-7B derivative for commercial purposes must contact the baichuan-7B licensor.
Note: on factual knowledge tasks, the model may generate incorrect information or give unstable output (sometimes it returns the correct answer, sometimes not).
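As a rough illustration of what ChatML compatibility means for prompting, the sketch below assembles a multi-turn prompt in the layout used by this card's widget example: a role name after <|im_start|>, the message text, then <|im_end|> on its own line. The function name build_chatml_prompt, the (role, content) tuple format, and the newline after the final "assistant" are assumptions made for illustration; the model's own chat function may format history differently.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages):
    """Render (role, content) pairs as ChatML, ending with an open assistant turn.

    Hypothetical helper for illustration; not part of the model's API.
    """
    parts = []
    for role, content in messages:
        # Each turn: start marker + role, the message, then the end marker.
        parts.append(f"{IM_START}{role}\n{content}\n{IM_END}\n")
    # Leave the assistant turn open so the model generates the reply.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([("user", "Write a poem titled 'Beijing in Spring'")])
print(prompt)
```

Earlier turns of a conversation can be passed as additional ("user", ...) and ("assistant", ...) pairs in order.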
Examples
Building on baichuan-7B, the model provides a chat function for multi-turn dialogs.
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("tibok/baichuan-7B-chatml", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("tibok/baichuan-7B-chatml", device_map="auto", trust_remote_code=True)
>>> response, history = model.chat(tokenizer, "请以『春天的北京』为题写一首诗歌", history=[])
>>> print(response)
春天的北京,
花开万丈,
春意盎然,
清风送暖。
<|im_end|>
>>> response, history = model.chat(tokenizer, "能不能再写一首关于香山的?", history=history)
>>> print(response)
香山之巅,
芳草连天。
清泉潺潺,
山峦绵绵。
<|im_end|>
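The responses shown above end with the <|im_end|> marker. If the text is displayed to users, one option is to trim the marker first. This is a minimal sketch, assuming the response is a plain string and that anything after the first marker can be discarded; strip_end_marker is a hypothetical helper, not part of the model's API.

```python
IM_END = "<|im_end|>"


def strip_end_marker(response: str) -> str:
    """Cut the response at the first <|im_end|> marker and trim whitespace."""
    text, _, _ = response.partition(IM_END)
    return text.strip()


cleaned = strip_end_marker("line one\nline two\n<|im_end|>")
print(cleaned)
```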
Details
- Dataset: BelleGroup/train_0.5M_CN
- steps: 13800
- batch_size: 8
- seq_len: 2048
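For a rough sense of scale, the figures above can be combined into an upper bound on training volume. This back-of-the-envelope sketch assumes that batch_size is the total number of sequences per optimizer step (no gradient accumulation or multi-device scaling) and that every sequence fills the full seq_len. Neither is stated in this card, so the real numbers may differ.

```python
steps = 13800
batch_size = 8
seq_len = 2048

# Sequences processed over the whole run, under the assumptions above.
sequences_seen = steps * batch_size

# Upper bound on tokens: every sequence is assumed to be seq_len tokens long.
max_tokens_seen = sequences_seen * seq_len

print(f"sequences seen: {sequences_seen:,}")
print(f"token upper bound: {max_tokens_seen:,}")
```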