Update README.md
README.md
CHANGED
|
@@ -1,5 +1,68 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: other
|
| 3 |
-
license_name: openmdw
|
| 4 |
-
license_link: LICENSE
|
| 5 |
-
---
|
| 1 |
+
---
|
| 2 |
+
license: other
|
| 3 |
+
license_name: openmdw
|
| 4 |
+
license_link: LICENSE
|
| 5 |
+
---
|
| 6 |
+
# Seed-X-RM-7B
|
| 7 |
+
<a href="https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/Technical_Report.pdf">
|
| 8 |
+
<img src="https://img.shields.io/badge/Seed--X-Report-blue"></a>
|
| 9 |
+
<a href="https://huggingface.co/ByteDance-Seed/Seed-X-RM-7B">
|
| 10 |
+
<img src="https://img.shields.io/badge/Seed--X-Hugging Face-brightgreen"></a>
|
| 11 |
+
<a href="https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/LICENSE.openmdw">
|
| 12 |
+
<img src="https://img.shields.io/badge/License-OpenMDW-yellow"></a>
|
| 13 |
+
|
| 14 |
+
## Introduction
|
| 15 |
+
We are excited to introduce **Seed-X**, a powerful open-source series of multilingual translation language models with 7B parameters, including instruction and reasoning models, that pushes the boundaries of translation capabilities.
|
| 16 |
+
We develop Seed-X as an accessible, off-the-shelf tool to support the community in advancing translation research and applications:
|
| 17 |
+
* **Exceptional translation capabilities**: Seed-X exhibits state-of-the-art translation capabilities, on par with or outperforming ultra-large models like Gemini-2.5, Claude-3.5, and GPT-4, as validated by human evaluations and automatic metrics.
|
| 18 |
+
* **Deployment and inference-friendly**: With a compact 7B parameter count and the Mistral architecture, Seed-X offers outstanding translation performance in a lightweight and efficient package, ideal for deployment and inference.
|
| 19 |
+
* **Broad domain coverage**: Seed-X excels on a highly challenging translation test set spanning diverse domains, including the internet, science and technology, office dialogues, e-commerce, biomedicine, finance, law, literature, and entertainment.
|
| 20 |
+

|
| 21 |
+
|
| 22 |
+
This repo contains the **Seed-X-RM** model, with the following features:
|
| 23 |
+
* Type: Causal language models
|
| 24 |
+
* Training Stage: Pretraining & Post-training
|
| 25 |
+
* Data Source: Human preference data on multilingual translation
|
| 26 |
+
* Support: Evaluating translation between 28 languages
|
| 27 |
+
|
| 28 |
+
| Languages | Abbr. | Languages | Abbr. | Languages | Abbr. | Languages | Abbr. |
|
| 29 |
+
| ----------- | ----------- |-----------|-----------|-----------|-----------| -----------|-----------|
|
| 30 |
+
|Arabic | ar |French | fr | Malay | ms | Russian | ru |
|
| 31 |
+
|Czech | cs |Croatian | hr | Norwegian Bokmål | nb | Swedish | sv |
|
| 32 |
+
|Danish | da |Hungarian | hu | Dutch | nl | Thai | th |
|
| 33 |
+
|German | de |Indonesian | id | Norwegian | no | Turkish | tr |
|
| 34 |
+
|English | en |Italian | it | Polish | pl | Ukrainian | uk |
|
| 35 |
+
|Spanish | es |Japanese | ja | Portuguese | pt | Vietnamese | vi |
|
| 36 |
+
|Finnish | fi |Korean | ko | Romanian | ro | Chinese | zh |
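
The language table above can be turned into a small lookup for checking inputs before scoring. This is a minimal sketch, not part of the released code: the names `SUPPORTED_LANGUAGES` and `is_supported_pair` are hypothetical, and it assumes that any two distinct languages from the table form a valid pair.

```python
# Language codes and names copied from the table above (28 entries).
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "cs": "Czech", "da": "Danish", "de": "German",
    "en": "English", "es": "Spanish", "fi": "Finnish", "fr": "French",
    "hr": "Croatian", "hu": "Hungarian", "id": "Indonesian", "it": "Italian",
    "ja": "Japanese", "ko": "Korean", "ms": "Malay", "nb": "Norwegian Bokmal",
    "nl": "Dutch", "no": "Norwegian", "pl": "Polish", "pt": "Portuguese",
    "ro": "Romanian", "ru": "Russian", "sv": "Swedish", "th": "Thai",
    "tr": "Turkish", "uk": "Ukrainian", "vi": "Vietnamese", "zh": "Chinese",
}


def is_supported_pair(src: str, tgt: str) -> bool:
    """Return True when both codes are in the table and differ.

    Assumption: every pair of distinct listed languages is supported.
    """
    src, tgt = src.lower(), tgt.lower()
    return (
        src != tgt
        and src in SUPPORTED_LANGUAGES
        and tgt in SUPPORTED_LANGUAGES
    )
```

For example, `is_supported_pair("en", "zh")` is true, while a code missing from the table, such as `"xx"`, is rejected.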
|
| 37 |
+
|
| 38 |
+
## Model Downloads
|
| 39 |
+
| Model Name | Description | Download |
|
| 40 |
+
| ----------- | ----------- | ----------- |
|
| 41 |
+
| Seed-X-Instruct | Instruction-tuned for alignment with user intent. |🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-Instruct-7B)|
|
| 42 |
+
| Seed-X-PPO | RL trained to boost translation capabilities. | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-PPO-7B)|
|
| 43 |
+
| 👉 **Seed-X-RM** | Reward model to evaluate the quality of translation.| 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-RM-7B)|
|
| 44 |
+
|
| 45 |
+
## Quickstart
|
| 46 |
+
Seed-X-RM assigns a reward score to the given translation with the same prompt format as Seed-X-PPO.
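
As a rough illustration of preparing an input for scoring, the sketch below pairs a translation prompt with a candidate translation. The prompt template is an assumption modeled on the instruction-style prompts of the Seed-X series and is not confirmed by this README; check the Seed-X-PPO model card for the exact format. `PROMPT_TEMPLATE`, `ScoringExample`, and `build_scoring_example` are hypothetical names, and how the reward model consumes the pair is not specified here.

```python
from dataclasses import dataclass

# Assumed template; verify against the Seed-X-PPO model card before use.
PROMPT_TEMPLATE = (
    "Translate the following {src_name} sentence into {tgt_name}:\n"
    "{source} <{tgt_code}>"
)


@dataclass(frozen=True)
class ScoringExample:
    """A prompt and the candidate translation to be scored."""
    prompt: str
    translation: str


def build_scoring_example(
    source: str,
    translation: str,
    src_name: str,
    tgt_name: str,
    tgt_code: str,
    template: str = PROMPT_TEMPLATE,
) -> ScoringExample:
    """Fill the (assumed) prompt template and attach the candidate."""
    prompt = template.format(
        src_name=src_name,
        tgt_name=tgt_name,
        source=source,
        tgt_code=tgt_code,
    )
    return ScoringExample(prompt=prompt, translation=translation)


example = build_scoring_example(
    source="May the force be with you",
    translation="愿原力与你同在",
    src_name="English",
    tgt_name="Chinese",
    tgt_code="zh",
)
```

The template is a parameter so that it can be swapped for the documented format once it has been confirmed.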
|
| 47 |
+
|
| 48 |
+
## Evaluation
|
| 49 |
+
We evaluated Seed-X on a diverse set of translation benchmarks, including FLORES-200, WMT-25, and a publicly released [challenge set](https://github.com/ByteDance-Seed/Seed-X-7B/tree/main/challenge_set) accompanied by human evaluations.
|
| 50 |
+

|
| 51 |
+
For detailed benchmark results and analysis, please refer to our [Technical Report](https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/Technical_Report.pdf).
|
| 52 |
+
|
| 53 |
+
## License
|
| 54 |
+
This project is licensed under OpenMDW. See the [LICENSE](https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/LICENSE.openmdw) file for details.
|
| 55 |
+
|
| 56 |
+
## Citation
|
| 57 |
+
If you find Seed-X useful for your research and applications, feel free to give us a star ⭐ or cite us using:
|
| 58 |
+
```bibtex
|
| 59 |
+
@Article{XXX,
|
| 60 |
+
title={XXXXXXXXXXX},
|
| 61 |
+
author={XXX,XXX,XXX,XXX},
|
| 62 |
+
year={2025},
|
| 63 |
+
eprint={XXXX.XXXXX},
|
| 64 |
+
archivePrefix={arXiv},
|
| 65 |
+
primaryClass={cs.XX}
|
| 66 |
+
}
|
| 67 |
+
```
|
| 68 |
+
We will soon publish our technical report on arXiv.
|