|
|
--- |
|
|
library_name: transformers |
|
|
license: cc-by-sa-4.0 |
|
|
language: |
|
|
- en |
|
|
- ja |
|
|
base_model: |
|
|
- EQUES/JPharmatron-7B-base |
|
|
tags: |
|
|
- pharmacy |
|
|
- biology |
|
|
- chemistry |
|
|
- medical |
|
|
--- |
|
|
|
|
|
# JPharmatron-7B |
|
|
|
|
|
<!-- Provide a quick summary of what the model is/does. --> |
|
|
|
|
|
JPharmatron-7B is a 7B large language model designed for pharmaceutical applications and researches. |
|
|
|
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Model Description |
|
|
|
|
|
<!-- Provide a longer summary of what this model is. --> |
|
|
|
|
|
The JPharmatron-7B is continually pre-trained using 8.8B tokens from Japanese and English datasets, based on Qwen2.5-7B. Compared to the JPharmatron-7B-base model, JPharmatron-7B has enhanced chat capabilities, obtained from Qwen2.5-7B-Instruct's chat vector. |
|
|
|
|
|
- **Developed by:** EQUES Inc. |
|
|
- **Funded by [optional]:** [GENIAC Project](https://www.meti.go.jp/policy/mono_info_service/geniac/index.html) |
|
|
- **Model type:** Causal decoder-only |
|
|
- **Language(s) (NLP):** Japanese, English |
|
|
- **License:** CC-BY-SA-4.0 |
|
|
|
|
|
### Model Sources [optional] |
|
|
|
|
|
<!-- Provide the basic links for the model. --> |
|
|
|
|
|
- **Repository:** https://github.com/EQUES-Inc/pharma-LLM-eval |
|
|
- **Paper [optional]:** [A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP](https://arxiv.org/abs/2505.16661) (IJCNLP-AACL 2025) |
|
|
|
|
|
## Uses |
|
|
|
|
|
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. --> |
|
|
|
|
|
This model is intended for applications in pharmaceutical paperwork and research. It is not validated for medical use or any other risk-sensitive use. |
|
|
|
|
|
|
|
|
## Evaluation |
|
|
|
|
|
<!-- This section describes the evaluation protocols and provides the results. --> |
|
|
|
|
|
We evaluated our model, JPharmatron-7B, with other general / domain-specific models of a similar size. |
|
|
|
|
|
### Testing Data |
|
|
|
|
|
<!-- This should link to a Dataset Card if possible. --> |
|
|
|
|
|
[JPharmaBench](https://huggingface.co/collections/EQUES/jpharmabench-680a34acfe96870e41d050d8) and two existing benchmarks (JMMLU (pharma) and IgakuQA) were used. |
|
|
|
|
|
### Results |
|
|
|
|
|
Compared to Meditron3-Qwen2.5-7B and Llama3.1-Swallow-8B-Instruct-v0.3, JPharmatron-7B achieved the highest score on all of the five benchmarks. |
|
|
|
|
|
 |
|
|
|
|
|
## Citation [optional] |
|
|
|
|
|
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. --> |
|
|
|
|
|
**This paper has been accepted to IJCNLP-AACL 2025. We will update the bibtex info below soon.** |
|
|
|
|
|
**BibTeX:** |
|
|
|
|
|
``` |
|
|
@misc{ono2025japaneselanguagemodelnew, |
|
|
title={A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP}, |
|
|
author={Shinnosuke Ono and Issey Sukeda and Takuro Fujii and Kosei Buma and Shunsuke Sasaki}, |
|
|
year={2025}, |
|
|
eprint={2505.16661}, |
|
|
archivePrefix={arXiv}, |
|
|
primaryClass={cs.CL}, |
|
|
url={https://arxiv.org/abs/2505.16661}, |
|
|
} |
|
|
|
|
|
``` |
|
|
|
|
|
## More Information [optional] |
|
|
|
|
|
See our preprint: [A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP](https://arxiv.org/abs/2505.16661). |
|
|
|
|
|
## Model Card Authors [optional] |
|
|
|
|
|
[@shinnosukeono](https://shinnosukeono.github.io/) |
|
|
|