---
pipeline_tag: text-generation
library_name: transformers
license: apache-2.0
tags:
- mixtral
- moe
- reasoning
---
# Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks

This repository contains model checkpoints from the paper [Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks](https://huggingface.co/papers/2508.18672).

For more details, including code and evaluation procedures, please refer to the official GitHub repository: [https://github.com/rioyokotalab/optimal-sparsity](https://github.com/rioyokotalab/optimal-sparsity)
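
## Usage

The checkpoints are intended to be loaded with the Hugging Face `transformers` library (see `library_name` above). The snippet below is a minimal sketch only: the repository ID is a placeholder and should be replaced with the actual checkpoint name you want to load.

```python
# Minimal usage sketch -- the model ID below is a placeholder, not a real checkpoint name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rioyokotalab/optimal-sparsity-checkpoint"  # hypothetical ID; substitute the real one

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # requires `accelerate`; spreads layers over available devices
)

prompt = "Question: A train travels 90 km in 45 minutes. What is its average speed in km/h?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```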
## How to cite

If you find our work helpful, please cite the paper:

```bibtex
@article{nakamura2025optimalsparsitymixtureofexpertslanguage,
  title={Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks},
  author={Taishi Nakamura and Satoki Ishikawa and Masaki Kawamura and Takumi Okamoto and Daisuke Nohara and Jun Suzuki and Rio Yokota},
  year={2025},
  eprint={2508.18672},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2508.18672},
}
```