
tokyotech-llm

Team

university

swallow-llm

tokyotech-llm's collections (14)

Apache-2.0 Open High Quality Math Corpus

tokyotech-llm/swallow-math-v2

Viewer • Updated Nov 6, 2025 • 17.4M • 18.8k • 18

Llama-3.1-Swallow-v0.5

tokyotech-llm/Llama-3.1-Swallow-8B-v0.5

8B • Updated Jul 1, 2025 • 2.96k • 8
tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.5

Text Generation • 8B • Updated Jun 25, 2025 • 32.9k • 17

Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

tokyotech-llm/swallow-code

Viewer • Updated Jul 4, 2025 • 129M • 2.38k • 59
tokyotech-llm/Llama-3.1-8B-code-ablation-exp1-LR2.5e-5-MINLR2.5E-6-WD0.1-iter0002500

Updated Jul 4, 2025 • 6
tokyotech-llm/Llama-3.1-8B-code-ablation-exp1-LR2.5e-5-MINLR2.5E-6-WD0.1-iter0005000

8B • Updated Jul 4, 2025 • 7
tokyotech-llm/Llama-3.1-8B-code-ablation-exp1-LR2.5e-5-MINLR2.5E-6-WD0.1-iter0007500

8B • Updated Jul 4, 2025 • 6

Llama-3.3-Swallow

tokyotech-llm/Llama-3.3-Swallow-70B-Instruct-v0.4

Text Generation • 71B • Updated Jul 1, 2025 • 2.74k • 12
tokyotech-llm/Llama-3.3-Swallow-70B-v0.4

Text Generation • 71B • Updated May 31, 2025 • 1.35k • 4
tokyotech-llm/edu-classifier

Text Classification • Updated Jan 30, 2025 • 93 • 13

Llama-3-Swallow

tokyotech-llm/Llama-3-Swallow-8B-v0.1

Text Generation • Updated Oct 8, 2024 • 264 • 12
tokyotech-llm/Llama-3-Swallow-70B-v0.1

Text Generation • Updated Oct 8, 2024 • 45 • 6
tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1

Text Generation • Updated Oct 8, 2024 • 8.69k • 21
tokyotech-llm/Llama-3-Swallow-70B-Instruct-v0.1

Text Generation • 71B • Updated Oct 8, 2024 • 41 • 7

Swallow-instruct

Swallow instruction tuning models

tokyotech-llm/Swallow-7b-instruct-v0.1

Text Generation • 7B • Updated Oct 8, 2024 • 346 • 3
tokyotech-llm/Swallow-13b-instruct-v0.1

Text Generation • 13B • Updated Oct 8, 2024 • 200 • 1
tokyotech-llm/Swallow-70b-instruct-v0.1

Text Generation • 69B • Updated Oct 8, 2024 • 24
tokyotech-llm/Swallow-70b-NVE-instruct-hf

Text Generation • 69B • Updated Oct 8, 2024 • 5 • 2

Swallow MX(Mixtral) models

tokyotech-llm/Swallow-MX-8x7b-NVE-v0.1

Text Generation • 47B • Updated Aug 17, 2024 • 15 • 29

Apache-2.0 Open High Quality Code Corpus

tokyotech-llm/swallow-code-v2

Viewer • Updated Nov 8, 2025 • 147M • 7.72k • 26

Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

tokyotech-llm/swallow-math

Viewer • Updated May 10, 2025 • 4.33M • 1.34k • 38
tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0002500

8B • Updated May 7, 2025 • 5
tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0005000

8B • Updated May 7, 2025 • 9
tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0007500

8B • Updated May 7, 2025 • 8

Gemma-2-Swallow

tokyotech-llm/Gemma-2-Llama-Swallow-27b-pt-v0.1

Text Generation • 27B • Updated May 18, 2025 • 124 • 1
tokyotech-llm/Gemma-2-Llama-Swallow-9b-pt-v0.1

Text Generation • Updated May 18, 2025 • 356 • 1
tokyotech-llm/Gemma-2-Llama-Swallow-2b-pt-v0.1

Text Generation • Updated May 18, 2025 • 5.42k
tokyotech-llm/Gemma-2-Llama-Swallow-2b-it-v0.1

Text Generation • Updated May 18, 2025 • 719 • 4

Llama-3.1-Swallow

tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.5

Text Generation • 8B • Updated Jun 25, 2025 • 32.9k • 17
tokyotech-llm/Llama-3.1-Swallow-8B-v0.5

8B • Updated Jul 1, 2025 • 2.96k • 8
tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.3

Text Generation • 71B • Updated Apr 2, 2025 • 9.44k • 14
tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3

Text Generation • 8B • Updated Apr 2, 2025 • 2.57k • 23

Continual Pre-Training from Llama 2

tokyotech-llm/Swallow-7b-hf

Text Generation • 7B • Updated Oct 8, 2024 • 1.12k • 17
tokyotech-llm/Swallow-7b-plus-hf

Text Generation • Updated Oct 8, 2024 • 87 • 8
tokyotech-llm/Swallow-13b-hf

Text Generation • Updated Oct 8, 2024 • 202 • 12
tokyotech-llm/Swallow-70b-hf

Text Generation • Updated Oct 8, 2024 • 68 • 10

Swallow MS(Mistral) models

tokyotech-llm/Swallow-MS-7b-v0.1

Text Generation • 7B • Updated Aug 17, 2024 • 32 • 28

Swallow-MS-instruct

tokyotech-llm/Swallow-MS-7b-instruct-v0.1

Text Generation • 7B • Updated Aug 17, 2024 • 174 • 14
