FineWeb2 Edu Japanese: A high-quality, filtered Japanese dataset (120M texts, 89.3B tokens) for educational AI training.
Yuichi Tateno PRO
hotchpotch
AI & ML interests
Information Retrieval with LLMs
Recent Activity
published a dataset 39 minutes ago: hotchpotch/qa-context-relevance-multilingual-140k
updated a dataset 39 minutes ago: hotchpotch/qa-context-relevance-multilingual-140k
published a model 41 minutes ago: hotchpotch/query-context-pruner-multilingual-Qwen3-1.7B
Organizations
japanese-reranker
Japanese reranker series
hotchpotch/japanese-reranker-tiny-v2
Text Ranking • 29.4M • Updated • 536 • 6
hotchpotch/japanese-reranker-xsmall-v2
Text Ranking • 36.8M • Updated • 49.6k • 2
hotchpotch/japanese-reranker-small-v2
Text Ranking • 70.2M • Updated • 500 • 2
hotchpotch/japanese-reranker-base-v2
Text Ranking • 0.1B • Updated • 874 • 4
spaces
4
Japanese Splade Demo Streamlit 📉
Runtime error • 3 • Convert text to SPLADE token scores
TokenViz: AutoTokenizer Visualization Tool 🔍
Sleeping • Visualize the results of AutoTokenizer
Secon Dev Site Search 🐨
Sleeping
Wikipedia Japanese Rag Search 😻
Running • 5 • Ask questions about Wikipedia articles in Japanese
models
37
hotchpotch/query-context-pruner-multilingual-Qwen3-1.7B
2B • Updated
hotchpotch/query-context-pruner-multilingual-Qwen3-4B
4B • Updated • 14
hotchpotch/japanese-reranker-small-v2
Text Ranking • 70.2M • Updated • 500 • 2
hotchpotch/japanese-reranker-base-v2
Text Ranking • 0.1B • Updated • 874 • 4
hotchpotch/japanese-reranker-xsmall-v2
Text Ranking • 36.8M • Updated • 49.6k • 2
hotchpotch/japanese-reranker-tiny-v2
Text Ranking • 29.4M • Updated • 536 • 6
hotchpotch/japanese-reranker-cross-encoder-small-v1
Text Ranking • 0.1B • Updated • 8.85k • 3
hotchpotch/japanese-reranker-cross-encoder-base-v1
Text Ranking • 0.1B • Updated • 790 • 1
hotchpotch/japanese-reranker-cross-encoder-large-v1
Text Ranking • 0.3B • Updated • 14.4k • 16
hotchpotch/japanese-bge-reranker-v2-m3-v1
Text Ranking • 0.6B • Updated • 522 • 15
datasets
26
hotchpotch/qa-context-relevance-multilingual-140k
Viewer • Updated • 143k • 2
hotchpotch/lawqa_jp
Viewer • Updated • 1.29k • 35
hotchpotch/miracl-hf-unified
Viewer • Updated • 106M • 560
hotchpotch/JFWIR
Viewer • Updated • 128M • 100 • 4
hotchpotch/fineweb-2-edu-japanese
Viewer • Updated • 262M • 1.46k • 20
hotchpotch/japanese-query-crafter-reasoning-80k
Viewer • Updated • 83.3k • 82 • 3
hotchpotch/tmp-5M-qa-small-tokens-cleaned
Viewer • Updated • 5M • 22
hotchpotch/japanese-qa-reasoning-100k
Viewer • Updated • 106k • 26 • 2
hotchpotch/fineweb-2-edu-japanese-noise-detect-raw
Viewer • Updated • 64.2M • 55
hotchpotch/fineweb-2-japanese-noise-spans
Viewer • Updated • 344k • 18