From KMMLU-Redux to KMMLU-Pro: A Professional Korean Benchmark Suite for LLM Evaluation • Paper • 2507.08924 • Published Jul 2025
BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation • Paper • 2506.00482 • Published May 31, 2025
When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research • Paper • 2505.11855 • Published May 17, 2025
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition • Paper • 2503.21248 • Published Mar 27, 2025
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning • Paper • 2502.12054 • Published Feb 17, 2025
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning • Paper • 2502.17407 • Published Feb 24, 2025
Navigating Korean LLM Research #2: Evaluation Tools • Article by amphora • Published Oct 23, 2024
HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models • Paper • 2309.02706 • Published Sep 6, 2023