---
license: apache-2.0
license_link: https://huggingface.co/skt/A.X-3.1/blob/main/LICENSE
language:
- en
- ko
pipeline_tag: text-generation
library_name: transformers
model_id: skt/A.X-3.1
developers: SKT AI Model Lab
model-index:
- name: A.X-3.1
  results:
  - task:
      type: generate_until
      name: mmlu
    dataset:
      name: mmlu (chat CoT)
      type: hails/mmlu_no_train
    metrics:
    - type: exact_match
      value: 75.1
      name: exact_match
  - task:
      type: generate_until
      name: kmmlu
    dataset:
      name: kmmlu (chat CoT)
      type: HAERAE-HUB/KMMLU
    metrics:
    - type: exact_match
      value: 69.2
      name: exact_match
---

# A.X 3.1
*A.X Logo*

🤗 Models | 🖥️ Github

## A.X 3.1 Highlights

SK Telecom released **A.X 3.1** (pronounced "A dot X"), a large language model (LLM) optimized for Korean-language understanding and enterprise deployment, on July 24, 2025. This sovereign AI model was developed entirely in-house by SKT, encompassing model architecture, data curation, and training, all carried out on SKT's proprietary supercomputing infrastructure, TITAN. The model was trained from scratch on a high-quality multilingual corpus comprising **2.1 trillion tokens**, with a primary focus on the Korean language.

- **Authentic Korean Sovereign AI**: A.X 3.1 was trained on a high-quality multilingual dataset, fully curated in-house, using SKT's proprietary GPU infrastructure.
- **Highly Efficient Multilingual LLM**: A.X 3.1 demonstrates superior performance among Korean LLMs despite its relatively compact training budget of 2.1 trillion tokens.
- **Superior Korean Proficiency**: A.X 3.1 achieved a score of **69.2** on [KMMLU](https://huggingface.co/datasets/HAERAE-HUB/KMMLU), the leading benchmark for Korean-language evaluation and a Korean-specific adaptation of MMLU, outperforming other Korean-specialized models.
- **Deep Korean Understanding**: A.X 3.1 obtained **77.4** on [CLIcK](https://huggingface.co/datasets/EunsuKim/CLIcK), a benchmark for Korean cultural and contextual comprehension, outperforming other open-source models.
- **Efficient Token Usage**: A.X 3.1 requires approximately 33% fewer tokens than GPT-4o to process equivalent Korean inputs, facilitating more cost-effective and computationally efficient inference (see the sketch below).
- **Long-Context Handling**: A.X 3.1 supports up to **32,768 tokens** natively, and up to **131,072 tokens** with YaRN applied.
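The token-efficiency claim is easy to sanity-check on your own text. Below is a minimal sketch, assuming the third-party `tiktoken` package (not a dependency of this model) to reproduce GPT-4o's `o200k_base` encoding; the exact ratio will vary with the input.

```python
from transformers import AutoTokenizer
import tiktoken  # assumption: installed separately (pip install tiktoken)

# Any Korean sample text of your choosing.
text = "에어컨 여름철 적정 온도는 24~26도입니다."

ax_tokenizer = AutoTokenizer.from_pretrained("skt/A.X-3.1")
gpt4o_encoding = tiktoken.get_encoding("o200k_base")  # encoding used by GPT-4o

ax_count = len(ax_tokenizer(text)["input_ids"])
gpt4o_count = len(gpt4o_encoding.encode(text))
print(f"A.X 3.1: {ax_count} tokens, GPT-4o: {gpt4o_count} tokens")
```

On typical Korean prose this should show a noticeably lower count for A.X 3.1, in line with the ~33% figure above, though the gap varies by domain.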
## Core Technologies

A.X 3.1 represents **an efficient sovereign AI model**, developed end-to-end by SKT, encompassing model architecture, data curation, infrastructure deployment, and optimization.

### Model Architecture Specs

| Model   | # Params | # Layers | # KV-Heads | Hidden Dim | FFN Dim |
|---------|----------|----------|------------|------------|---------|
| A.X 3.1 | 34B      | 48       | 8          | 8192       | 21824   |
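These values can be cross-checked against the published configuration. A minimal sketch; the attribute names below follow common `transformers` conventions and are an assumption, since this card does not list the config schema (`print(config)` shows the exact fields):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("skt/A.X-3.1")
print(config.num_hidden_layers)    # expected: 48
print(config.num_key_value_heads)  # expected: 8
print(config.hidden_size)          # expected: 8192
print(config.intermediate_size)    # expected: 21824
```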
### High-Quality Data Pipeline & Strategic Mixture

- We collected and curated a training dataset comprising 20 trillion tokens sourced from diverse domains.
- The entire dataset was processed through SKT's proprietary data pipeline, incorporating synthetic data generation and comprehensive quality filtering (see the illustrative sketch after this list).
- For training A.X 3.1, a total of **2.1 trillion tokens** were utilized, comprising a Korean-focused multilingual corpus.
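The pipeline itself is proprietary and not described further in this card; the snippet below is a purely hypothetical illustration of the kind of rule-based quality filter such a pipeline might include, not SKT's actual implementation.

```python
def passes_quality_filter(doc: str) -> bool:
    """Toy heuristic document filter (hypothetical; for illustration only)."""
    if len(doc) < 200:  # drop near-empty documents
        return False
    lines = doc.splitlines()
    if len(set(lines)) < 0.5 * max(len(lines), 1):  # heavy line repetition
        return False
    # Documents that are mostly symbols or markup are likely low quality.
    alnum_ratio = sum(c.isalnum() for c in doc) / len(doc)
    return alnum_ratio > 0.6
```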
## Benchmark Results

### Model Performance

| Category              | Benchmark        | A.X 3.1 | EXAONE-3.5-32B | Kanana-flag-32.5B | Gemma-3-27B | Qwen2.5-32B |
|-----------------------|------------------|---------|----------------|-------------------|-------------|-------------|
| Knowledge             | KMMLU            | 69.73   | 57.17          | 64.19*            | 59.45       | 61.93       |
|                       | KMMLU-pro        | 54.89   | 45.39          | -                 | 50.43       | 52.34       |
|                       | KMMLU-redux      | 62.66   | 48.32          | -                 | 54.85       | 52.15       |
|                       | CLIcK (chat CoT) | 77.09   | 69.42          | -                 | 71.03       | 68.17       |
|                       | MMLU             | 75.20   | 77.1           | 81.08*            | 82.35       | 83.4        |
| General               | Ko-MT-Bench      | 83.06   | 80.19          | 80.58*            | 85.5        | 72.88       |
|                       | MT-Bench         | 84.19   | 85.09          | 83.56*            | 84.38       | 87.31       |
| Instruction Following | Ko-IFEval        | 75.29   | 68.67          | -                 | 74.4        | 73.24       |
|                       | IFEval           | 87.11   | 82.67          | 85.6*             | 82.45       | 82.27       |
| Math                  | HRM8K            | 45.53   | 36.3           | -                 | 48          | 41.29       |
|                       | MATH             | 75.40   | 61.64          | 57.82*            | 80.72       | 73.26       |
| Code                  | HumanEval+       | 75.00   | 77.44          | 77.44*            | 78.66       | 82.32       |
|                       | MBPP+            | 70.90   | 65.87          | 69.84*            | 74.07       | 73.81       |
|                       | LiveCodeBench    | 23.34   | 17.2           | -                 | 30.55       | 26.9        |

\* self-reported score
### Lightweight Model Performance
| Category              | Benchmark     | A.X 3.1 Light | Kanana-1.5-8B | EXAONE-3.5-7.8B | Qwen2.5-7B | Qwen3-8B (w/o reasoning) |
|-----------------------|---------------|---------------|---------------|-----------------|------------|--------------------------|
| Knowledge             | KMMLU         | 61.70         | 48.28         | 53.76           | 49.56      | 63.53                    |
|                       | KMMLU-pro     | 45.54         | 37.63         | 40.11           | 38.87      | 50.71                    |
|                       | KMMLU-redux   | 52.34         | 35.33         | 42.21           | 38.58      | 55.74                    |
|                       | CLIcK         | 71.22         | 61.30         | 64.11           | 58.30      | 63.31                    |
|                       | KoBALT        | 27.43         | 23.14         | 21.71           | 21.57      | 26.57                    |
|                       | MMLU          | 66.95         | 68.82         | 72.20           | 75.40      | 82.89                    |
| General               | Ko-MT-Bench   | 78.56         | 76.30         | 81.06           | 61.31      | 64.06                    |
|                       | MT-Bench      | 74.38         | 77.60         | 83.50           | 79.37      | 65.69                    |
| Instruction Following | Ko-IFEval     | 70.04         | 69.96         | 65.01           | 60.73      | 73.39                    |
|                       | IFEval        | 79.86         | 80.11         | 82.61           | 76.73      | 85.38                    |
| Math                  | HRM8K         | 41.70         | 30.87         | 31.88           | 35.13      | 52.50                    |
|                       | MATH          | 70.14         | 59.28         | 63.20           | 65.58      | 71.48                    |
| Code                  | HumanEval+    | 73.78         | 76.83         | 76.83           | 74.39      | 77.44                    |
|                       | MBPP+         | 61.64         | 67.99         | 64.29           | 68.50      | 62.17                    |
## 🚀 Quickstart

### with HuggingFace Transformers

- `transformers>=4.46.0` or the latest version is required to use `skt/A.X-3.1`

```bash
pip install "transformers>=4.46.0"
```

#### Example Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skt/A.X-3.1"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_name)

# System prompt: "You are an AI expert who translates English sentences
# provided by the user into Korean."
messages = [
    {"role": "system", "content": "당신은 사용자가 제공하는 영어 문장들을 한국어로 번역하는 AI 전문가입니다."},
    {"role": "user", "content": "The first human went into space and orbited the Earth on April 12, 1961."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=128,
        do_sample=False,
    )

len_input_prompt = len(input_ids[0])
response = tokenizer.decode(output[0][len_input_prompt:], skip_special_tokens=True)
print(response)
# Output:
# 우주에서 인간이 처음으로 지구 궤도를 돈 날은 1961년 4월 12일입니다.
```

### with vLLM

- `vllm>=v0.6.4.post1` or the latest version is required to use the tool-use feature

```bash
pip install "vllm>=v0.6.4.post1"
# if you don't want to activate the tool-use feature, just comment out the vLLM options below
VLLM_OPTION="--enable-auto-tool-choice --tool-call-parser hermes"
vllm serve skt/A.X-3.1 $VLLM_OPTION
```

#### Example Usage

```python
from openai import OpenAI

def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
    )
    print(completion.choices[0].message)

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-3.1"

# Query: "What is the appropriate temperature for air conditioning in summer?
# Answer in one line."
messages = [{"role": "user", "content": "에어컨 여름철 적정 온도는? 한줄로 답변해줘"}]
call(messages, model)
# Output:
# 여름철 에어컨 적정 온도는 24~26도입니다.

messages = [{"role": "user", "content": "What is the appropriate temperature for air conditioning in summer? Respond in a single sentence."}]
call(messages, model)
# Output:
# The appropriate temperature for air conditioning in summer is around 78°F (26°C).
```

#### Examples for tool-use

```python
from openai import OpenAI

def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tools
    )
    print(completion.choices[0].message)

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-3.1"

calculate_discount = {
    "type": "function",
    "function": {
        "name": "calculate_discount",
        "description": "원가격과 할인율(퍼센트 단위)을 입력받아 할인된 가격을 계산한다.",
        "parameters": {
            "type": "object",
            "properties": {
                "original_price": {
                    "type": "number",
                    "description": "상품의 원래 가격"
                },
                "discount_percentage": {
                    "type": "number",
                    "description": "적용할 할인율"
                }
            },
            "required": ["original_price", "discount_percentage"]
        }
    }
}
get_exchange_rate = {
    "type": "function",
    "function": {
        "name": "get_exchange_rate",
        "description": "두 통화 간의 환율을 가져온다.",
        "parameters": {
            "type": "object",
            "properties": {
                "base_currency": {
                    "type": "string",
                    "description": "The currency to convert from."
                },
                "target_currency": {
                    "type": "string",
                    "description": "The currency to convert to."
                }
            },
            "required": ["base_currency", "target_currency"]
        }
    }
}
tools = [calculate_discount, get_exchange_rate]

### Slot filling ###
messages = [{"role": "user", "content": "우리가 뭘 사야되는데 원가가 57600원인데 직원할인 받으면 얼마야?"}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='직원 할인율이 몇 퍼센트인지 알려주신다면 할인된 가격을 계산할 수 있습니다. 할인율이 몇 퍼센트인지 알려주실 수 있나요?', role='assistant', tool_calls=[])

### Function calling ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원가가 57600원인데 직원할인 받으면 얼마야?"},
    {"role": "assistant", "content": "직원 할인율이 몇 퍼센트인지 알려주신다면 할인된 가격을 계산할 수 있습니다. 할인율이 몇 퍼센트인지 알려주실 수 있나요?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
]
call(messages, model)
# Output:
# ChatCompletionMessage(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-cb9e827f752d4725abc94377223b2b0f', function=Function(arguments='{"original_price": 57600, "discount_percentage": 15}', name='calculate_discount'), type='function')])

### Completion ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원가가 57600원인데 직원할인 받으면 얼마야?"},
    {"role": "assistant", "content": "직원 할인율이 몇 퍼센트인지 알려주신다면 할인된 가격을 계산할 수 있습니다. 할인율이 몇 퍼센트인지 알려주실 수 있나요?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
    {"role": "tool", "tool_call_id": "random_id", "name": "calculate_discount", "content": "{\"original_price\": 57600, \"discount_percentage\": 15, \"discounted_price\": 48960.0}"}
]
call(messages, model)
# Output:
# ChatCompletionMessage(content='직원 할인을 받으면 57600원의 상품은 15% 할인을 받아 48960원이 됩니다.', role='assistant', tool_calls=[])
```

### Extend supported token length

The `config.json` file of A.X 3.1 uploaded to HuggingFace is configured for a maximum token length of 32,768. You can handle inputs of up to 131,072 tokens by changing the `rope_scaling` field in `config.json` to the following parameters:

```
"rope_scaling": {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
},
```
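Alternatively, if you prefer not to edit the downloaded `config.json`, the same override can likely be applied at load time. This is a minimal sketch, assuming the model's config exposes `rope_scaling` as a standard attribute mirroring the JSON snippet above, rather than an officially documented path:

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

model_name = "skt/A.X-3.1"
config = AutoConfig.from_pretrained(model_name)
# Same YaRN parameters as the config.json snippet above.
config.rope_scaling = {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```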
} }, "required": ["base_currency", "target_currency"] } } } tools = [calculate_discount, get_exchange_rate] ### Slot filling ### messages = [{"role": "user", "content": "์šฐ๋ฆฌ๊ฐ€ ๋ญ˜ ์‚ฌ์•ผ๋˜๋Š”๋ฐ ์›๊ฐ€๊ฐ€ 57600์›์ธ๋ฐ ์ง์›ํ• ์ธ ๋ฐ›์œผ๋ฉด ์–ผ๋งˆ์•ผ?"}] call(messages, model) # Output: # ChatCompletionMessage(content='์ง์› ํ• ์ธ์œจ์ด ๋ช‡ ํผ์„ผํŠธ์ธ์ง€ ์•Œ๋ ค์ฃผ์‹ ๋‹ค๋ฉด ํ• ์ธ๋œ ๊ฐ€๊ฒฉ์„ ๊ณ„์‚ฐํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ํ• ์ธ์œจ์ด ๋ช‡ ํผ์„ผํŠธ์ธ์ง€ ์•Œ๋ ค์ฃผ์‹ค ์ˆ˜ ์žˆ๋‚˜์š”?', role='assistant', tool_calls=[]) ### Function calling ### messages = [ {"role": "user", "content": "์šฐ๋ฆฌ๊ฐ€ ๋ญ˜ ์‚ฌ์•ผ๋˜๋Š”๋ฐ ์›๊ฐ€๊ฐ€ 57600์›์ธ๋ฐ ์ง์›ํ• ์ธ ๋ฐ›์œผ๋ฉด ์–ผ๋งˆ์•ผ?"}, {"role": "assistant", "content": "์ง์› ํ• ์ธ์œจ์ด ๋ช‡ ํผ์„ผํŠธ์ธ์ง€ ์•Œ๋ ค์ฃผ์‹ ๋‹ค๋ฉด ํ• ์ธ๋œ ๊ฐ€๊ฒฉ์„ ๊ณ„์‚ฐํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ํ• ์ธ์œจ์ด ๋ช‡ ํผ์„ผํŠธ์ธ์ง€ ์•Œ๋ ค์ฃผ์‹ค ์ˆ˜ ์žˆ๋‚˜์š”?"}, {"role": "user", "content": "15% ํ• ์ธ ๋ฐ›์„ ์ˆ˜ ์žˆ์–ด."}, ] call(messages, model) # Output: # ChatCompletionMessage(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-cb9e827f752d4725abc94377223b2b0f', function=Function(arguments='{"original_price": 57600, "discount_percentage": 15}', name='calculate_discount'), type='function')]) ### Completion ### messages = [ {"role": "user", "content": "์šฐ๋ฆฌ๊ฐ€ ๋ญ˜ ์‚ฌ์•ผ๋˜๋Š”๋ฐ ์›๊ฐ€๊ฐ€ 57600์›์ธ๋ฐ ์ง์›ํ• ์ธ ๋ฐ›์œผ๋ฉด ์–ผ๋งˆ์•ผ?"}, {"role": "assistant", "content": "์ง์› ํ• ์ธ์œจ์ด ๋ช‡ ํผ์„ผํŠธ์ธ์ง€ ์•Œ๋ ค์ฃผ์‹ ๋‹ค๋ฉด ํ• ์ธ๋œ ๊ฐ€๊ฒฉ์„ ๊ณ„์‚ฐํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ํ• ์ธ์œจ์ด ๋ช‡ ํผ์„ผํŠธ์ธ์ง€ ์•Œ๋ ค์ฃผ์‹ค ์ˆ˜ ์žˆ๋‚˜์š”?"}, {"role": "user", "content": "15% ํ• ์ธ ๋ฐ›์„ ์ˆ˜ ์žˆ์–ด."}, {"role": "tool", "tool_call_id": "random_id", "name": "calculate_discount", "content": "{\"original_price\": 57600, \"discount_percentage\": 15, \"discounted_price\": 48960.0}"} ] call(messages, model) # Output: # ChatCompletionMessage(content='์ง์› ํ• ์ธ์„ ๋ฐ›์œผ๋ฉด 57600์›์˜ ์ƒํ’ˆ์€ 15% ํ• ์ธ์„ ๋ฐ›์•„ 48960์›์ด ๋ฉ๋‹ˆ๋‹ค.', role='assistant', tool_calls=[]) ``` ### Extend supported token length The `config.json` file of A.X 3.1 uploaded to HuggingFace is configured for maximum token lengths of 32,768. You can simply handle up to 131,072 tokens by modifying `rope_scaling` field in `config.json` file into the following parameters: ``` "rope_scaling": { "type": "yarn", "factor": 4.0, "original_max_position_embeddings": 32768, }, ``` ## License The `A.X 3.1` model is licensed under `Apache License 2.0`. ## Citation ``` @article{SKTAdotX3.1, title={A.X 3.1}, author={SKT AI Model Lab}, year={2025}, url={https://huggingface.co/skt/A.X-3.1} } ``` ## Contact - Business & Partnership Contact: [a.x@sk.com](a.x@sk.com)