A.X 3.1

A.X Logo

🤗 Models | 🖥️ GitHub

A.X 3.1 Highlights

On July 24, 2025, SK Telecom released A.X 3.1 (pronounced "A dot X"), a large language model (LLM) optimized for Korean-language understanding and enterprise deployment. This sovereign AI model was developed entirely in-house by SKT, encompassing model architecture, data curation, and training, all carried out on SKT's proprietary supercomputing infrastructure, TITAN. The model was trained from scratch on a high-quality multilingual corpus of 2.1 trillion tokens, with a primary focus on the Korean language.

  • Authentic Korean Sovereign AI: A.X 3.1 was trained on a high-quality multilingual dataset—fully curated in-house—using SKT’s proprietary GPU infrastructure.
  • Highly Efficient Multilingual LLM: A.X 3.1 demonstrates superior performance among Korean LLMs, despite its relatively compact training size of 2.1 trillion tokens.
  • Superior Korean Proficiency: A.X 3.1 achieved a score of 69.2 on KMMLU, the leading benchmark for Korean-language evaluation and a Korean-specific adaptation of MMLU, outperforming other Korean-specialized models.
  • Deep Korean Understanding: A.X 3.1 scored 77.4 on CLIcK, a benchmark for Korean cultural and contextual comprehension, outperforming other open-source models.
  • Efficient Token Usage: A.X 3.1 requires approximately 33% fewer tokens than GPT-4o to process equivalent Korean input, enabling more cost-effective and computationally efficient inference (see the sketch after this list).
  • Long-Context Handling: A.X 3.1 natively supports up to 32,768 tokens, and up to 131,072 tokens with YaRN applied.
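
The token-efficiency claim is easy to spot-check on your own text. The following is a minimal sketch, not from the original card: it assumes tiktoken and transformers are installed, and uses o200k_base, the public encoding used by GPT-4o, as the comparison baseline.

import tiktoken
from transformers import AutoTokenizer

text = "여름철 에어컨 적정 온도는 24~26도입니다."  # any Korean sample text

ax_tokenizer = AutoTokenizer.from_pretrained("skt/A.X-3.1")
gpt4o_encoding = tiktoken.get_encoding("o200k_base")  # GPT-4o's tokenizer

print("A.X 3.1 tokens:", len(ax_tokenizer.encode(text)))
print("GPT-4o tokens: ", len(gpt4o_encoding.encode(text)))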

Core Technologies

A.X 3.1 is an efficient sovereign AI model developed end-to-end by SKT, encompassing model architecture, data curation, infrastructure deployment, and optimization.

Model Architecture Specs

| Model | # Params | # Layers | # KV-Heads | Hidden Dim | FFN Dim |
|-------|----------|----------|------------|------------|---------|
| A.X 3.1 | 34B | 48 | 8 | 8192 | 21824 |
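
These figures can be cross-checked against the hosted configuration. The short sketch below assumes the checkpoint uses the standard transformers config field names, which is an assumption rather than something stated in the card.

from transformers import AutoConfig

config = AutoConfig.from_pretrained("skt/A.X-3.1")
print(config.num_hidden_layers)    # layers; expected 48
print(config.num_key_value_heads)  # KV heads; expected 8
print(config.hidden_size)          # hidden dim; expected 8192
print(config.intermediate_size)    # FFN dim; expected 21824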

High-Quality Data Pipeline & Strategic Mixture

  • We collected and curated a training dataset comprising 20 trillion tokens sourced from diverse domains.
  • The entire dataset was processed through SKT's proprietary data pipeline, incorporating synthetic data generation and comprehensive quality filtering (an illustrative sketch follows this list).
  • A total of 2.1 trillion tokens, comprising a Korean-focused multilingual corpus, were used to train A.X 3.1.
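
SKT's pipeline itself is proprietary and not described in detail. For readers unfamiliar with quality filtering, the hypothetical sketch below shows the kind of lightweight heuristic first-pass filter that pretraining-data pipelines commonly apply; it is illustrative only and not SKT's actual code.

# Illustrative heuristic quality filter (hypothetical, not SKT's pipeline).
def passes_quality_filter(doc: str) -> bool:
    words = doc.split()
    if len(words) < 50:                        # drop very short documents
        return False
    if len(set(words)) / len(words) < 0.3:     # drop highly repetitive text
        return False
    mean_len = sum(len(w) for w in words) / len(words)
    if not (1.0 <= mean_len <= 12.0):          # drop gibberish / run-on tokens
        return False
    return True

corpus = ["first raw document ...", "second raw document ..."]
clean = [doc for doc in corpus if passes_quality_filter(doc)]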

Benchmark Results

Model Performance

| Category | Benchmark | A.X 3.1 | EXAONE-3.5-32B | Kanana-flag-32.5B | Gemma-3-27B | Qwen2.5-32B |
|----------|-----------|---------|----------------|-------------------|-------------|-------------|
| Knowledge | KMMLU | 69.73 | 57.17 | 64.19* | 59.45 | 61.93 |
| | KMMLU-pro | 54.89 | 45.39 | - | 50.43 | 52.34 |
| | KMMLU-redux | 62.66 | 48.32 | - | 54.85 | 52.15 |
| | CLIcK (chat CoT) | 77.09 | 69.42 | - | 71.03 | 68.17 |
| | MMLU | 75.20 | 77.1 | 81.08* | 82.35 | 83.4 |
| General | Ko-MT-Bench | 83.06 | 80.19 | 80.58* | 85.5 | 72.88 |
| | MT-Bench | 84.19 | 85.09 | 83.56* | 84.38 | 87.31 |
| Instruction Following | Ko-IFEval | 75.29 | 68.67 | - | 74.4 | 73.24 |
| | IFEval | 87.11 | 82.67 | 85.6* | 82.45 | 82.27 |
| Math | HRM8K | 45.53 | 36.3 | - | 48 | 41.29 |
| | MATH | 75.40 | 61.64 | 57.82* | 80.72 | 73.26 |
| Code | HumanEval+ | 75.00 | 77.44 | 77.44* | 78.66 | 82.32 |
| | MBPP+ | 70.90 | 65.87 | 69.84* | 74.07 | 73.81 |
| | LiveCodeBench | 23.34 | 17.2 | - | 30.55 | 26.9 |

(* self-reported score)

Lightweight Model Performance

| Category | Benchmark | A.X 3.1 Light | Kanana-1.5-8B | EXAONE-3.5-7.8B | Qwen2.5-7B | Qwen3-8B (w/o reasoning) |
|----------|-----------|---------------|---------------|-----------------|------------|--------------------------|
| Knowledge | KMMLU | 61.70 | 48.28 | 53.76 | 49.56 | 63.53 |
| | KMMLU-pro | 45.54 | 37.63 | 40.11 | 38.87 | 50.71 |
| | KMMLU-redux | 52.34 | 35.33 | 42.21 | 38.58 | 55.74 |
| | CLIcK | 71.22 | 61.30 | 64.11 | 58.30 | 63.31 |
| | KoBALT | 27.43 | 23.14 | 21.71 | 21.57 | 26.57 |
| | MMLU | 66.95 | 68.82 | 72.20 | 75.40 | 82.89 |
| General | Ko-MT-Bench | 78.56 | 76.30 | 81.06 | 61.31 | 64.06 |
| | MT-Bench | 74.38 | 77.60 | 83.50 | 79.37 | 65.69 |
| Instruction Following | Ko-IFEval | 70.04 | 69.96 | 65.01 | 60.73 | 73.39 |
| | IFEval | 79.86 | 80.11 | 82.61 | 76.73 | 85.38 |
| Math | HRM8K | 41.70 | 30.87 | 31.88 | 35.13 | 52.50 |
| | MATH | 70.14 | 59.28 | 63.20 | 65.58 | 71.48 |
| Code | HumanEval+ | 73.78 | 76.83 | 76.83 | 74.39 | 77.44 |
| | MBPP+ | 61.64 | 67.99 | 64.29 | 68.50 | 62.17 |

🚀 Quickstart

with HuggingFace Transformers

  • transformers>=4.46.0 or a later version is required to use skt/A.X-3.1
pip install "transformers>=4.46.0"

Example Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skt/A.X-3.1"
# Load the model in bfloat16 and shard it across the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_name)

# System prompt (Korean): "You are an AI expert who translates the English
# sentences provided by the user into Korean."
messages = [
    {"role": "system", "content": "당신은 사용자가 제공하는 영어 문장들을 한국어로 번역하는 AI 전문가입니다."},
    {"role": "user", "content": "The first human went into space and orbited the Earth on April 12, 1961."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=128,
        do_sample=False,  # greedy decoding for a deterministic translation
    )

# Decode only the newly generated tokens, skipping the prompt.
len_input_prompt = len(input_ids[0])
response = tokenizer.decode(output[0][len_input_prompt:], skip_special_tokens=True)
print(response)
# Output:
# 우주에서 인간이 처음으로 지구 궤도를 돈 날은 1961년 4월 12일입니다.
# (English: "The day a human first orbited the Earth in space was April 12, 1961.")
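
Optionally, you can stream tokens to stdout as they are generated instead of waiting for the full completion. This is a small addition to the example above using the standard transformers TextStreamer helper, not a snippet from the original card.

from transformers import TextStreamer

# Prints decoded tokens as they are produced, skipping the prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
with torch.no_grad():
    model.generate(input_ids, max_new_tokens=128, do_sample=False, streamer=streamer)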

with vLLM

  • vllm>=0.6.4.post1 or a later version is required to use the tool-use feature
pip install "vllm>=0.6.4.post1"
# To disable the tool-use feature, comment out the vLLM options below.
VLLM_OPTION="--enable-auto-tool-choice --tool-call-parser hermes"
vllm serve skt/A.X-3.1 $VLLM_OPTION

Example Usage

from openai import OpenAI

def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
    )
    print(completion.choices[0].message)

client = OpenAI(
    base_url="http://localhost:8000/v1",  # the local vLLM server started above
    api_key="api_key",  # placeholder; vLLM ignores it unless started with --api-key
)
model = "skt/A.X-3.1"
# User (Korean): "What is the appropriate air-conditioner temperature in summer? Answer in one line."
messages = [{"role": "user", "content": "에어컨 여름철 적정 온도는? 한줄로 답변해줘"}]
call(messages, model)
# Output:
# 여름철 에어컨 적정 온도는 24~26도입니다.
# (English: "The appropriate summer air-conditioner temperature is 24–26°C.")

messages = [{"role": "user", "content": "What is the appropriate temperature for air conditioning in summer? Respond in a single sentence."}]
call(messages, model)
# Output:
# The appropriate temperature for air conditioning in summer is around 78°F (26°C).

Examples for tool-use

from openai import OpenAI


def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tools
    )
    print(completion.choices[0].message)


client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-3.1"

calculate_discount = {
    "type": "function",
    "function": {
        "name": "calculate_discount",
        "description": "원가격과 할인율(퍼센트 단위)을 입력받아 할인된 가격을계산한다.",
        "parameters": {
            "type": "object",
            "properties": {
                "original_price": {
                    "type": "number",
                    "description": "상품의 원래 가격"
                },
                "discount_percentage": {
                    "type": "number",
                    "description": "적용할 할인율"
                }
            },
            "required": ["original_price", "discount_percentage"]
        }
    }
}
get_exchange_rate = {
    "type": "function",
    "function": {
        "name": "get_exchange_rate",
        "description": "두 통화 간의 환율을 가져온다.",
        "parameters": {
            "type": "object",
            "properties": {
                "base_currency": {
                    "type": "string",
                    "description": "The currency to convert from."
                },
                "target_currency": {
                    "type": "string",
                    "description": "The currency to convert to."
                }
            },
            "required": ["base_currency", "target_currency"]
        }
    }
}
tools = [calculate_discount, get_exchange_rate]

### Slot filling ###
# User (Korean): "We need to buy something. The original price is 57,600 won;
# how much is it with the employee discount?"
messages = [{"role": "user", "content": "우리가 뭘 사야되는데 원가가 57600원인데 직원할인 받으면 얼마야?"}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='직원 할인율이 몇 퍼센트인지 알려주신다면 할인된 가격을 계산할 수 있습니다. 할인율이 몇 퍼센트인지 알려주실 수 있나요?', role='assistant', tool_calls=[])
# (The assistant asks, in Korean, for the discount percentage before calculating.)


### Function calling ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원가가 57600원인데 직원할인 받으면 얼마야?"},
    {"role": "assistant", "content": "직원 할인율이 몇 퍼센트인지 알려주신다면 할인된 가격을 계산할 수 있습니다. 할인율이 몇 퍼센트인지 알려주실 수 있나요?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
]
call(messages, model)
# Output: 
# ChatCompletionMessage(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-cb9e827f752d4725abc94377223b2b0f', function=Function(arguments='{"original_price": 57600, "discount_percentage": 15}', name='calculate_discount'), type='function')])


### Completion ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원가가 57600원인데 직원할인 받으면 얼마야?"},
    {"role": "assistant", "content": "직원 할인율이 몇 퍼센트인지 알려주신다면 할인된 가격을 계산할 수 있습니다. 할인율이 몇 퍼센트인지 알려주실 수 있나요?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
    {"role": "tool", "tool_call_id": "random_id", "name": "calculate_discount", "content": "{\"original_price\": 57600, \"discount_percentage\": 15, \"discounted_price\": 48960.0}"}
]
call(messages, model)
# Output: 
# ChatCompletionMessage(content='직원 할인을 받으면 57600원의 상품은 15% 할인을 받아 48960원이 됩니다.', role='assistant', tool_calls=[])
# (English: "With the employee discount, the 57,600-won item is discounted 15% to 48,960 won.")
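
In a real application, your own code executes the model's tool call and feeds the result back as the "tool" message shown in the Completion example. Below is a minimal sketch of that loop, starting from the messages list of the Function calling example; calculate_discount_impl is hypothetical glue code, not part of the card.

import json

def calculate_discount_impl(original_price, discount_percentage):
    # Hypothetical local implementation of the calculate_discount tool.
    discounted = original_price * (1 - discount_percentage / 100)
    return {"original_price": original_price,
            "discount_percentage": discount_percentage,
            "discounted_price": discounted}

completion = client.chat.completions.create(model=model, messages=messages, tools=tools)
tool_call = completion.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)  # e.g. {"original_price": 57600, ...}

# Append the tool result and let the model phrase the final answer.
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "name": tool_call.function.name,
    "content": json.dumps(calculate_discount_impl(**args)),
})
call(messages, model)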

Extend supported token length

The config.json file of A.X 3.1 uploaded to Hugging Face is configured for a maximum token length of 32,768. You can handle inputs of up to 131,072 tokens by setting the rope_scaling field in config.json to the following parameters (YaRN with a scaling factor of 4.0, since 4.0 × 32,768 = 131,072):

"rope_scaling": {
  "type": "yarn",
  "factor": 4.0,
  "original_max_position_embeddings": 32768
},
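
Alternatively, the same override can be applied at load time without editing the downloaded file. This is a minimal sketch using the standard transformers config mechanism; it mirrors the rope_scaling values above and is not an official snippet from the card.

import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("skt/A.X-3.1")
config.rope_scaling = {
    "type": "yarn",
    "factor": 4.0,  # 4.0 x 32768 = 131072 tokens
    "original_max_position_embeddings": 32768,
}
model = AutoModelForCausalLM.from_pretrained(
    "skt/A.X-3.1",
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)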

License

The A.X 3.1 model is licensed under the Apache License 2.0.

Citation

@article{SKTAdotX3.1,
  title={A.X 3.1},
  author={SKT AI Model Lab},
  year={2025},
  url={https://huggingface.co/skt/A.X-3.1}
}

Contact
