A.X 3.1

A.X Logo

🤗 Models | 🖥️ GitHub

A.X 3.1 Highlights

On July 24, 2025, SK Telecom released A.X 3.1 (pronounced "A dot X"), a large language model (LLM) optimized for Korean-language understanding and enterprise deployment. This sovereign AI model was developed entirely in-house by SKT, encompassing model architecture, data curation, and training, all carried out on SKT's proprietary supercomputing infrastructure, TITAN. The model was trained from scratch on a high-quality multilingual corpus of 2.1 trillion tokens, with a primary focus on the Korean language.

  • Authentic Korean Sovereign AI: A.X 3.1 was trained on a high-quality multilingual dataset—fully curated in-house—using SKT’s proprietary GPU infrastructure.
  • Highly Efficient Multilingual LLM: A.X 3.1 demonstrates superior performance among Korean LLMs, despite its relatively compact training size of 2.1 trillion tokens.
  • Superior Korean Proficiency: A.X 3.1 achieved a score of 69.2 on KMMLU, the leading benchmark for Korean-language evaluation and a Korean-specific adaptation of MMLU, outperforming other Korean-specialized models.
  • Deep Korean Understanding: A.X 3.1 scored 77.4 on CLIcK, a benchmark for Korean cultural and contextual comprehension, outperforming other open-source models.
  • Efficient Token Usage: A.X 3.1 requires approximately 33% fewer tokens than GPT-4o to process equivalent Korean input, enabling more cost-effective and computationally efficient inference (see the sketch after this list).
  • Long-Context Handling: A.X 3.1 natively supports up to 32,768 tokens, and up to 131,072 tokens with YaRN applied.
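
The token-efficiency claim is easy to spot-check on your own text. The following is a minimal sketch, not from the original card: it assumes tiktoken and transformers are installed, and uses o200k_base, the public encoding used by GPT-4o, as the comparison baseline.

import tiktoken
from transformers import AutoTokenizer

text = "여름철 에어컨 적정 온도는 24~26도입니다."  # any Korean sample text

ax_tokenizer = AutoTokenizer.from_pretrained("skt/A.X-3.1")
gpt4o_encoding = tiktoken.get_encoding("o200k_base")  # GPT-4o's tokenizer

print("A.X 3.1 tokens:", len(ax_tokenizer.encode(text)))
print("GPT-4o tokens: ", len(gpt4o_encoding.encode(text)))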

Core Technologies

A.X 3.1 is an efficient sovereign AI model developed end-to-end by SKT, encompassing model architecture, data curation, infrastructure deployment, and optimization.

Model Architecture Specs

| Model | # Params | # Layers | # KV-Heads | Hidden Dim | FFN Dim |
|-------|----------|----------|------------|------------|---------|
| A.X 3.1 | 34B | 48 | 8 | 8192 | 21824 |
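
These figures can be cross-checked against the hosted configuration. The short sketch below assumes the checkpoint uses the standard transformers config field names, which is an assumption rather than something stated in the card.

from transformers import AutoConfig

config = AutoConfig.from_pretrained("skt/A.X-3.1")
print(config.num_hidden_layers)    # layers; expected 48
print(config.num_key_value_heads)  # KV heads; expected 8
print(config.hidden_size)          # hidden dim; expected 8192
print(config.intermediate_size)    # FFN dim; expected 21824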

High-Quality Data Pipeline & Strategic Mixture

  • We collected and curated a training dataset comprising 20 trillion tokens sourced from diverse domains.
  • The entire dataset was processed through SKT's proprietary data pipeline, incorporating synthetic data generation and comprehensive quality filtering (an illustrative sketch follows this list).
  • A total of 2.1 trillion tokens, comprising a Korean-focused multilingual corpus, were used to train A.X 3.1.
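
SKT's pipeline itself is proprietary and not described in detail. For readers unfamiliar with quality filtering, the hypothetical sketch below shows the kind of lightweight heuristic first-pass filter that pretraining-data pipelines commonly apply; it is illustrative only and not SKT's actual code.

# Illustrative heuristic quality filter (hypothetical, not SKT's pipeline).
def passes_quality_filter(doc: str) -> bool:
    words = doc.split()
    if len(words) < 50:                        # drop very short documents
        return False
    if len(set(words)) / len(words) < 0.3:     # drop highly repetitive text
        return False
    mean_len = sum(len(w) for w in words) / len(words)
    if not (1.0 <= mean_len <= 12.0):          # drop gibberish / run-on tokens
        return False
    return True

corpus = ["first raw document ...", "second raw document ..."]
clean = [doc for doc in corpus if passes_quality_filter(doc)]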

Benchmark Results

Model Performance

| Category | Benchmark | A.X 3.1 | EXAONE-3.5-32B | Kanana-flag-32.5B | Gemma-3-27B | Qwen2.5-32B |
|----------|-----------|---------|----------------|-------------------|-------------|-------------|
| Knowledge | KMMLU | 69.73 | 57.17 | 64.19* | 59.45 | 61.93 |
| | KMMLU-pro | 54.89 | 45.39 | - | 50.43 | 52.34 |
| | KMMLU-redux | 62.66 | 48.32 | - | 54.85 | 52.15 |
| | CLIcK (chat CoT) | 77.09 | 69.42 | - | 71.03 | 68.17 |
| | MMLU | 75.20 | 77.1 | 81.08* | 82.35 | 83.4 |
| General | Ko-MT-Bench | 83.06 | 80.19 | 80.58* | 85.5 | 72.88 |
| | MT-Bench | 84.19 | 85.09 | 83.56* | 84.38 | 87.31 |
| Instruction Following | Ko-IFEval | 75.29 | 68.67 | - | 74.4 | 73.24 |
| | IFEval | 87.11 | 82.67 | 85.6* | 82.45 | 82.27 |
| Math | HRM8K | 45.53 | 36.3 | - | 48 | 41.29 |
| | MATH | 75.40 | 61.64 | 57.82* | 80.72 | 73.26 |
| Code | HumanEval+ | 75.00 | 77.44 | 77.44* | 78.66 | 82.32 |
| | MBPP+ | 70.90 | 65.87 | 69.84* | 74.07 | 73.81 |
| | LiveCodeBench | 23.34 | 17.2 | - | 30.55 | 26.9 |

(* self-reported score)

Lightweight Model Performance

| Category | Benchmark | A.X 3.1 Light | Kanana-1.5-8B | EXAONE-3.5-7.8B | Qwen2.5-7B | Qwen3-8B (w/o reasoning) |
|----------|-----------|---------------|---------------|-----------------|------------|--------------------------|
| Knowledge | KMMLU | 61.70 | 48.28 | 53.76 | 49.56 | 63.53 |
| | KMMLU-pro | 45.54 | 37.63 | 40.11 | 38.87 | 50.71 |
| | KMMLU-redux | 52.34 | 35.33 | 42.21 | 38.58 | 55.74 |
| | CLIcK | 71.22 | 61.30 | 64.11 | 58.30 | 63.31 |
| | KoBALT | 27.43 | 23.14 | 21.71 | 21.57 | 26.57 |
| | MMLU | 66.95 | 68.82 | 72.20 | 75.40 | 82.89 |
| General | Ko-MT-Bench | 78.56 | 76.30 | 81.06 | 61.31 | 64.06 |
| | MT-Bench | 74.38 | 77.60 | 83.50 | 79.37 | 65.69 |
| Instruction Following | Ko-IFEval | 70.04 | 69.96 | 65.01 | 60.73 | 73.39 |
| | IFEval | 79.86 | 80.11 | 82.61 | 76.73 | 85.38 |
| Math | HRM8K | 41.70 | 30.87 | 31.88 | 35.13 | 52.50 |
| | MATH | 70.14 | 59.28 | 63.20 | 65.58 | 71.48 |
| Code | HumanEval+ | 73.78 | 76.83 | 76.83 | 74.39 | 77.44 |
| | MBPP+ | 61.64 | 67.99 | 64.29 | 68.50 | 62.17 |

🚀 Quickstart

with HuggingFace Transformers

  • transformers>=4.46.0 or a later version is required to use skt/A.X-3.1
pip install "transformers>=4.46.0"

Example Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skt/A.X-3.1"
# Load the model in bfloat16 and shard it across the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_name)

# System prompt (Korean): "You are an AI expert who translates the English
# sentences provided by the user into Korean."
messages = [
    {"role": "system", "content": "당신은 사용자가 제공하는 영어 문장들을 한국어로 번역하는 AI 전문가입니다."},
    {"role": "user", "content": "The first human went into space and orbited the Earth on April 12, 1961."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_new_tokens=128,
        do_sample=False,  # greedy decoding for a deterministic translation
    )

# Decode only the newly generated tokens, skipping the prompt.
len_input_prompt = len(input_ids[0])
response = tokenizer.decode(output[0][len_input_prompt:], skip_special_tokens=True)
print(response)
# Output:
# 우주에서 인간이 처음으로 지구 궤도를 돈 날은 1961년 4월 12일입니다.
# (English: "The day a human first orbited the Earth in space was April 12, 1961.")
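
Optionally, you can stream tokens to stdout as they are generated instead of waiting for the full completion. This is a small addition to the example above using the standard transformers TextStreamer helper, not a snippet from the original card.

from transformers import TextStreamer

# Prints decoded tokens as they are produced, skipping the prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
with torch.no_grad():
    model.generate(input_ids, max_new_tokens=128, do_sample=False, streamer=streamer)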

with vLLM

  • vllm>=0.6.4.post1 or a later version is required to use the tool-use feature
pip install "vllm>=0.6.4.post1"
# To disable the tool-use feature, comment out the vLLM options below.
VLLM_OPTION="--enable-auto-tool-choice --tool-call-parser hermes"
vllm serve skt/A.X-3.1 $VLLM_OPTION

Example Usage

from openai import OpenAI

def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
    )
    print(completion.choices[0].message)

client = OpenAI(
    base_url="http://localhost:8000/v1",  # the local vLLM server started above
    api_key="api_key",  # placeholder; vLLM ignores it unless started with --api-key
)
model = "skt/A.X-3.1"
# User (Korean): "What is the appropriate air-conditioner temperature in summer? Answer in one line."
messages = [{"role": "user", "content": "에어컨 여름철 적정 온도는? 한줄로 답변해줘"}]
call(messages, model)
# Output:
# 여름철 에어컨 적정 온도는 24~26도입니다.
# (English: "The appropriate summer air-conditioner temperature is 24–26°C.")

messages = [{"role": "user", "content": "What is the appropriate temperature for air conditioning in summer? Respond in a single sentence."}]
call(messages, model)
# Output:
# The appropriate temperature for air conditioning in summer is around 78°F (26°C).

Examples for tool-use

from openai import OpenAI


def call(messages, model):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tools
    )
    print(completion.choices[0].message)


client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="api_key"
)
model = "skt/A.X-3.1"

calculate_discount = {
    "type": "function",
    "function": {
        "name": "calculate_discount",
        "description": "원가격과 할인율(퍼센트 단위)을 입력받아 할인된 가격을계산한다.",
        "parameters": {
            "type": "object",
            "properties": {
                "original_price": {
                    "type": "number",
                    "description": "상품의 원래 가격"
                },
                "discount_percentage": {
                    "type": "number",
                    "description": "적용할 할인율"
                }
            },
            "required": ["original_price", "discount_percentage"]
        }
    }
}
get_exchange_rate = {
    "type": "function",
    "function": {
        "name": "get_exchange_rate",
        "description": "두 통화 간의 환율을 가져온다.",
        "parameters": {
            "type": "object",
            "properties": {
                "base_currency": {
                    "type": "string",
                    "description": "The currency to convert from."
                },
                "target_currency": {
                    "type": "string",
                    "description": "The currency to convert to."
                }
            },
            "required": ["base_currency", "target_currency"]
        }
    }
}
tools = [calculate_discount, get_exchange_rate]

### Slot filling ###
# User (Korean): "We need to buy something. The original price is 57,600 won;
# how much is it with the employee discount?"
messages = [{"role": "user", "content": "우리가 뭘 사야되는데 원가가 57600원인데 직원할인 받으면 얼마야?"}]
call(messages, model)
# Output:
# ChatCompletionMessage(content='직원 할인율이 몇 퍼센트인지 알려주신다면 할인된 가격을 계산할 수 있습니다. 할인율이 몇 퍼센트인지 알려주실 수 있나요?', role='assistant', tool_calls=[])
# (The assistant asks, in Korean, for the discount percentage before calculating.)


### Function calling ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원가가 57600원인데 직원할인 받으면 얼마야?"},
    {"role": "assistant", "content": "직원 할인율이 몇 퍼센트인지 알려주신다면 할인된 가격을 계산할 수 있습니다. 할인율이 몇 퍼센트인지 알려주실 수 있나요?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
]
call(messages, model)
# Output: 
# ChatCompletionMessage(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-cb9e827f752d4725abc94377223b2b0f', function=Function(arguments='{"original_price": 57600, "discount_percentage": 15}', name='calculate_discount'), type='function')])


### Completion ###
messages = [
    {"role": "user", "content": "우리가 뭘 사야되는데 원가가 57600원인데 직원할인 받으면 얼마야?"},
    {"role": "assistant", "content": "직원 할인율이 몇 퍼센트인지 알려주신다면 할인된 가격을 계산할 수 있습니다. 할인율이 몇 퍼센트인지 알려주실 수 있나요?"},
    {"role": "user", "content": "15% 할인 받을 수 있어."},
    {"role": "tool", "tool_call_id": "random_id", "name": "calculate_discount", "content": "{\"original_price\": 57600, \"discount_percentage\": 15, \"discounted_price\": 48960.0}"}
]
call(messages, model)
# Output: 
# ChatCompletionMessage(content='직원 할인을 받으면 57600원의 상품은 15% 할인을 받아 48960원이 됩니다.', role='assistant', tool_calls=[])
# (English: "With the employee discount, the 57,600-won item is discounted 15% to 48,960 won.")
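
In a real application, your own code executes the model's tool call and feeds the result back as the "tool" message shown in the Completion example. Below is a minimal sketch of that loop, starting from the messages list of the Function calling example; calculate_discount_impl is hypothetical glue code, not part of the card.

import json

def calculate_discount_impl(original_price, discount_percentage):
    # Hypothetical local implementation of the calculate_discount tool.
    discounted = original_price * (1 - discount_percentage / 100)
    return {"original_price": original_price,
            "discount_percentage": discount_percentage,
            "discounted_price": discounted}

completion = client.chat.completions.create(model=model, messages=messages, tools=tools)
tool_call = completion.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)  # e.g. {"original_price": 57600, ...}

# Append the tool result and let the model phrase the final answer.
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "name": tool_call.function.name,
    "content": json.dumps(calculate_discount_impl(**args)),
})
call(messages, model)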

Extend supported token length

The config.json file of A.X 3.1 uploaded to Hugging Face is configured for a maximum token length of 32,768. You can handle inputs of up to 131,072 tokens by setting the rope_scaling field in config.json to the following parameters (YaRN with a scaling factor of 4.0, since 4.0 × 32,768 = 131,072):

"rope_scaling": {
  "type": "yarn",
  "factor": 4.0,
  "original_max_position_embeddings": 32768
},
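
Alternatively, the same override can be applied at load time without editing the downloaded file. This is a minimal sketch using the standard transformers config mechanism; it mirrors the rope_scaling values above and is not an official snippet from the card.

import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("skt/A.X-3.1")
config.rope_scaling = {
    "type": "yarn",
    "factor": 4.0,  # 4.0 x 32768 = 131072 tokens
    "original_max_position_embeddings": 32768,
}
model = AutoModelForCausalLM.from_pretrained(
    "skt/A.X-3.1",
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)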

License

The A.X 3.1 model is licensed under the Apache License 2.0.

Citation

@article{SKTAdotX3.1,
  title={A.X 3.1},
  author={SKT AI Model Lab},
  year={2025},
  url={https://huggingface.co/skt/A.X-3.1}
}

Contact
