	---
	language:
	- en
	- ko
	license: apache-2.0
	tags:
	- text-generation
	- llama
	- mistral
	- fine-tuned
	- mmlu
	- question-answering
	base_model: meta-llama/Meta-Llama-3-8B
	datasets:
	- cais/mmlu
	metrics:
	- accuracy
	model_type: causal-lm
	---

	# WindyLLM 2.3

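	## Model Description

	WindyLLM 2.3 is a conversational language model fine-tuned on the MMLU (Massive Multitask Language Understanding) dataset.

	## Model Information

	- Base model: Llama 3 8B / Mistral 7B
	- Fine-tuning dataset: MMLU (multiple-choice question answering)
	- Languages: English, Korean
	- Parameters: ~7B-8B
	- Training method: LoRA (Low-Rank Adaptation)

	## Performance

	- MMLU accuracy: 40-65%+ (after fine-tuning)
	- Improvement: +15-25% over the baseline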
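
The accuracy figure above is the fraction of multiple-choice questions answered with the correct letter. As a minimal sketch of that metric (the function name `accuracy` and the toy data are illustrative assumptions, not the evaluation script used for this model):

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predicted letters that match the gold letters (case-insensitive)."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        p.strip().upper() == g.strip().upper() for p, g in zip(predictions, gold)
    )
    return correct / len(gold)


# Hypothetical toy data for illustration only; these are not results from this model.
preds = ["B", "a", "C", "D"]
gold = ["B", "A", "D", "D"]
print(f"accuracy = {accuracy(preds, gold):.2%}")
```

With the toy data, three of the four predictions match, so the script reports 75.00%.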

	## Usage

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	# Load the tokenizer and model
	tokenizer = AutoTokenizer.from_pretrained("tklohj/windyllm_2.3")
	model = AutoModelForCausalLM.from_pretrained("tklohj/windyllm_2.3")

	# Inference example
	question = "What is the capital of France?"
	choices = ["London", "Paris", "Berlin", "Rome"]

	prompt = f'''Answer this question with A, B, C, or D.

	{question}

	A) {choices[0]}
	B) {choices[1]}
	C) {choices[2]}
	D) {choices[3]}

	Answer:'''

	inputs = tokenizer(prompt, return_tensors="pt")
	# temperature only takes effect when sampling is enabled
	outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True, temperature=0.1)
	# The decoded text contains the prompt followed by the generated answer
	response = tokenizer.decode(outputs[0], skip_special_tokens=True)
	print(response)
	```
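
Because the decoded `response` repeats the prompt before the generated text, it can help to pull out just the answer letter. The following is a rough sketch of one way to do that; the helper name `extract_choice` and its regex are assumptions made for illustration and are not part of this model's tooling.

```python
import re
from typing import Optional


def extract_choice(response: str) -> Optional[str]:
    """Return the first standalone A-D letter after the last 'Answer:' marker.

    If the marker is absent, the whole string is searched. Returns None when
    no standalone uppercase letter A-D is found.
    """
    # Keep only the text after the final marker, since the prompt is echoed back.
    tail = response.rsplit("Answer:", 1)[-1]
    match = re.search(r"\b([A-D])\b", tail)
    return match.group(1) if match else None


# Hypothetical decoded output, shaped like the prompt used above.
example = "Answer this question with A, B, C, or D.\n\nA) London\nB) Paris\n\nAnswer: B) Paris"
print(extract_choice(example))
```

This is deliberately simple: a generated sentence that happens to begin with a standalone "A" would be read as choice A, so stricter parsing may be needed for free-form outputs.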

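	## Training Details

	- LoRA configuration: rank=64, alpha=128
	- Batch size: 4 (per device)
	- Learning rate: 2e-4
	- Epochs: 3
	- Quantization: 4-bit (bitsandbytes)

	## Limitations

	- Specialized for multiple-choice questions
	- Long-form text generation requires additional tuning
	- Korean performance is limited compared to English

	## License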
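
To give a sense of what rank=64 and alpha=128 imply, the sketch below works through the arithmetic under the standard LoRA formulation, where the low-rank update is scaled by alpha / rank and each adapted weight matrix gains rank × (d_in + d_out) trainable parameters. The 4096-wide square projection is an assumed example shape; the card does not state which modules were adapted.

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """Scaling applied to the low-rank update in the standard LoRA formulation."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added for one adapted matrix: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 64, 128   # values from the training details above
HIDDEN = 4096           # assumed projection width, for illustration only

scaling = lora_scaling(RANK, ALPHA)
adapter_params = lora_param_count(HIDDEN, HIDDEN, RANK)
full_params = HIDDEN * HIDDEN

print(f"scaling factor: {scaling}")
print(f"adapter parameters per matrix: {adapter_params:,}")
print(f"share of the full matrix: {adapter_params / full_params:.3%}")
```

Under these assumptions the scaling factor is 2.0, and each adapted 4096 × 4096 matrix adds 524,288 trainable parameters, about 3.125% of the full matrix.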

	Apache 2.0

	## Citation

	```bibtex
	@misc{windyllm_2.3,
	title={WindyLLM 2.3: MMLU Fine-tuned Language Model},
	author={tklohj},
	year={2025},
	url={https://huggingface.co/tklohj/windyllm_2.3}
	}
	```