---
language:
- ko
license: apache-2.0
library_name: transformers
tags:
- text-generation-inference
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
---
# EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval
## About the Model
This model has been fine-tuned to judge, with a "yes" or "no" answer, whether the context retrieved for a question in a RAG pipeline is sufficient to answer that question.
The base model is [yanolja/EEVE-Korean-Instruct-10.8B-v1.0](https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0).
## Prompt Template
```
์ฃผ์–ด์ง„ ์งˆ๋ฌธ๊ณผ ์ •๋ณด๊ฐ€ ์ฃผ์–ด์กŒ์„ ๋•Œ ์งˆ๋ฌธ์— ๋‹ตํ•˜๊ธฐ์— ์ถฉ๋ถ„ํ•œ ์ •๋ณด์ธ์ง€ ํ‰๊ฐ€ํ•ด์ค˜.
์ •๋ณด๊ฐ€ ์ถฉ๋ถ„ํ•œ์ง€๋ฅผ ํ‰๊ฐ€ํ•˜๊ธฐ ์œ„ํ•ด "์˜ˆ" ๋˜๋Š” "์•„๋‹ˆ์˜ค"๋กœ ๋‹ตํ•ด์ค˜.

### ์งˆ๋ฌธ:
{question}

### ์ •๋ณด:
{context}

### ํ‰๊ฐ€:
```
The prompt (in Korean) asks the model to evaluate whether the given information is sufficient to answer the question, responding with "์˜ˆ" (yes) or "์•„๋‹ˆ์˜ค" (no).
## How to Use It
```python
import torch
from transformers import (
BitsAndBytesConfig,
AutoModelForCausalLM,
AutoTokenizer,
)
model_path = "sinjy1203/EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval"

# 4-bit NF4 quantization config (bitsandbytes)
nf4_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_use_double_quant=True,
bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
model_path, quantization_config=nf4_config, device_map={'': 'cuda:0'}
)
prompt_template = '์ฃผ์–ด์ง„ ์งˆ๋ฌธ๊ณผ ์ •๋ณด๊ฐ€ ์ฃผ์–ด์กŒ์„ ๋•Œ ์งˆ๋ฌธ์— ๋‹ตํ•˜๊ธฐ์— ์ถฉ๋ถ„ํ•œ ์ •๋ณด์ธ์ง€ ํ‰๊ฐ€ํ•ด์ค˜.\n์ •๋ณด๊ฐ€ ์ถฉ๋ถ„ํ•œ์ง€๋ฅผ ํ‰๊ฐ€ํ•˜๊ธฐ ์œ„ํ•ด "์˜ˆ" ๋˜๋Š” "์•„๋‹ˆ์˜ค"๋กœ ๋‹ตํ•ด์ค˜.\n\n### ์งˆ๋ฌธ:\n{question}\n\n### ์ •๋ณด:\n{context}\n\n### ํ‰๊ฐ€:\n'
query = {
"question": "๋™์•„๋ฆฌ ์ข…๊ฐ•์ดํšŒ๊ฐ€ ์–ธ์ œ์ธ๊ฐ€์š”?",
"context": "์ข…๊ฐ•์ดํšŒ ๋‚ ์งœ๋Š” 6์›” 21์ผ์ž…๋‹ˆ๋‹ค."
}
# Build the prompt, move the inputs to the model's device, and generate the grade
model_inputs = tokenizer(prompt_template.format_map(query), return_tensors='pt').to(model.device)
output = model.generate(**model_inputs, max_new_tokens=100)
print(tokenizer.decode(output[0]))
```
### Example Output
```
์ฃผ์–ด์ง„ ์งˆ๋ฌธ๊ณผ ์ •๋ณด๊ฐ€ ์ฃผ์–ด์กŒ์„ ๋•Œ ์งˆ๋ฌธ์— ๋‹ตํ•˜๊ธฐ์— ์ถฉ๋ถ„ํ•œ ์ •๋ณด์ธ์ง€ ํ‰๊ฐ€ํ•ด์ค˜.
์ •๋ณด๊ฐ€ ์ถฉ๋ถ„ํ•œ์ง€๋ฅผ ํ‰๊ฐ€ํ•˜๊ธฐ ์œ„ํ•ด "์˜ˆ" ๋˜๋Š” "์•„๋‹ˆ์˜ค"๋กœ ๋‹ตํ•ด์ค˜.

### ์งˆ๋ฌธ:
๋™์•„๋ฆฌ ์ข…๊ฐ•์ดํšŒ๊ฐ€ ์–ธ์ œ์ธ๊ฐ€์š”?

### ์ •๋ณด:
์ข…๊ฐ•์ดํšŒ ๋‚ ์งœ๋Š” 6์›” 21์ผ์ž…๋‹ˆ๋‹ค.

### ํ‰๊ฐ€:
์˜ˆ<|end_of_text|>
```
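In the example above, the model answers "์˜ˆ" (yes): the information that the closing party is on June 21 is sufficient to answer when it takes place. To use the grade programmatically in a RAG pipeline, the generated continuation can be parsed into a boolean. The helper below is a minimal, hypothetical sketch that reuses the `model`, `tokenizer`, and `prompt_template` objects from the snippet above; it is not part of the released model.
```python
# Minimal sketch (assumption): parse the generated "์˜ˆ"/"์•„๋‹ˆ์˜ค" grade into a boolean.
def grade_retrieval(question: str, context: str) -> bool:
    prompt = prompt_template.format_map({"question": question, "context": context})
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=10, do_sample=False)
    # Decode only the tokens generated after the prompt
    generated = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return generated.strip().startswith("์˜ˆ")

# Example: returns True when the retrieved context is judged sufficient
print(grade_retrieval("๋™์•„๋ฆฌ ์ข…๊ฐ•์ดํšŒ๊ฐ€ ์–ธ์ œ์ธ๊ฐ€์š”?", "์ข…๊ฐ•์ดํšŒ ๋‚ ์งœ๋Š” 6์›” 21์ผ์ž…๋‹ˆ๋‹ค."))
```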
## Training Data
- Instruction data was generated by referencing the instruction-generation approach of [stanford_alpaca](https://github.com/tatsu-lab/stanford_alpaca).
- [yanolja/EEVE-Korean-Instruct-10.8B-v1.0](https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0) was used as the model for question generation.
## Metrics
### Korean LLM Benchmark
| Model | Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2|
|:-------------------------------:|:--------:|:-----:|:---------:|:------:|:------:|:------:|
| EEVE-Korean-Instruct-10.8B-v1.0 | 56.08 | 55.2 | 66.11 | 56.48 | 49.14 | 53.48 |
| EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval | 56.1 | 55.55 | 65.95 | 56.24 | 48.66 | 54.07 |
### Generated Dataset
| Model | Accuracy | F1 | Precision | Recall |
|:-------------------------------:|:--------:|:-----:|:---------:|:------:|
| EEVE-Korean-Instruct-10.8B-v1.0 | 0.824 | 0.800 | 0.885 | 0.697 |
| EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval | 0.892 | 0.875 | 0.903 | 0.848 |