Model Card for odiagenAI-model-v1
Model description
odiagenAI-model-v1 is based on LLaMA-7b and fine-tuned on a 171k-instruction Odia dataset. The instructions were translated into Odia from open-source resources, giving the model good Odia instruction-understanding and response-generation capabilities.
The Odia data-generation code and other details can be found in our GitHub project repository: https://github.com/OdiaGenAI/GenerativeAI_and_LLM_Odia. The released artifact is a low-rank (LoRA) adapter for LLaMA-7b fitted on this translated instruction data, which draws on the Stanford Alpaca dataset.
Training hyper-parameters
| Parameter | Value |
|---|---|
| Batch size | 128 |
| Learning rate | 3e-4 |
| Epochs | 3 |
| Cutoff length | 256 |
| Weight decay | 0.001 |
| Warmup ratio | 0.1 |
| LR scheduler | linear |
| LoRA r | 16 |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj |
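For readers reproducing the setup, the sketch below shows one way these hyper-parameters could map onto a PEFT + Transformers training configuration. This is not the released training script: the output directory, the split of the effective batch size of 128 into per-device batch size and gradient-accumulation steps, and the tokenization call are illustrative assumptions.

```python
# Illustrative mapping of the table above onto PEFT + Transformers objects.
# NOT the authors' released script; paths and the batch-size split are assumptions.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                                                    # LoRA r
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"], # LoRA target modules
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./odiagenai-lora",   # hypothetical output path
    per_device_train_batch_size=8,   # 8 x 16 accumulation steps =
    gradient_accumulation_steps=16,  # effective batch size 128
    learning_rate=3e-4,
    num_train_epochs=3,
    weight_decay=0.001,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
)

# The cutoff length of 256 would apply at tokenization time, e.g.:
# tokenizer(example, truncation=True, max_length=256)
```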
The model can be loaded with `AutoModelForCausalLM`, with the PEFT adapter applied on top:
```python
import torch
from peft import PeftModel
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    BitsAndBytesConfig,
    GenerationConfig,
)

base_model_path = "meta-llama/Llama-2-7b-hf"
adapter_path = "OdiaGenAI/odiagenAI-model-v1"

# LLaMA tokenizers define no pad token, so reuse the EOS token.
tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

# Load the base model in 4-bit NF4 quantization to fit on a single GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the OdiaGenAI LoRA adapter to the quantized base model.
model = PeftModel.from_pretrained(base_model, adapter_path)

instruction = "ଭାରତ ବିଷୟରେ କିଛି କୁହନ୍ତୁ"  # "Tell me something about India"
device = "cuda" if torch.cuda.is_available() else "cpu"
input_ids = tokenizer(instruction, return_tensors="pt").input_ids.to(device)

generation_config = GenerationConfig(
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
)

with torch.no_grad():
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=128,
    )

output = tokenizer.decode(generation_output.sequences[0], skip_special_tokens=True)
print(output)
```
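In this example the 4-bit NF4 quantization (with double quantization and float16 compute) keeps the 7B base model within a single consumer GPU's memory budget, and the low-temperature, 4-beam decoding configuration favors deterministic, high-probability Odia responses over diverse sampling.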
Detailed instructions for running the model can be found at https://github.com/OdiaGenAI/GenerativeAI_and_LLM_Odia.
Licensing Information
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Citation Information
If you find this repository helpful, please consider giving it a 👏 and citing:
```bibtex
@misc{OdiaGenAI,
  author = {Shantipriya Parida and Sambit Sekhar and Subhadarshi Panda and Soumendra Kumar Sahoo and Swateek Jena and Abhijeet Parida and Arghyadeep Sen and Satya Ranjan Dash and Deepak Kumar Pradhan},
  title = {OdiaGenAI: Generative AI and LLM Initiative for the Odia Language},
  year = {2023},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/OdiaGenAI}},
}
```
Contributions
- Shantipriya Parida
- Sambit Sekhar