Model Card for odiagenAI-model-v1

License: CC BY-NC-SA 4.0

Model description

odiagenAI-model-v1 is based on LLaMA-7b and fine-tuned on a 171k-instruction Odia dataset. The instruction set was translated from open-source resources, giving the model good Odia instruction-understanding and response-generation capabilities.

The code for Odia data generation and other details can be found in our GitHub project repository: https://github.com/OdiaGenAI/GenerativeAI_and_LLM_Odia. This repository contains a low-rank (LoRA) adapter for LLaMA-7b fitted on the Stanford Alpaca dataset.
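
Because the instruction data derives from Stanford Alpaca, the model was presumably trained with an Alpaca-style prompt template, and wrapping instructions in the same template at inference time should help. Below is a minimal sketch of such a wrapper; the exact template is an assumption, so check the GitHub repository above for the authoritative training code:

def build_prompt(instruction: str, input_text: str = "") -> str:
    # Alpaca-style template (assumed; verify against the OdiaGenAI training code)
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )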

Training hyper-parameters

Parameter             Value
---------             -----
Batch size            128
Learning rate         3e-4
Epochs                3
Cutoff length         256
Weight decay          0.001
Warmup rate           0.1
LR scheduler          linear
LoRA r                16
LoRA target modules   q_proj, k_proj, v_proj, o_proj
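
For reference, the table maps roughly onto a peft/transformers training setup as sketched below. This is a reconstruction under stated assumptions, not the authors' training script; the per-device batch size / gradient-accumulation split and the output path are placeholders:

from peft import LoraConfig
from transformers import TrainingArguments

# LoRA configuration mirroring the table above
lora_config = LoraConfig(
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Optimizer and schedule settings mirroring the table above
training_args = TrainingArguments(
    output_dir="./odiagenai-lora",   # placeholder path
    per_device_train_batch_size=8,   # 8 x 16 accumulation = effective batch 128 (assumed split)
    gradient_accumulation_steps=16,
    learning_rate=3e-4,
    num_train_epochs=3,
    weight_decay=0.001,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
)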

The model can be loaded with AutoModelForCausalLM, with the PEFT adapter applied on top:

import torch
from peft import PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    GenerationConfig,
)

base_model_path = "meta-llama/Llama-2-7b-hf"
adapter_path = "OdiaGenAI/odiagenAI-model-v1"

tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token

# Quantize the base model to 4-bit (NF4) so the 7B model fits on a single GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Apply the OdiaGenAI LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, adapter_path)

instruction = "ଭାରତ ବିଷୟରେ କିଛି କୁହନ୍ତୁ"  # "Tell me something about India"

device = "cuda" if torch.cuda.is_available() else "cpu"
inputs = tokenizer(instruction, return_tensors="pt").to(device)
input_ids = inputs["input_ids"]

generation_config = GenerationConfig(
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
)

with torch.no_grad():
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=128,
    )

output = tokenizer.decode(generation_output.sequences[0], skip_special_tokens=True)
print(output)
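
If you want to serve the model without the PEFT wrapper, the adapter can also be merged into the base weights. A minimal sketch, reusing the paths and tokenizer from the script above and assuming the base model is reloaded in half precision (merging into a 4-bit quantized model is not straightforward); the output directory is a placeholder:

# Reload the base model unquantized so the LoRA deltas can be folded into the weights
base = AutoModelForCausalLM.from_pretrained(
    base_model_path, torch_dtype=torch.float16, device_map="auto"
)
merged = PeftModel.from_pretrained(base, adapter_path).merge_and_unload()
merged.save_pretrained("./odiagenai-merged")      # placeholder output directory
tokenizer.save_pretrained("./odiagenai-merged")   # bundle the tokenizer alongside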

Instructions for running it can be found at https://github.com/OdiaGenAI/GenerativeAI_and_LLM_Odia.

Licensing Information

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


Citation Information

If you find this repository helpful, please consider giving it a 👏 and citing:

@misc{OdiaGenAI,
  author = {Shantipriya Parida and Sambit Sekhar and Subhadarshi Panda and Soumendra Kumar Sahoo and Swateek Jena and Abhijeet Parida and Arghyadeep Sen and Satya Ranjan Dash and Deepak Kumar Pradhan},
  title = {OdiaGenAI: Generative AI and LLM Initiative for the Odia Language},
  year = {2023},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/OdiaGenAI}},
}

Contributions

  • Shantipriya Parida
  • Sambit Sekhar