Model Card for odiagenAI-model-v1

License: CC BY-NC-SA 4.0

Model description

odiagenAI-model-v1 is based on LLaMA-7b and fine-tuned on a 171k-instruction Odia dataset. The instruction set was translated from open-source resources, giving the model good Odia instruction-understanding and response-generation capabilities.

The code for Odia data generation and other details can be found in our GitHub project repository: https://github.com/OdiaGenAI/GenerativeAI_and_LLM_Odia. This repository contains a low-rank (LoRA) adapter for LLaMA-7b fitted on the Stanford Alpaca dataset.
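
Because the instruction data derives from Stanford Alpaca, the model was presumably trained with an Alpaca-style prompt template, and wrapping instructions in the same template at inference time should help. Below is a minimal sketch of such a wrapper; the exact template is an assumption, so check the GitHub repository above for the authoritative training code:

def build_prompt(instruction: str, input_text: str = "") -> str:
    # Alpaca-style template (assumed; verify against the OdiaGenAI training code)
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )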

Training hyper-parameters

Parameter             Value
---------             -----
Batch size            128
Learning rate         3e-4
Epochs                3
Cutoff length         256
Weight decay          0.001
Warmup rate           0.1
LR scheduler          linear
LoRA r                16
LoRA target modules   q_proj, k_proj, v_proj, o_proj
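
For reference, the table maps roughly onto a peft/transformers training setup as sketched below. This is a reconstruction under stated assumptions, not the authors' training script; the per-device batch size / gradient-accumulation split and the output path are placeholders:

from peft import LoraConfig
from transformers import TrainingArguments

# LoRA configuration mirroring the table above
lora_config = LoraConfig(
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Optimizer and schedule settings mirroring the table above
training_args = TrainingArguments(
    output_dir="./odiagenai-lora",   # placeholder path
    per_device_train_batch_size=8,   # 8 x 16 accumulation = effective batch 128 (assumed split)
    gradient_accumulation_steps=16,
    learning_rate=3e-4,
    num_train_epochs=3,
    weight_decay=0.001,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
)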

The model can be loaded with AutoModelForCausalLM, with the PEFT adapter applied on top:

import torch
from peft import PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    GenerationConfig,
)

base_model_path = "meta-llama/Llama-2-7b-hf"
adapter_path = "OdiaGenAI/odiagenAI-model-v1"

tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token

# Quantize the base model to 4-bit (NF4) so the 7B model fits on a single GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Apply the OdiaGenAI LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, adapter_path)

instruction = "ଭାରତ ବିଷୟରେ କିଛି କୁହନ୍ତୁ"  # "Tell me something about India"

device = "cuda" if torch.cuda.is_available() else "cpu"
inputs = tokenizer(instruction, return_tensors="pt").to(device)
input_ids = inputs["input_ids"]

generation_config = GenerationConfig(
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
)

with torch.no_grad():
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=128,
    )

output = tokenizer.decode(generation_output.sequences[0], skip_special_tokens=True)
print(output)
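
If you want to serve the model without the PEFT wrapper, the adapter can also be merged into the base weights. A minimal sketch, reusing the paths and tokenizer from the script above and assuming the base model is reloaded in half precision (merging into a 4-bit quantized model is not straightforward); the output directory is a placeholder:

# Reload the base model unquantized so the LoRA deltas can be folded into the weights
base = AutoModelForCausalLM.from_pretrained(
    base_model_path, torch_dtype=torch.float16, device_map="auto"
)
merged = PeftModel.from_pretrained(base, adapter_path).merge_and_unload()
merged.save_pretrained("./odiagenai-merged")      # placeholder output directory
tokenizer.save_pretrained("./odiagenai-merged")   # bundle the tokenizer alongside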

Instructions for running it can be found at https://github.com/OdiaGenAI/GenerativeAI_and_LLM_Odia.

Licensing Information

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


Citation Information

If you find this repository helpful, please consider giving it a 👏 and citing:

@misc{OdiaGenAI,
  author = {Shantipriya Parida and Sambit Sekhar and Subhadarshi Panda and Soumendra Kumar Sahoo and Swateek Jena and Abhijeet Parida and Arghyadeep Sen and Satya Ranjan Dash and Deepak Kumar Pradhan},
  title = {OdiaGenAI: Generative AI and LLM Initiative for the Odia Language},
  year = {2023},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/OdiaGenAI}},
}

Contributions

  • Shantipriya Parida
  • Sambit Sekhar