# Model Card for bert-base-swag-lora
## Model Details

### Model Description
This model is a bert-base-uncased model fine-tuned for multiple-choice question answering on the SWAG (Situations with Adversarial Generations) dataset. Fine-tuning used LoRA (Low-Rank Adaptation), a parameter-efficient technique that trains only a small set of low-rank adapter weights, significantly reducing the number of trainable parameters while achieving strong performance.
The model is trained to predict the most plausible continuation of a sentence from four possible choices, testing its commonsense and contextual reasoning abilities.
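For reference, the sketch below shows one way LoRA adapters can be attached to a BERT multiple-choice model with the `peft` library. The hyperparameters (`r`, `lora_alpha`, the targeted attention projections, and the saved classifier head) are illustrative assumptions, not the values used to train this checkpoint.

```python
from transformers import AutoModelForMultipleChoice
from peft import LoraConfig, get_peft_model

base_model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-uncased")

# Hypothetical LoRA configuration -- the actual training hyperparameters
# for this checkpoint may differ.
lora_config = LoraConfig(
    r=8,                                # rank of the low-rank update matrices
    lora_alpha=16,                      # scaling factor for the adapter output
    lora_dropout=0.1,
    target_modules=["query", "value"],  # BERT self-attention projections
    modules_to_save=["classifier"],     # train the multiple-choice head fully
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train
```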
- Developed by: Taha Majlesi
- Model type: BERT (Bidirectional Encoder Representations from Transformers)
- Language(s) (NLP): English
- License: Apache-2.0
- Finetuned from model: google-bert/bert-base-uncased
### Model Sources
- Repository: https://huggingface.co/[Your Hugging Face Username]/bert-base-swag-lora
- Papers: [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805) and [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
## Uses

### Direct Use
This model is intended for multiple-choice question answering, specifically on tasks that require commonsense inference similar to the SWAG dataset. It takes a context and four candidate endings as input and outputs the index of the most plausible ending, as in the example below.
```python
from transformers import AutoModelForMultipleChoice, AutoTokenizer
from peft import PeftModel
import torch

# Define your repository name
repo_name = "[Your Hugging Face Username]/bert-base-swag-lora"
base_model_name = "google-bert/bert-base-uncased"

# Load the base model, then apply the fine-tuned LoRA adapter from the Hub
tokenizer = AutoTokenizer.from_pretrained(repo_name)
base_model = AutoModelForMultipleChoice.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(base_model, repo_name)
model.eval()

# Example in the style of SWAG
context = "A man is skiing down a mountain."
choices = [
    "he falls down and gets back up.",
    "he makes a snowball and throws it.",
    "he takes a picture of the scenery.",
    "he stops to drink some water.",
]

# Pair the context with each of the four candidate endings
prompts = [context] * 4
inputs = tokenizer(prompts, choices, return_tensors="pt", padding=True)

# Reshape from (4, seq_len) to (1, 4, seq_len): one example, four choices
inputs = {k: v.unsqueeze(0) for k, v in inputs.items()}

# Get the prediction; logits have shape (1, num_choices)
with torch.no_grad():
    outputs = model(**inputs)

predicted_index = outputs.logits.argmax(dim=-1).item()
print(f"The most likely ending is: '{choices[predicted_index]}'")
# Expected output: 'he falls down and gets back up.'
```
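For lower-latency inference, the LoRA adapter weights can also be merged back into the base model so that no PEFT wrapper is needed at runtime. A minimal sketch using standard `peft` functionality (the output path is a hypothetical example):

```python
# Merge the LoRA update matrices into the base model's weights and
# return a plain transformers model with no adapter indirection.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("bert-base-swag-lora-merged")  # hypothetical path
```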