# Model Card for bert-base-swag-lora
## Model Details

### Model Description
This model is a bert-base-uncased model fine-tuned for multiple-choice question answering on the SWAG (Situations with Adversarial Generations) dataset. Fine-tuning used LoRA (Low-Rank Adaptation), a parameter-efficient technique that trains only a small set of low-rank adapter weights, significantly reducing the number of trainable parameters while achieving strong performance.
The model is trained to predict the most plausible continuation of a sentence from four possible choices, testing its commonsense and contextual reasoning abilities.
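For reference, the sketch below shows one way LoRA adapters can be attached to a BERT multiple-choice model with the `peft` library. The hyperparameters (`r`, `lora_alpha`, the targeted attention projections, and the saved classifier head) are illustrative assumptions, not the values used to train this checkpoint.

```python
from transformers import AutoModelForMultipleChoice
from peft import LoraConfig, get_peft_model

base_model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-uncased")

# Hypothetical LoRA configuration -- the actual training hyperparameters
# for this checkpoint may differ.
lora_config = LoraConfig(
    r=8,                                # rank of the low-rank update matrices
    lora_alpha=16,                      # scaling factor for the adapter output
    lora_dropout=0.1,
    target_modules=["query", "value"],  # BERT self-attention projections
    modules_to_save=["classifier"],     # train the multiple-choice head fully
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train
```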
- Developed by: Taha Majlesi
- Model type: BERT (Bidirectional Encoder Representations from Transformers)
- Language(s) (NLP): English
- License: Apache-2.0
- Finetuned from model: google-bert/bert-base-uncased
### Model Sources
- Repository: https://huggingface.co/[Your Hugging Face Username]/bert-base-swag-lora
- Papers: [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805) and [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
## Uses

### Direct Use
This model is intended for multiple-choice question answering, specifically on tasks that require commonsense inference similar to the SWAG dataset. It takes a context and four candidate endings as input and outputs the index of the most plausible ending, as in the example below.
```python
from transformers import AutoModelForMultipleChoice, AutoTokenizer
from peft import PeftModel
import torch

# Define your repository name
repo_name = "[Your Hugging Face Username]/bert-base-swag-lora"
base_model_name = "google-bert/bert-base-uncased"

# Load the base model, then apply the fine-tuned LoRA adapter from the Hub
tokenizer = AutoTokenizer.from_pretrained(repo_name)
base_model = AutoModelForMultipleChoice.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(base_model, repo_name)
model.eval()

# Example in the style of SWAG
context = "A man is skiing down a mountain."
choices = [
    "he falls down and gets back up.",
    "he makes a snowball and throws it.",
    "he takes a picture of the scenery.",
    "he stops to drink some water.",
]

# Pair the context with each of the four candidate endings
prompts = [context] * 4
inputs = tokenizer(prompts, choices, return_tensors="pt", padding=True)

# Reshape from (4, seq_len) to (1, 4, seq_len): one example, four choices
inputs = {k: v.unsqueeze(0) for k, v in inputs.items()}

# Get the prediction; logits have shape (1, num_choices)
with torch.no_grad():
    outputs = model(**inputs)

predicted_index = outputs.logits.argmax(dim=-1).item()
print(f"The most likely ending is: '{choices[predicted_index]}'")
# Expected output: 'he falls down and gets back up.'
```
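For lower-latency inference, the LoRA adapter weights can also be merged back into the base model so that no PEFT wrapper is needed at runtime. A minimal sketch using standard `peft` functionality (the output path is a hypothetical example):

```python
# Merge the LoRA update matrices into the base model's weights and
# return a plain transformers model with no adapter indirection.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("bert-base-swag-lora-merged")  # hypothetical path
```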