This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Merchant Name Extraction Model

This model extracts merchant names from transaction descriptions using Named Entity Recognition (NER).

Model Details

Model Type: DistilBERT for Token Classification
Task: Merchant Name Extraction
Language: English
Framework: PyTorch + Transformers

Usage

from transformers import DistilBertTokenizerFast, DistilBertForTokenClassification
import torch

# Load model and tokenizer
model = DistilBertForTokenClassification.from_pretrained("GalalEwida/SIA-MerchentName")
tokenizer = DistilBertTokenizerFast.from_pretrained("GalalEwida/SIA-MerchentName")

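# Prediction function
def extract_merchant(text):
    """Return the merchant name predicted for a transaction description."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.argmax(outputs.logits, dim=2)

    tokens = tokenizer.convert_ids_to_tokens(inputs['input_ids'][0])
    id2label = {0: 'O', 1: 'B-MERCHANT', 2: 'I-MERCHANT'}
    predicted_labels = [id2label[pred.item()] for pred in predictions[0]]

    # Special tokens are added by the tokenizer and are never part of a merchant name
    special_tokens = {tokenizer.cls_token, tokenizer.sep_token, tokenizer.pad_token}

    merchant_tokens = []
    for token, label in zip(tokens, predicted_labels):
        if token in special_tokens:
            continue
        if label in ['B-MERCHANT', 'I-MERCHANT']:
            if token.startswith('##'):
                # WordPiece continuation: append it to the previous merchant piece
                if merchant_tokens:
                    merchant_tokens[-1] += token[2:]
            else:
                merchant_tokens.append(token)

    return ' '.join(merchant_tokens)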

# Example usage
text = "WALMART SUPERCENTER #1234 ANYTOWN US"
merchant = extract_merchant(text)
print(f"Extracted: {merchant}")
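
The token-merging step inside extract_merchant can be tried on its own, without downloading the model. The following is a minimal sketch under stated assumptions: merge_merchant_pieces is a hypothetical helper name, and the tokens and labels are hand-written for illustration rather than real model output. Unlike the loop above, it only glues a "##" piece onto a word when the piece immediately before it was also kept, which is one possible design choice and not necessarily the author's.

```python
def merge_merchant_pieces(tokens, labels):
    """Join WordPiece tokens labelled as merchant into whole words."""
    words = []
    prev_kept = False  # True when the previous token was kept as merchant text
    for token, label in zip(tokens, labels):
        is_merchant = label in ("B-MERCHANT", "I-MERCHANT")
        if token.startswith("##"):
            # Glue a continuation piece on only if the piece before it was kept
            if is_merchant and prev_kept and words:
                words[-1] += token[2:]
                prev_kept = True
            else:
                prev_kept = False
        elif is_merchant:
            words.append(token)
            prev_kept = True
        else:
            prev_kept = False
    return " ".join(words)


# Hand-written example, not real model output
tokens = ["[CLS]", "star", "##bucks", "store", "#", "01", "##23", "[SEP]"]
labels = ["O", "B-MERCHANT", "I-MERCHANT", "O", "O", "O", "O", "O"]
print(merge_merchant_pieces(tokens, labels))
```

If the tokenizer lowercases its input, the merged result is lowercase as well, so callers that need the original casing would have to map it back to the input text themselves.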

Labels

O: Outside (not part of merchant name)
B-MERCHANT: Beginning of merchant name
I-MERCHANT: Inside merchant name
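
To make the tagging scheme concrete, the sketch below groups a sequence of these labels into contiguous merchant spans. It is an illustration only: bio_to_spans is a hypothetical helper, the words and labels are made up, and treating a stray I-MERCHANT as the start of a span is a lenient choice assumed here, not something the model card specifies.

```python
def bio_to_spans(labels):
    """Return (start, end) index pairs, end exclusive, for each merchant span."""
    spans = []
    start = None  # index where the currently open span began, if any
    for i, label in enumerate(labels):
        if label == "B-MERCHANT":
            # A new B- tag closes any open span and opens a fresh one
            if start is not None:
                spans.append((start, i))
            start = i
        elif label == "I-MERCHANT":
            # Assumption: an I- tag with no open span starts one
            if start is None:
                start = i
        else:
            # "O" closes any open span
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(labels)))
    return spans


# Made-up word-level example, not real model output
words = ["BLUE", "BOTTLE", "#12", "OAKLAND", "CA"]
labels = ["B-MERCHANT", "I-MERCHANT", "O", "O", "O"]
for start, end in bio_to_spans(labels):
    print(" ".join(words[start:end]))
```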

Example Predictions

Input	Extracted Merchant
WALMART SUPERCENTER #1234 ANYTOWN US	WALMART
AMAZON.COM AMZN.COM/BILL WA	AMAZON
STARBUCKS STORE #0123 NEW YORK NY	STARBUCKS
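
The rows above can double as a quick sanity check for any extraction function. The sketch below assumes a hypothetical helper, check_examples, and compares case-insensitively in case the tokenizer lowercases its input. The first_word_stub function is a stand-in so the snippet runs without the model; in practice extract_merchant from the Usage section would be passed instead.

```python
import re

# Input/expected pairs copied from the example predictions
CASES = [
    ("WALMART SUPERCENTER #1234 ANYTOWN US", "WALMART"),
    ("AMAZON.COM AMZN.COM/BILL WA", "AMAZON"),
    ("STARBUCKS STORE #0123 NEW YORK NY", "STARBUCKS"),
]


def check_examples(extract_fn, cases):
    """Return (input, expected, actual) triples for every mismatch."""
    failures = []
    for text, expected in cases:
        actual = extract_fn(text)
        # Case-insensitive comparison, ignoring surrounding whitespace
        if actual.strip().lower() != expected.strip().lower():
            failures.append((text, expected, actual))
    return failures


def first_word_stub(text):
    """Hypothetical stand-in for extract_merchant: the leading run of letters."""
    match = re.match(r"[A-Za-z]+", text)
    return match.group(0) if match else ""


failures = check_examples(first_word_stub, CASES)
print(f"{len(CASES) - len(failures)} of {len(CASES)} examples matched")
```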

Model Size

Format: Safetensors
Parameters: 66.4M
Tensor Type: F32
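
As a rough guide, the listed parameter count and tensor type give an estimate of the memory needed for the weights alone. The sketch below is a back-of-the-envelope calculation: weight_bytes is a hypothetical helper, 66.4M is a rounded figure, and the result excludes activations and framework overhead.

```python
# Bytes used per parameter for common tensor types
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_bytes(num_params, tensor_type="F32"):
    """Estimate the size of the model weights in bytes."""
    return num_params * BYTES_PER_PARAM[tensor_type]


size = weight_bytes(66_400_000, "F32")
print(f"Approximate weight size: {size / 1e6:.1f} MB")
```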