DiffuBERTa: JSON Extraction Adapter

This model is a fine-tuned version of answerdotai/ModernBERT-base trained with LoRA. It is designed to extract structured JSON from unstructured text using a parallel decoding approach.

Model Performance

  • Final Training Loss: 4.7773
  • Final Evaluation Loss: 4.3166
  • Training Epochs: 5
  • Date Trained: 2025-11-28

🚀 Live Demo Output

(Generated automatically after training)

Input Text:

"We are excited to welcome Dr. Sarah to our Paris office as Senior Data Scientist."

Template:

{'name': '[1]', 'job': '[2]', 'city': '[1]'}

Model Output:

{
  "name": "Sarah",
  "job": "Data scientist",
  "city": "Paris"
}
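One plausible reading of the bracket notation (an assumption, not stated above): `[n]` reserves n mask tokens for that field, which matches the demo where "Sarah" and "Paris" each get `[1]` and the two-token "Data scientist" gets `[2]`. A minimal sketch of that expansion, with `MASK_TOKEN` and `expand_template` both hypothetical names:

```python
import re

# Hypothetical mask token string; the real value comes from the tokenizer.
MASK_TOKEN = "[MASK]"

def expand_template(template: str) -> str:
    """Replace each '[n]' placeholder with n mask tokens so the whole
    template can be filled by a masked-language model in one pass."""
    return re.sub(r"\[(\d+)\]", lambda m: MASK_TOKEN * int(m.group(1)), template)

print(expand_template("{'name': '[1]', 'job': '[2]', 'city': '[1]'}"))
# {'name': '[MASK]', 'job': '[MASK][MASK]', 'city': '[MASK]'}
```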

Usage

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer
from peft import PeftModel

# Load the ModernBERT base model, then apply the DiffuBERTa LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
base_model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")
model = PeftModel.from_pretrained(base_model, "philipp-zettl/DiffuBERTa")
# ... use extract_parallel helper ...
```
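The `extract_parallel` helper itself is not shown above. The sketch below is one hypothetical way such parallel decoding could work: expand the template's placeholders into mask tokens, run a single forward pass, and fill every mask at once by greedy argmax. The function body and template semantics are assumptions, not the author's implementation.

```python
import re
import torch

def extract_parallel(model, tokenizer, text: str, template: str) -> str:
    """Hypothetical parallel decoder: fill every mask in one forward pass."""
    # Expand "[n]" placeholders into n mask tokens (assumed template semantics).
    prompt = re.sub(r"\[(\d+)\]",
                    lambda m: tokenizer.mask_token * int(m.group(1)), template)
    # Encode the source text and the masked template as a sentence pair.
    inputs = tokenizer(text, prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    ids = inputs["input_ids"][0].clone()
    mask_positions = (ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    # Decode all masked positions simultaneously (greedy argmax, no iteration).
    ids[mask_positions] = logits[0, mask_positions].argmax(dim=-1)
    return tokenizer.decode(ids[mask_positions])
```

In practice a helper like this would also need to re-insert the predicted spans back into the JSON skeleton; this sketch only returns the decoded mask tokens.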