# LEXIS Gemma-3-12B-IT – Obligation Generator
A QLoRA fine-tune of `google/gemma-3-12b-it` for extracting legal obligations from contracts.
## Training
| Config | Value |
|---|---|
| Quantization | 4-bit NF4 + double quantization |
| LoRA | r=64, alpha=128, dropout=0.05 |
| Target modules | q/k/v/o_proj, gate/up/down_proj |
| Learning rate | 2e-4 (cosine schedule) |
| Epochs | 3 |
| Effective batch size | 32 (4 × 8 gradient accumulation) |
| Max sequence length | 2048 |
| Precision | bf16 |
| Dataset | CUAD (obligation-filtered) |
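The configuration above corresponds roughly to a setup like the following (a sketch, assuming the Hugging Face `transformers`, `peft`, and `bitsandbytes` libraries; the argument values come from the table, everything else is illustrative):

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization (the QLoRA base setup)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapters on all attention and MLP projections, per the table above
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```

These two objects would be passed to `from_pretrained(..., quantization_config=bnb_config)` and `get_peft_model(model, lora_config)` respectively; the exact training loop is not specified in this card.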
## Evaluation Results
| Metric | Baseline | Fine-Tuned | Improvement |
|---|---|---|---|
| Valid JSON | 100.0% | 100.0% | +0.0% |
| Schema Valid | 100.0% | 100.0% | +0.0% |
| Deontic Accuracy | 58.0% | 77.0% | +19.0% |
| ROUGE-L | 55.6% | 84.5% | +28.9% |
| Field Completeness | 100.0% | 100.0% | +0.0% |
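The Valid JSON, Schema Valid, and Field Completeness metrics can be reproduced on a single model output with a check like the one below (a minimal sketch using only the standard library; the field names are hypothetical, since the card does not specify the output schema):

```python
import json

# Hypothetical required fields; the actual schema is not specified in this card.
REQUIRED_FIELDS = {"obligor", "action", "deontic_modality"}

def check_output(text):
    """Return (valid_json, schema_valid, field_complete) for one model output."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return (False, False, False)
    if not isinstance(obj, dict):
        return (True, False, False)
    schema_valid = REQUIRED_FIELDS <= obj.keys()
    # Field completeness: every required field present and non-empty
    field_complete = schema_valid and all(obj[f] for f in REQUIRED_FIELDS)
    return (True, schema_valid, field_complete)

print(check_output(
    '{"obligor": "Supplier", "action": "deliver goods", "deontic_modality": "shall"}'
))
# → (True, True, True)
```

Averaging each flag over a held-out set yields the percentage-style metrics reported above.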
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model = AutoModelForCausalLM.from_pretrained(
    "coder123d/lexis-gemma3-12b-obligation-generator",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("coder123d/lexis-gemma3-12b-obligation-generator")

messages = [{"role": "user", "content": "Extract the legal obligation from: ...\n\nReturn JSON."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```