IL-PCSR (Indian Legal - Precedent & Statute Retrieval)
Ensemble Model: A hybrid approach that combines lexical features (BM25 5-gram) with semantic/distributional features (Para-GNN) and weights the two dynamically. It is effective for both legal statute retrieval (LSR) and prior case retrieval (PCR).
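To make the idea of combining lexical and semantic scores concrete, here is a minimal sketch of score fusion by weighted interpolation. It is not the released ensemble: the function names, the min-max normalization, and the fixed `weight` parameter are all assumptions made for illustration, whereas the actual model sets the weighting dynamically in a way this card does not specify.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse_scores(
    lexical: Dict[str, float],
    semantic: Dict[str, float],
    weight: float,
) -> Dict[str, float]:
    """Interpolate normalized lexical and semantic scores.

    `weight` (hypothetical parameter) is the share given to the lexical
    score; a candidate missing from one scorer contributes 0 for it.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    lex = min_max_normalize(lexical)
    sem = min_max_normalize(semantic)
    return {
        doc_id: weight * lex.get(doc_id, 0.0)
        + (1.0 - weight) * sem.get(doc_id, 0.0)
        for doc_id in set(lex) | set(sem)
    }


def rank(fused: Dict[str, float]) -> List[str]:
    """Return candidate ids by descending fused score (ties broken by id)."""
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))
```

With `weight=1.0` the ranking reduces to the lexical scorer alone, and with `weight=0.0` to the semantic scorer alone. A dynamic scheme would choose this weight per query or per candidate instead of fixing it.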
Summary of Model files
We have 5 files for 3 different types of models:
only_secs_model.bin, only_precs_model.bin: separate models for LSR and PCR, fine-tuned independently
multi_task_model.bin: a single model for both LSR and PCR, fine-tuned together in a multi-task setup
pipeline_secs.bin, pipeline_precs.bin: separate models for LSR and PCR obtained via transfer learning (pipeline_secs.bin is obtained by LSR training on only_precs_model.bin, i.e., transfer PCR --> LSR; pipeline_precs.bin is obtained by PCR training on only_secs_model.bin, i.e., transfer LSR --> PCR)
All of these models have been trained on summaries of queries and precedents, not on full documents.
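The mapping from training setup and task to checkpoint file can be written down as a small lookup table. This is only a convenience sketch. The setup labels (`independent`, `multi_task`, `pipeline`) and the helper name are hypothetical, and the file names are copied from the summary above. The download snippet under How to Use spells the multi-task file as `multitask_model.bin`, so check the repository's file listing for the exact name before downloading.

```python
# File names as listed in the summary of model files; the setup labels
# used as keys are hypothetical and exist only for this example.
CHECKPOINTS = {
    ("independent", "LSR"): "only_secs_model.bin",
    ("independent", "PCR"): "only_precs_model.bin",
    ("multi_task", "LSR"): "multi_task_model.bin",
    ("multi_task", "PCR"): "multi_task_model.bin",
    ("pipeline", "LSR"): "pipeline_secs.bin",   # transfer PCR --> LSR
    ("pipeline", "PCR"): "pipeline_precs.bin",  # transfer LSR --> PCR
}


def checkpoint_for(setup: str, task: str) -> str:
    """Return the checkpoint file name for a training setup and task."""
    key = (setup, task.upper())
    if key not in CHECKPOINTS:
        valid = sorted(CHECKPOINTS)
        raise KeyError(f"unknown (setup, task) {key}; expected one of {valid}")
    return CHECKPOINTS[key]
```

The returned name can then be passed as the `filename` argument of `hf_hub_download`.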
How to Use
The examples below assume you have been granted access to the gated repository (i.e., your access request has been accepted). Use huggingface_hub to download the model weights to a local file; the file can then be loaded like any standard PyTorch state dict.
from huggingface_hub import hf_hub_download
# Download to local file
file_path = hf_hub_download(
repo_id="Exploration-Lab/IL-PCSR-Models",
filename="multitask_model.bin"
)
print("Model weights downloaded to:", file_path)
import torch
# Load the state dict in PyTorch (on CPU)
trained_state_dict = torch.load(file_path, map_location=torch.device('cpu'))
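Before loading the weights into a model, it can help to inspect what the checkpoint contains. The sketch below lists each tensor's name, shape, and element count. The helper names are hypothetical, and the code assumes only that the loaded object is a mapping from parameter names to tensor-like values, which is the usual layout of a PyTorch state dict.

```python
from typing import Any, List, Mapping, Tuple


def summarize_state_dict(
    state_dict: Mapping[str, Any],
) -> List[Tuple[str, Tuple[int, ...], int]]:
    """Return (name, shape, element count) for every tensor-like entry."""
    rows = []
    for name, value in state_dict.items():
        # Skip non-tensor entries; some checkpoints store extra metadata.
        if not hasattr(value, "shape") or not hasattr(value, "numel"):
            continue
        rows.append((name, tuple(value.shape), int(value.numel())))
    return rows


def total_parameters(state_dict: Mapping[str, Any]) -> int:
    """Sum the element counts of all tensor-like entries."""
    return sum(count for _, _, count in summarize_state_dict(state_dict))


# Usage with the state dict loaded above:
# for name, shape, count in summarize_state_dict(trained_state_dict):
#     print(f"{name:60s} {shape} {count}")
# print("Total parameters:", total_parameters(trained_state_dict))
```

To run inference, the weights still have to be loaded into a matching model instance with the standard `model.load_state_dict(trained_state_dict)` call. The model class itself is not part of this card and has to come from the authors' code.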
Citation
@inproceedings{il-pcsr2025,
title = "IL-PCSR: Legal Corpus for Prior Case and Statute Retrieval",
author = "Paul, Shounak and Ghumare, Dhananjay and Goyal, Pawan and Ghosh, Saptarshi and Modi, Ashutosh",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
note = "To Appear"
}