Rick Brannan
RickBrannan
AI & ML interests
translation, named entity recognition, text classification
Organizations
Datasets
Text Classification
- LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification
  Paper • 2411.19638 • Published • 6
- Word Sense Linking: Disambiguating Outside the Sandbox
  Paper • 2412.09370 • Published • 10
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  Paper • 2412.13663 • Published • 158
- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 376
Low Resource Languages
- UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages
  Paper • 2411.14343 • Published • 7
- SPRING Lab IITM's submission to Low Resource Indic Language Translation Shared Task
  Paper • 2411.00727 • Published • 1
- Cross-lingual transfer of multilingual models on low resource African Languages
  Paper • 2409.10965 • Published
- LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models
  Paper • 2501.00874 • Published • 13
Hallucinations
- HALoGEN: Fantastic LLM Hallucinations and Where to Find Them
  Paper • 2501.08292 • Published • 17
- SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
  Paper • 2502.09604 • Published • 37
- MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs
  Paper • 2505.24858 • Published • 17
- Investigating Hallucination in Conversations for Low Resource Languages
  Paper • 2507.22720 • Published • 6
Machine Translation
Long Context
Multimodal RAG
- M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding
  Paper • 2411.04952 • Published • 30
- SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
  Paper • 2412.15443 • Published • 10
- ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations
  Paper • 2504.00824 • Published • 43
Synthetic Data