Reformulating the RL of reasoning LLMs through the Markovian Thinking paradigm.
AI & ML interests
computational linguistics, natural language processing
Papers
Value Drifts: Tracing Value Alignment During LLM Post-Training
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages
- McGill-NLP/AfroXLMR-large-76L-Injongo-intent
  Text Classification • 0.6B • Updated • 5
- McGill-NLP/AfroXLMR-large-76L-Injongo-slot
  Token Classification • 0.6B • Updated • 5
- McGill-NLP/gemma-2-9b-it-Injongo-intent
  Text Generation • 9B • Updated • 9
- McGill-NLP/gemma-2-9b-it-Injongo-slot
  Text Generation • 9B • Updated • 7
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
  Paper • 2504.08942 • Published • 28
- McGill-NLP/agent-reward-bench
  Viewer • Updated • 1.41k • 5.96k • 4
- Agent Reward Bench Demo
  💻 4 • Explore agent trajectories and judgments in web benchmarks
- Agent Reward Bench Leaderboard
  🥇 3 • Leaderboard for AgentRewardBench
- McGill-NLP/LLM2Vec-Meta-Llama-32-3B-Instruct-mntp-supervised
  Updated
- McGill-NLP/LLM2Vec-Meta-Llama-31-8B-Instruct-mntp-supervised
  Sentence Similarity • Updated • 74 • 4
- McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised
  Sentence Similarity • Updated • 9.89k • 50
- McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised
  Sentence Similarity • Updated • 331 • 13
Repository: https://github.com/McGill-NLP/AURORA
mcgill-nlp.github.io/statcan-dialogue-dataset
- The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents
  Paper • 2304.01412 • Published • 2
- McGill-NLP/statcan-dialogue-dataset
  Preview • Updated • 16 • 7
- McGill-NLP/dpr-statcan-conversation_encoder-title
  Feature Extraction • 0.1B • Updated • 8
- McGill-NLP/tapas-statcan-large-conversation_encoder-cell_tokens
  Feature Extraction • Updated • 3
- Back-Training excels Self-Training at Unsupervised Domain Adaptation of Question Generation and Passage Retrieval
  Paper • 2104.08801 • Published • 1
- McGill-NLP/mlquestions
  Updated • 118 • 3
- McGill-NLP/bart-qg-mlquestions-backtraining
  Updated • 17
- McGill-NLP/bart-qg-mlquestions-selftraining
  Updated • 5
Datasets used for the OLMo experiments in the "Not All Data are Unlearned Equally" paper https://arxiv.org/abs/2504.05058
Generate challenging synthetic data to evaluate LLMs
https://mcgill-nlp.github.io/weblinx