edinlp

community

Recent Activity

simonycl authored a paper 7 days ago

Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity

simonycl authored a paper 14 days ago

GEM: A Gym for Agentic LLMs

simonycl authored a paper 4 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

models 42

edinlp/qwen-2.5-base-rlhf-zero-iter6

8B • Updated Feb 22

edinlp/qwen-2.5-base-rlhf-zero-iter5

8B • Updated Feb 22

edinlp/qwen-2.5-base-rlhf-zero-iter4

8B • Updated Feb 22

edinlp/qwen-2.5-base-rlhf-zero-iter3

8B • Updated Feb 22

edinlp/qwen-2.5-base-rlhf-zero-iter2

8B • Updated Feb 22

edinlp/qwen-2.5-base-rlhf-zero-iter1

edinlp/qwen2-7b-offline-dpo

8B • Updated Nov 16, 2024

edinlp/llama-3-8b-offline-dpo

8B • Updated Nov 15, 2024

edinlp/mistral-7b-v0.3-dpo

Text Generation • 7B • Updated Oct 12, 2024 • 1

edinlp/mistral-7b-v0.3-sft

Text Generation • 7B • Updated Oct 11, 2024 • 1

datasets 1

edinlp/Countdown

Updated Jun 4 • 329k • 5