Hamish Ivison

hamishivi

https://ivison.id.au

AI & ML interests

NLP :)

Recent Activity

upvoted a paper about 22 hours ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

updated a model 5 days ago

hamishivi/2010_rl_rag_NAR8_testing64_gpt5_sft_31605_no_cite__1__1764018132_step_2450

published a model 5 days ago

hamishivi/2010_rl_rag_NAR8_testing64_gpt5_sft_31605_no_cite__1__1764018132_step_2450

hamishivi's collections 8

RLVE

Models for "RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments" - https://arxiv.org/abs/2511.07317

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Paper • 2511.07317 • Published 24 days ago • 13
hamishivi/OpenThinker3-1.5B-RLVE

Text Generation • 2B • Updated 23 days ago • 61 • 1
hamishivi/Nemotron-Research-Reasoning-Qwen-1.5B-v2-RLVE

Text Generation • 2B • Updated 23 days ago • 51 • 2

TESS 2

Models associated with the paper "TESS-2: A Large-Scale, Generalist Diffusion Language Model". Code: https://github.com/hamishivi/tess-2

TESS 2: A Large-Scale Generalist Diffusion Language Model

Paper • 2502.13917 • Published Feb 19 • 6
hamishivi/tess2-v0.3

7B • Updated Feb 20 • 45 • 5
hamishivi/tess2-v0.1

7B • Updated Feb 20 • 5
hamishivi/tess2-v0.3-base

7B • Updated Feb 20 • 14

7b tulu 2.5

A small run at 7B scale with PPO, following the "Unpacking DPO and PPO" paper.

hamishivi/tulu-v2.5-7b-uf-mean-7b-uf-rm

Text Generation • 7B • Updated Jun 25, 2024 • 5
hamishivi/tulu-v2.5-7b-uf-mean-7b-uf-rm-value

Token Classification • 7B • Updated Jun 25, 2024 • 5
hamishivi/tulu-v2.5-7b-uf-rm

Text Classification • 7B • Updated Jun 25, 2024 • 2

Tulu V1 Suite

The set of models associated with the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources".

allenai/tulu-65b

Text Generation • Updated Jun 29, 2023 • 66 • 21
allenai/tulu-30b

Text Generation • Updated Jun 20, 2023 • 61 • 18
allenai/tulu-13b

Text Generation • Updated Jun 20, 2023 • 71 • 8
allenai/tulu-7b

Text Generation • Updated Jun 20, 2023 • 72 • 9

Large-Scale Data Selection for Instruction Tuning

Datasets and models associated with the paper "Large-Scale Data Selection for Instruction Tuning" (https://arxiv.org/abs/2503.01807)

Large-Scale Data Selection for Instruction Tuning

Paper • 2503.01807 • Published Mar 3 • 14
hamishivi/tulu-2-multitask-rrmax-326k-sft

7B • Updated Mar 4 • 5
hamishivi/rds-sels-multitask-rrmax-top326k

Viewer • Updated Mar 4 • 326k • 75 • 1
hamishivi/llama-3.1-tulu-3-multitask-rrmax-939k-sft

Updated Mar 4 • 8

Tulu 2 Llama 3 Update

Llama 3 models trained on the tulu dataset, following https://arxiv.org/abs/2311.10702 (tulu 2) and https://arxiv.org/abs/2406.09279 (tulu 2.5).

allenai/llama-3.1-tulu-2-dpo-70b

71B • Updated Aug 15, 2024 • 57
allenai/llama-3.1-tulu-2-70b

71B • Updated Aug 15, 2024 • 61
allenai/llama-3.1-tulu-2-70b-uf-mean-rm

70B • Updated Aug 15, 2024 • 30
allenai/llama-3.1-tulu-2-dpo-8b

8B • Updated Aug 15, 2024 • 65 • 2

Tulu V2 Suite

The set of models associated with the Tulu V2 technical report.

allenai/tulu-2-dpo-70b

Text Generation • 69B • Updated Jan 31, 2024 • 2.35k • 157
allenai/tulu-2-dpo-13b

Text Generation • 13B • Updated May 17, 2024 • 2.2k • 20
allenai/tulu-2-dpo-7b

Text Generation • Updated May 14, 2024 • 4.46k • 20
allenai/tulu-2-70b

Text Generation • Updated Apr 19, 2024 • 63 • 8

LM Preference Datasets

lmsys/chatbot_arena_conversations

Viewer • Updated Sep 30, 2023 • 33k • 1.75k • 424
Anthropic/hh-rlhf

Viewer • Updated May 26, 2023 • 169k • 26.2k • 1.5k
openai/summarize_from_feedback

Viewer • Updated Jan 3, 2023 • 194k • 1.32k • 215
openai/webgpt_comparisons

Viewer • Updated Dec 19, 2022 • 19.6k • 9.04k • 237
