- Moral Foundations of Large Language Models
  Paper • 2310.15337 • Published • 1
- Specific versus General Principles for Constitutional AI
  Paper • 2310.13798 • Published • 3
- Contrastive Preference Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 25
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 51
Collections including paper arxiv:2401.01335
- InGram: Inductive Knowledge Graph Embedding via Relation Graphs
  Paper • 2305.19987 • Published • 2
- Curating Grounded Synthetic Data with Global Perspectives for Equitable AI
  Paper • 2406.10258 • Published • 1
- Peregrine: A Pattern-Aware Graph Mining System
  Paper • 2004.02369 • Published • 1
- OFFER: A Motif Dimensional Framework for Network Representation Learning
  Paper • 2008.12010 • Published • 1
- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 18
- SELF: Language-Driven Self-Evolution for Large Language Model
  Paper • 2310.00533 • Published • 2
- QLoRA: Efficient Finetuning of Quantized LLMs
  Paper • 2305.14314 • Published • 54
- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 44
- A Critical Evaluation of AI Feedback for Aligning Large Language Models
  Paper • 2402.12366 • Published • 3
- Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
  Paper • 2401.08417 • Published • 37
- Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks
  Paper • 2404.14723 • Published • 10
- Self-Play Preference Optimization for Language Model Alignment
  Paper • 2405.00675 • Published • 28
- Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
  Paper • 2406.08464 • Published • 70
- Scaling Synthetic Data Creation with 1,000,000,000 Personas
  Paper • 2406.20094 • Published • 102
- argilla/magpie-ultra-v1.0
  Viewer • Updated • 3.22M • 1.33k • 47
- simplescaling/s1K-1.1
  Viewer • Updated • 1k • 8.06k • 126
- Textbooks Are All You Need
  Paper • 2306.11644 • Published • 145
- Textbooks Are All You Need II: phi-1.5 technical report
  Paper • 2309.05463 • Published • 87
- TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
  Paper • 2305.07759 • Published • 36
- Scaling Synthetic Data Creation with 1,000,000,000 Personas
  Paper • 2406.20094 • Published • 102
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 110
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 43
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 23
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 39