Papers
arxiv:2503.23714

Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models

Published on Mar 31
Authors:
,
,
,
,
,
,
,
,
,
,
,
,
,

Abstract

Human-originated instruction-tuning datasets, paired with LLM-generated responses, outperform datasets generated solely from LLMs, enhancing performance across languages.

AI-generated summary

Instruction tuning is crucial for enabling Large Language Models (LLMs) to solve real-world tasks. Prior work has shown the effectiveness of instruction-tuning data synthesized solely from LLMs, raising a fundamental question: Do we still need human-originated signals for instruction tuning? This work answers the question affirmatively: we build state-of-the-art instruction-tuning datasets sourced from human-written instructions, by simply pairing them with LLM-generated responses. LLMs fine-tuned on our datasets consistently outperform those fine-tuned on existing ones. Our data construction approach can be easily adapted to other languages; we build datasets for Japanese and confirm that LLMs tuned with our data reach state-of-the-art performance. Analyses suggest that instruction-tuning in a new language allows LLMs to follow instructions, while the tuned models exhibit a notable lack of culture-specific knowledge in that language. The datasets and fine-tuned models will be publicly available. Our datasets, synthesized with open-weight LLMs, are openly distributed under permissive licenses, allowing for diverse use cases.

Community

Sign up or log in to comment

Models citing this paper 10

Browse 10 models citing this paper

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2503.23714 in a dataset README.md to link it from this page.

Spaces citing this paper 11

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.