Collections

Collections including paper arxiv:2404.01954

Papers - Multilingual - Benchmarks

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 26

ByT5: Towards a token-free future with pre-trained byte-to-byte models

Paper • 2105.13626 • Published May 28, 2021 • 3

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 107

Long-context LLMs Struggle with Long In-context Learning

Paper • 2404.02060 • Published Apr 2, 2024 • 38

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 26

Papers - Text - Supervised Fine-tuning

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 26

Papers - Pre-training - In-filling - PSM and SPM ordering

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 26

Papers - Reward Model - Bradley-Terry

https://web.stanford.edu/class/archive/stats/stats200/stats200.1172/Lecture24.pdf

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 62

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 26

Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

Paper • 2404.09956 • Published Apr 15, 2024 • 12

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 88

Papers - Fine-tuning - PPO

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 26

UltraFeedback: Boosting Language Models with High-quality Feedback

Paper • 2310.01377 • Published Oct 2, 2023 • 5

AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

Paper • 2305.14387 • Published May 22, 2023 • 1

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 125

Papers - Text - Supervised Fine-tuning - Batch Grouping

Sequences are grouped into batches by similar token length to reduce padding and make better use of the GPU. Mini-batches can differ in size, but every mini-batch has the same maximum number of tokens.

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 26
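
The batch grouping note above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure from the paper: the function name `group_by_length`, the `max_tokens` parameter, and the choice to measure a batch by its padded size (longest sequence times number of sequences) are all hypothetical.

```python
from typing import List, Sequence


def group_by_length(lengths: Sequence[int], max_tokens: int) -> List[List[int]]:
    """Group example indices so each padded batch stays within max_tokens.

    Assumption (not from the source): the cost of a batch is its padded size,
    i.e. the longest sequence length times the number of sequences.
    """
    # Visit examples from shortest to longest so neighbours have similar length.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batches: List[List[int]] = []
    current: List[int] = []
    for idx in order:
        # Lengths are ascending, so the new example is the longest in the batch.
        padded_cost = lengths[idx] * (len(current) + 1)
        if current and padded_cost > max_tokens:
            batches.append(current)
            current = []
        # An example longer than max_tokens still gets a batch of its own.
        current.append(idx)
    if current:
        batches.append(current)
    return batches


lengths = [10, 100, 12, 95, 11, 98]
batches = group_by_length(lengths, max_tokens=200)
print(batches)  # [[0, 4, 2], [3, 5], [1]]
```

The short sequences end up together in a batch of three, while the long ones form smaller batches, so the number of sequences per batch varies but the token budget per batch does not.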

Papers - Pre-training - Dynamic Context Length

For HyperCLOVA X, pre-training is split by context length: 90% at 4,096 tokens and 10% at 32k tokens.

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 26
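
To make the 90/10 split above concrete, the following sketch works out how many training sequences each stage would hold for a given token budget. Several things here are assumptions and not taken from the note: that the split is measured in tokens, that "32k" means 32,768, and the names `sequences_per_stage` and `total_tokens`, which are hypothetical.

```python
from typing import List, Tuple


def sequences_per_stage(
    total_tokens: int, stages: List[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Return (context_length, num_sequences) for each stage.

    Each stage is (context_length, percent_of_tokens). Assumption (not from
    the source): the split is measured in tokens rather than steps.
    """
    result = []
    for context_length, percent in stages:
        # Integer arithmetic avoids floating-point rounding in the split.
        stage_tokens = total_tokens * percent // 100
        result.append((context_length, stage_tokens // context_length))
    return result


# Assumed reading of the note: 90% at 4,096 tokens, 10% at 32,768 tokens.
stages = [(4096, 90), (32768, 10)]
plan = sequences_per_stage(total_tokens=32_768_000, stages=stages)
print(plan)  # [(4096, 7200), (32768, 100)]
```

Under these assumptions the long-context stage sees far fewer sequences than the short-context stage, because each of its sequences is eight times longer and it receives a ninth of the tokens.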

Model Stock: All we need is just a few fine-tuned models

Paper • 2403.19522 • Published Mar 28, 2024 • 12

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 26

Instruction Tuning with Human Curriculum

Paper • 2310.09518 • Published Oct 14, 2023 • 3

Technical Report

Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7, 2024 • 66

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5, 2024 • 49

Qwen Technical Report

Paper • 2309.16609 • Published Sep 28, 2023 • 36

Gemma: Open Models Based on Gemini Research and Technology

Paper • 2403.08295 • Published Mar 13, 2024 • 50
