Data Selection for Language Models via Importance Resampling (arXiv:2302.03169, published Feb 6, 2023)
An Explanation of In-context Learning as Implicit Bayesian Inference (arXiv:2111.02080, published Nov 3, 2021)
Connect, Not Collapse: Explaining Contrastive Learning for Unsupervised Domain Adaptation (arXiv:2204.00570, published Apr 1, 2022)
Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning (arXiv:2106.09226, published Jun 17, 2021)
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models (arXiv:2210.14199, published Oct 25, 2022)
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining (arXiv:2305.10429, published May 17, 2023)