Efficient training of language models to fill in the middle

OpenAI · July 28, 2022 · Publication

Abstract

We show that autoregressive language models can learn to infill text after we apply a straightforward transformation to the dataset, which simply moves a span of text from the middle of a document to its end. While this data augmentation has garnered much interest in recent years, we provide extensive evidence that training models with a large fraction of data transformed in this way does not harm the original left-to-right generative capability, as measured by perplexity and sampling evaluations across a wide range of scales. Given the usefulness, simplicity, and efficiency of training models to fill-in-the-middle (FIM), we suggest that future autoregressive language models be trained with FIM by default. To this end, we run a series of ablations on key hyperparameters, such as the data transformation frequency, the structure of the transformation, and the method of selecting the infill span. We use these ablations to prescribe strong default settings and best practices to train FIM models. We have released our best infilling model trained with best practices in our API, and release our infilling benchmarks to aid future research.
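The transformation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The sentinel strings `<PRE>`, `<SUF>`, and `<MID>`, the function names, the `fim_rate` parameter with its default of 0.5, and the uniform character-level choice of cut points are all assumptions made here for illustration. The abstract only states that a span is moved from the middle of a document to its end, and that the transformation frequency and span selection method are hyperparameters.

```python
import random

# Hypothetical sentinel strings (an assumption); a real tokenizer would
# reserve dedicated special tokens for these boundaries.
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"


def fim_transform(doc: str, rng: random.Random, fim_rate: float = 0.5) -> str:
    """With probability fim_rate, move a random middle span of doc to its end."""
    if rng.random() >= fim_rate:
        return doc  # left unchanged: an ordinary left-to-right example
    # Two cut points drawn uniformly over character positions
    # (an assumed span-selection rule, chosen for simplicity).
    lo, hi = sorted(rng.randint(0, len(doc)) for _ in range(2))
    prefix, middle, suffix = doc[:lo], doc[lo:hi], doc[hi:]
    # The middle span now comes last, so a left-to-right model learns to
    # generate it conditioned on both the prefix and the suffix.
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"


def fim_restore(example: str) -> str:
    """Undo fim_transform (for checking only).

    Assumes the original document does not itself contain the sentinels.
    """
    if not example.startswith(PRE):
        return example  # example was never transformed
    prefix, rest = example[len(PRE):].split(SUF, 1)
    suffix, middle = rest.split(MID, 1)
    return prefix + middle + suffix


if __name__ == "__main__":
    rng = random.Random(0)
    doc = "def add(a, b):\n    return a + b\n"
    example = fim_transform(doc, rng, fim_rate=1.0)
    print(repr(example))
    print(fim_restore(example) == doc)
```

Under these assumptions, infilling at inference time would amount to prompting a trained model with the prefix and suffix in the same layout, ending at the `<MID>` sentinel, and letting it generate the missing span.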

  • GPT
  • Language
  • Learning Paradigms

Source

OpenAI News - openai.com
