Nomination for Best Paper Award at CoNLL’25

We’re excited to share our latest work, "Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review," recently published at CoNLL 2025 and nominated for the Best Paper Award.

Training costs for LLMs have increased by a staggering 750x every two years over the past decade, making scalable pretraining increasingly unsustainable. In response, we introduce Learn-Focus-Review (LFR): a dynamic, human-inspired training paradigm that adapts to the model’s learning progress by prioritizing harder and frequently forgotten data, much like spaced repetition in human learning.

Unlike static data filtering or proxy-based selection, LFR directly monitors model performance during training, enabling focused learning and efficient review without relying on external models or redundant data curation.
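To give a flavor of what loss-guided focus and review could look like mechanically, here is a minimal PyTorch-style sketch. It is an illustration under our own assumptions (the function name, the review buffer, and the review_every / top_frac parameters are hypothetical), not the implementation from the paper: it records per-sample loss during training, keeps the hardest samples, and periodically replays them, spaced-repetition style.

```python
import torch

# Minimal, illustrative sketch only. The function name, review buffer, and the
# review_every / top_frac parameters are assumptions made for exposition; the
# paper's actual LFR schedule and data bookkeeping may differ.
def lfr_style_pass(model, optimizer, dataloader, review_buffer,
                   review_every=4, top_frac=0.25):
    """Learn on fresh batches, record per-sample loss to find hard or
    frequently forgotten samples, and periodically review them."""
    loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
    for step, (inputs, targets) in enumerate(dataloader):
        # Learn: standard next-token prediction step on the incoming batch.
        logits = model(inputs)                                   # (B, T, V)
        per_token = loss_fn(logits.view(-1, logits.size(-1)),
                            targets.view(-1))
        per_sample = per_token.view(targets.size(0), -1).mean(dim=1)
        per_sample.mean().backward()
        optimizer.step()
        optimizer.zero_grad()

        # Focus: keep the samples the model currently finds hardest.
        k = max(1, int(top_frac * per_sample.numel()))
        hard_idx = per_sample.topk(k).indices
        review_buffer.extend(
            (inputs[i].detach(), targets[i].detach()) for i in hard_idx)

        # Review: periodically replay a buffered hard sample.
        if review_buffer and step % review_every == 0:
            r_in, r_tgt = review_buffer.pop(0)
            r_logits = model(r_in.unsqueeze(0))
            r_loss = loss_fn(r_logits.view(-1, r_logits.size(-1)),
                             r_tgt.view(-1)).mean()
            r_loss.backward()
            optimizer.step()
            optimizer.zero_grad()
```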

We pretrain LLaMA models from scratch on web-scale datasets and show that LFR-trained models consistently outperform state-of-the-art baselines, including models with up to 2x more parameters, across diverse downstream tasks, while using as little as 3.2% of the training tokens required by conventional approaches.

You can find the publication here: https://aclanthology.org/2025.conll-1.18/. We look forward to your feedback!