Community Archive

Nathan Lambert @natolambert · 10 months ago

As the only AI lab that can share 100% of the details of training language models, at @allen_ai we're really kind of obligated to share more on how it works (and what doesn't). Here's a reflection with @mechanicaldirk @kylelostat & @soldni on OLMo 2 and what comes next!

00:00:00 Introduction
00:02:45 Early history of the OLMo project
00:15:27 The journey to stability
00:25:00 The evolving role of OLMo and pretraining research
00:29:00 Pretraining Q&A (µP, scaling laws, MoE, etc.)
00:40:40 How to think about pretraining data work
00:54:30 Role of pre-training vs mid-training vs post-training
01:02:19 Release strategy and wrapping up

Links below.

[Tweet image 1]
1/22/2025