Community Archive

François Chollet @fchollet · about 2 years ago
Replying to @fchollet

Ok, by popular demand: a starter set of papers you can read on the topic.

"Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks": https://t.co/nhDrr94sgK

"Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve": https://t.co/vj1AZoZUBi

"Faith and Fate: Limits of Transformers on Compositionality": https://t.co/TQ2yyBFxUW

"The Reversal Curse: LLMs trained on 'A is B' fail to learn 'B is A'": https://t.co/m2hXF5HDri

"On the measure of intelligence": https://t.co/RjYH7Z3pmJ — not about LLMs, but provides context and grounding on what it means to be intelligent and the nature of generalization. It also introduces an intelligence benchmark (ARC) that remains completely out of reach for LLMs. Ironically, the best-performing LLM-based systems on ARC are those that have been trained on tons of generated tasks, hoping to hit some overlap between the test set tasks and the generated tasks -- LLMs have zero ability to tackle an actually new task.

In general, there's a new paper documenting the lack of broad generalization capabilities of LLMs every few days.

1.1K 176
12/16/2023