Abstract
In this paper, we use toy models — small ReLU networks trained on synthetic data with sparse input features — to investigate how and when models represent more features than they have dimensions. We call this phenomenon superposition. When features are sparse, superposition allows compression beyond what a linear model would do, at the cost of "interference" that requires nonlinear filtering.
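The compression-with-interference trade-off described above can be sketched in a minimal NumPy example. This is a hypothetical, hand-picked instance of the toy setup (not trained weights): two features are packed into one hidden dimension as an antipodal pair, and the ReLU performs the nonlinear filtering that removes negative interference when inputs are sparse.

```python
import numpy as np

# Toy ReLU autoencoder: h = W x, reconstruction x' = ReLU(W^T h + b).
# Hand-picked weights (illustrative, not trained): n=2 features compressed
# into m=1 hidden dimension as an antipodal pair.
W = np.array([[1.0, -1.0]])  # shape (m=1, n=2)
b = np.zeros(2)

def reconstruct(x):
    h = W @ x                            # compress: 2 features -> 1 dimension
    return np.maximum(0.0, W.T @ h + b)  # ReLU filters negative interference

# Sparse inputs (one feature active at a time): reconstruction is exact.
print(reconstruct(np.array([1.0, 0.0])))  # [1. 0.]
print(reconstruct(np.array([0.0, 1.0])))  # [0. 1.]
# Dense input (both features active): they interfere and cancel.
print(reconstruct(np.array([1.0, 1.0])))  # [0. 0.]
```

When each feature fires alone, its negative projection onto the other feature's direction is clipped by the ReLU, so both sparse inputs reconstruct perfectly; only when both features co-occur does the interference surface as error.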