Reflections on Qualitative Research

Mar 8, 2024
Read Transformer Circuits

This note offers some opinionated thoughts on why qualitative aspects may be more central to interpretability research than we're used to in other fields. It also aims to describe some heuristics for research taste in qualitative work.
