Prioritising research on risks posed by AI
Anthropic makes its model Claude available to researchers focused on safety and societal impacts through a special research access program, and has done so in select cases since 2022 – before Claude was commercially available – in line with Anthropic’s public benefit mission. Our July 2023 blog post, Frontier Threats Red Teaming for AI Safety, describes in greater detail a core area of research for Anthropic at the intersection of societal, safety, and security risks.

Our Alignment, Assurance, Interpretability, Security, Societal Impacts, and Trust & Safety teams are among those driving internal research in these domains. Specialists outside those six teams also lead core elements of this work – for example, the work on Frontier Threats Red Teaming is led by a Member of Technical Staff (and former UK Government official) who is launching a new research team to accelerate Anthropic’s efforts in this area.