Publications

Forging a New AGI Social Contract

Deric Cheng

Apr 14, 2025

Training Data Attribution (TDA): Examining Its Adoption & Use Cases

Deric Cheng, Juhan Bae, Justin Bullock, David Kristoffersson

Jul 4, 2024

AI Emergency Preparedness

Akash Wasil, Everett Smith, Corin Katzke, Justin Bullock

Jul 11, 2024

AI, Global Governance, and Digital Sovereignty

Swati Srivastava, Justin Bullock

Oct 23, 2024

Aligning AI Safety Projects with a Republican Administration

Deric Cheng

Nov 21, 2024

AI Model Registries: A Foundational Tool for AI Governance

Elliot McKernon, Gwyn Glasser, Deric Cheng, Gillian Hadfield

Oct 4, 2024

Soft Nationalization: How the US Government Will Control AI Labs

Deric Cheng, Corin Katzke

Aug 28, 2024

2024 State of the AI Regulatory Landscape

Deric Cheng, Elliot McKernon

May 27, 2024

Threshold 2030: Modeling AI Economic Futures

Deric Cheng, Elliot McKernon, Deger Turan, Yashvardhan Sharma, Alex Foster, Justin Bullock

Feb 24, 2025

Pathways to short TAI timelines

Zershaaneh Qureshi

Feb 20, 2025

The Manhattan Trap: Why a Race to Artificial Superintelligence is Self-Defeating

Corin Katzke, Gideon Futerman

Jan 17, 2025

Analysis of Global AI Governance Strategies

Sammy Martin, Justin Bullock, Corin Katzke

Dec 4, 2024

AI governance needs a theory of victory

Corin Katzke, Justin Bullock

Jun 21, 2024

Investigating the role of agency in AI x-risk

Corin Katzke

Apr 13, 2024

Scenario Research

Governance Research

Advocacy & Education

Existential Risk Strategy

Information Hazards & Downside Risks

AI Safety

Scenario Research

A research program by Convergence that explores potential scenarios and evaluates strategies for controlling the trajectory of AI.

Conference Report

Threshold 2030: Modeling AI Economic Futures

Threshold 2030 was a two-day conference held on October 30-31, 2024, in Boston, Massachusetts. It brought together 30 leading economists, AI policy experts, and professional forecasters to rapidly evaluate the economic impacts of frontier AI technologies by 2030.

Deric Cheng, Elliot McKernon, Deger Turan, Yashvardhan Sharma, Alex Foster, Justin Bullock

Feb 24, 2025

250 minute read

Research Agenda

AI Clarity: An Initial Research Agenda

AI Clarity’s research method centers on scenario planning. Scenario planning is an analytical tool used by policymakers, strategists, and academics to explore and prepare for the landscape of possible outcomes in domains defined by uncertainty.

Justin Bullock, Corin Katzke, Zershaaneh Qureshi, David Kristoffersson

Apr 16, 2024

A Taxonomy of Jobs Deeply Resistant to TAI Automation

This is a light, informal taxonomy of jobs that conceptually will display significant resistance to automation, even by transformative AI systems that can outperform humans in all forms of cognitive and physical labor.

Deric Cheng

Mar 18, 2025

Pathways to short TAI timelines

This report explores pathways through which transformative AI (TAI) could be developed within the next ten years (‘short TAI timelines’). It examines compute scaling and recursive improvement as key mechanisms for AI capabilities progress, describes seven distinct scenarios with short TAI timelines, and ultimately concludes that such timelines are plausible.

Zershaaneh Qureshi

Feb 20, 2025

256 minute read

The Manhattan Trap: Why a Race to Artificial Superintelligence is Self-Defeating

This paper examines the strategic dynamics of international competition to develop Artificial Superintelligence (ASI).

Corin Katzke, Gideon Futerman

Jan 17, 2025

Scenario Planning

Analysis of Global AI Governance Strategies

We analyze three prominent strategies for governing transformative AI (TAI) development: Cooperative Development, Strategic Advantage, and Global Moratorium. We evaluate these strategies across varying levels of alignment difficulty and development timelines, examining their effectiveness in preventing catastrophic risks while preserving beneficial AI development.

Sammy Martin, Justin Bullock, Corin Katzke

Dec 4, 2024

Scenario Planning

AI governance needs a theory of victory

A theory of victory for AI governance combines an endgame with a plausible and prescriptive strategy to achieve it. It should also be robust across a range of future scenarios, given uncertainty about key strategic parameters.

Corin Katzke, Justin Bullock

Jun 21, 2024

Scenario Planning

Investigating the role of agency in AI x-risk

In this post, I examine the nature and role of agency in AI existential risk. As a framework, I use Joseph Carlsmith's power-seeking threat model. It illustrates how agentic AI systems might seek power in unintended ways, leading to existential catastrophe.

Corin Katzke

Apr 13, 2024

Scenario Planning

Timelines to Transformative AI: an investigation

The timeline for the arrival of advanced AI is a key consideration for AI safety and governance. It is a critical determinant of the threat models we are likely to face, the magnitude of those threats, and the appropriate strategies for mitigating them.

Zershaaneh Qureshi

Mar 26, 2024

Scenario Planning

Transformative AI and Scenario Planning for AI X-risk

We argue that “Transformative AI” (TAI) is a useful key milestone to consider for AI scenario analysis; it places the focus on the socio-technical impact of AI and is both widely used and well-defined within the existing AI literature.

Dr. Elliot McKernon, Dr. Justin Bullock

Mar 22, 2024

Scenario Planning

Scenario planning for AI x-risk

In this post, I’ll motivate and review some methods for applying scenario planning to AI x-risk strategy.

Corin Katzke

Feb 10, 2024

Governance Research

A research program by Convergence that evaluates critical and neglected policy recommendations in AI governance.

Training Data Attribution (TDA): Examining Its Adoption & Use Cases

This report investigates Training Data Attribution (TDA) and its potential importance to and tractability for reducing extreme risks from AI. TDA techniques aim to identify training data points that are especially influential on the behavior of specific model outputs.

Deric Cheng, Juhan Bae, Justin Bullock, David Kristoffersson

Jul 4, 2024

AI Governance

AI Model Registries: A Foundational Tool for AI Governance

This report proposes the implementation of national registries for frontier AI models as a foundational tool for AI governance. It explores the rationale, design, implementation, and enforcement of such registries, with recommendations on each.

Elliot McKernon, Gwyn Glasser, Deric Cheng, Gillian Hadfield

Oct 4, 2024

AI Governance

Soft Nationalization: How the US Government Will Control AI Labs

We are conducting scenario modeling and governance research to describe how upcoming national security concerns will lead to greater public control over frontier AI development.

Deric Cheng, Corin Katzke

Aug 28, 2024

Regulation Overview

2024 State of the AI Regulatory Landscape

This report provides a comprehensive overview of the current state of AI regulation as of May 2024, focusing on the approaches taken by the United States, China, and the European Union.

Deric Cheng, Elliot McKernon

May 27, 2024

Forging a New AGI Social Contract

This is the introductory piece for a series of essays written by AI economists, policy researchers, and political thinkers on the topic of a new AGI Social Contract.

Deric Cheng

Apr 14, 2025

AI Governance

AI Emergency Preparedness

We examine how the federal government can enhance its AI emergency preparedness: the ability to detect and prepare for time-sensitive national security threats relating to AI.

Akash Wasil, Everett Smith, Corin Katzke, Justin Bullock

Jul 11, 2024

AI Governance

AI, Global Governance, and Digital Sovereignty

This essay examines how Artificial Intelligence (AI) systems are becoming more integral to international affairs by affecting how global governors exert power and pursue digital sovereignty.

Swati Srivastava, Justin Bullock

Oct 23, 2024

AI Governance

The brave new world of AI: Implications for public sector agents, organisations, and governance

Justin Bullock, Yu-Che Chen

May 27, 2024

AI Governance

Aligning AI Safety Projects with a Republican Administration

The upcoming shift from Biden’s administration to a Republican one is likely to necessitate some changes in strategy and framing for AI safety initiatives.

Deric Cheng

Nov 21, 2024

AI Governance

Evaluating an AI Chip Registration Policy

This report evaluates the feasibility and potential impacts of a US policy requiring the registration and tracking of high-end AI chips.

Deric Cheng

Apr 8, 2024

Regulation Summary

A brief review of China's AI industry and regulations

China has enacted three sets of AI regulations since 2021. I haven’t seen a concise breakdown of their content in one place, and I’ve been researching the legislation for a governance project at Convergence Analysis, so here is my summary of what I found. I’ll close each section by quoting some expert opinions on the legislation.

Elliot McKernon

Mar 14, 2024

Articles & Publications

The Oxford Handbook of AI Governance

The Oxford Handbook of AI Governance brings together a series of experts from a wide set of disciplines, areas of study, and cultural backgrounds to provide a global perspective on AI governance.

Dr. Justin Bullock

Feb 14, 2024

450 minute read

Risk Assessments

Frontier AI regulation: Managing emerging risks to public safety

In this paper, we focus on what we term "frontier AI" models: highly capable foundation models that could possess dangerous capabilities sufficient to pose severe risks to public safety.

Dr. Justin Bullock

Nov 7, 2023

Public AI Governance

Artificial intelligence and administrative evil

Artificial intelligence (AI) offers challenges and benefits to the public sector. We present an ethical framework to analyze the effects of AI in public organizations, guide empirical and theoretical research in public administration, and inform practitioner deliberation and decision making on AI adoption.

Dr. Justin Bullock

Apr 9, 2021

Public AI Governance

Artificial intelligence, bureaucratic form, and discretion in public service

This article examines the relationship between Artificial Intelligence (AI), discretion, and bureaucratic form in public organizations.

Dr. Justin Bullock

Dec 4, 2020

Governance Research

Artificial intelligence, discretion, and bureaucracy

This essay highlights the increasing use of artificial intelligence (AI) in governance and society and explores the relationship between AI, discretion, and bureaucracy.

Dr. Justin Bullock

Jun 18, 2019

Advocacy & Education

As one of our key pillars of output, we’re extremely focused on conducting effective public education and advocacy to broaden the awareness of AI risks and propose realistic recommendations to mitigate them.

Humanist Perspectives

AI’s Moment is Here. Let’s Get It Right

The last year has been a transformative year for AI. ChatGPT was released November 30th, 2022 and sparked a media firestorm. By February of this year, a little over 2 months after launch, ChatGPT had 100 million active users.

Dr. Justin Bullock

Nov 7, 2023

Humanist Perspectives

Machine Evolution

In this paper, we will examine several stages of various processes of evolution: Natural Selection, Artificial Selection, and Technological Selection.

Dr. Justin Bullock, Dr. Christopher DiCarlo, Dr. Elliot McKernon

Nov 7, 2023

Humanist Perspectives

AI 101 and the Future of Humanity

It has become increasingly apparent that AI technologies have advanced quite rapidly in the last several years. In fact, it has happened so rapidly, my colleagues and I have been forced to prioritize our research to focus primarily upon the risks and governance of AI as we move into an uncertain future.

Dr. Christopher DiCarlo

Nov 5, 2023

Humanist Perspectives

The Danger from AI is Real, But We Can Make it Safe

Dr. Elliot McKernon

Jul 11, 2023

Articles & Publications

How to Avoid a Robotic Apocalypse

"How to Avoid a Robotic Apocalypse" delves into the ethical and safety considerations surrounding advanced artificial intelligence. The paper examines the potential for emergent consciousness in AI systems and draws parallels to the Frankenstein myth, where a creator's rejection of their creation leads to disaster.

Dr. Christopher DiCarlo

Dec 19, 2016

Existential Risk Strategy Research

Given a goal of mitigating existential risk from AI development, what is the best approach to achieving this? Strategy research emphasizes impartially evaluating possible high-level courses of conduct considering factors like resource allocation, domain of focus, or means of implementation.

Holistic Strategy Research

Crucial questions for longtermists

The last decade saw substantial growth in the amount of attention, talent, and funding flowing towards existential risk reduction and longtermism. How can we direct these resources in the best way? Why were these resources directed as they were? Are people able to understand and critique the beliefs underlying various views - including their own - regarding how best to put longtermism into practice?

Michael Aird

Jul 29, 2020

Risk Modeling

Causal diagrams of the paths to existential catastrophe

In this post, I present causal diagrams capturing some key paths to existential catastrophe, and key types of interventions to prevent such catastrophes.

Michael Aird, Justin Shovelain, David Kristoffersson

Mar 1, 2020

Risk Modeling

State Space of X-Risk Trajectories

Currently, people tend to use many key concepts informally when reasoning and forming strategies and policies for existential risks (x-risks). We construct a common state space for futures, trajectories, and interventions, and show how these interact.

David Kristoffersson, Justin Shovelain

Feb 6, 2020

Holistic Strategy Research

A case for strategy research: what it is and why we need more of it

To understand how to best allocate our time and resources, we need to clarify what our options in research are. In this article, we describe strategy research and relate it to values research, tactics research, informing research, and improvement research.

Siebe Rozendal, Justin Shovelain, David Kristoffersson

Jun 20, 2019

Strategy Research Tooling

Four components of strategy research

In this post, we describe and outline ways to decompose strategy research. Specifically, we break it down into the following four components: mapping the space, constructing strategies, modelling causality, and prioritizing between strategies.

Michael Aird, Justin Shovelain, David Kristoffersson, Siebe Rozendal

Jan 31, 2021

Strategy Research Tooling

Evaluating expertise: a clear box model

To get what we value we must make good decisions. To make these decisions we must know what relevant facts are true. But the world is so complex that we cannot check everything directly ourselves and so must defer to topic “experts” for some things. How should we choose these experts and how much should we believe what they tell us? In this document, I’ll describe a way to evaluate experts.

Justin Shovelain

Oct 15, 2020

Holistic Strategy Research

Crucial questions about optimal timing of work and donations

This post gives an overview of the crucial questions that we (Convergence) believe do or should influence different longtermists’ views and choices regarding the best timing of work and donations.

Michael Aird

Aug 14, 2020

Value Modeling

Moral circles: Degrees, dimensions, visuals

A person’s “moral circle” classically refers to the entities that that person perceives as having moral standing, or as being worthy of moral concern. And “moral circle expansion” classically refers to moral circles moving “outwards”, for example from kin to people of other races to nonhuman animals, such that “more distant” entities are now in one’s “circle of concern”.

Michael Aird

Jul 24, 2020

minute read

Strategy Research Tooling

Improving the future by influencing actors' benevolence, intelligence, and power

This post argues that one useful way to come up with, and assess the expected value of, actions to improve the long-term future is to consider how “benevolent”, “intelligent”, and “powerful” various actors are, and how various actions could affect those actors’ benevolence, intelligence, and power.

Michael Aird, Justin Shovelain

Jul 20, 2020

minute read

Value Modeling

Value uncertainty

We are often forced to make decisions under conditions of uncertainty. This may be empirical uncertainty, or it may be moral uncertainty. But what if you don’t believe that “morally important” is a coherent concept?

Michael Aird

Jun 30, 2020

minute read

Mitigating Existential Risks

Differential progress / intellectual progress / technological development

In 2002, Nick Bostrom introduced the concept of differential technological development. This concept is highly relevant to many efforts to do good in the world (particularly, but not only, from the perspective of reducing existential risks). Other writers have generalised Bostrom’s concept, using the terms “differential intellectual progress” or just “differential progress”. I think these generalisations are actually even more useful.

Michael Aird

Apr 24, 2020

minute read

Risk Modeling

Clarifying existential risks and existential catastrophes

Existential risks are considered by many to be among the most pressing issues of our time (see e.g. The Precipice). But what, precisely, do we mean by “existential risks”?

Michael Aird

Apr 24, 2020

minute read

Risk Forecasting

Database of existential risk estimates

This post provides a spreadsheet you can use for making your own estimates of existential risks (or of similarly “extreme” outcomes), and announces a database of estimates of existential risk (or similarly extreme outcomes), which I hope can be collaboratively expanded and updated. Finally, it discusses why I think this database may be valuable, along with some pros and cons of using or making such estimates.

Michael Aird

Apr 15, 2020

minute read

Risk Forecasting

Some thoughts on Toby Ord’s existential risk estimates

Toby Ord’s The Precipice is an ambitious and excellent book. Among many other things, Ord attempts to survey the entire landscape of existential risks humanity faces. In this post, I will discuss whether Ord may understate the uncertainty of his estimates, and the ambiguity about what he’s actually estimating when he estimates the risk from “unaligned AI”.

Michael Aird

Apr 7, 2020

minute read

Holistic Strategy Research

The ‘far future’ is not just the far future

It’s a widely held belief in the existential risk reduction community that we are likely to see a great technological transformation in the next 50 years. A technological transformation will either cause flourishing, existential catastrophe, or other forms of large change for humanity.

David Kristoffersson

Jan 16, 2020

minute read

Strategy Research Tooling

Moloch and the Pareto optimal frontier

Moloch is a poetic way of describing failures of coordination and coherence inside an agent or between agents and the generation of harmful subcomponents or harmful agents. Perhaps this could be decomposed further, or at least partially covered, by randomly generated accidents, Goodhart’s law failures, and conflicts of optimization.

Justin Shovelain

Jan 14, 2020

minute read

Information Hazards & Downside Risks

How can we understand the risks of disseminating crucial information, such as the blueprint for thermonuclear weapons or the genetic sequence of a lethal pathogen? What are the potential negative externalities of well-intentioned actions? We’ve explored these topics in depth at Convergence.

Information Hazards

Information hazards: Why you should care and what you can do

We argue that many people should consider the risk that they could cause harm by developing or sharing (true) information. We think that harm from such information hazards may sometimes be very substantial, and that this applies especially to people who research advanced technologies and/or catastrophic risks, or who often think about such technologies and risks.

Michael Aird, Justin Shovelain, David Kristoffersson, algekalipso

Feb 24, 2020

minute read

Information Hazards

Mapping downside risks and information hazards

Many altruistic actions have downside risks; they might turn out to have negative effects, or to even be negative overall. Perhaps, for example, the biosecurity research you might do could pose information hazards, the article you might write could pose memetic downside risks, or that project you might start could divert resources (such as money or attention) from more valuable things.

Michael Aird, Justin Shovelain, David Kristoffersson

Feb 20, 2020

minute read

Downside Risks

Good and bad ways to think about downside risks

Many actions we could take to make the world better might also have negative effects, or might even be negative overall. In other words, altruistic actions often have downside risks.

Michael Aird, Justin Shovelain

Jun 11, 2020

minute read

Downside Risks

Memetic downside risks: How ideas can evolve and cause harm

We introduce the concept of memetic downside risks (MDR): risks of unintended negative effects that arise from how ideas “evolve” over time (as a result of replication, mutation, and selection). We discuss how this concept relates to the existing concepts of memetics, downside risks, and information hazards.

Michael Aird, Justin Shovelain, algekalipso

Feb 26, 2020

minute read

Information Hazards

What are information hazards?

The concept of information hazards is highly relevant to many efforts to do good in the world (particularly, but not only, from the perspective of reducing existential risks). I’m thus glad that many effective altruists and rationalists seem to know of, and refer to, this concept. However, it also seems that

Michael Aird

Feb 19, 2020

minute read

AI Safety

What are the most effective strategies for achieving safe and aligned AI in the near future? We evaluate different techniques to measure and improve AI safety such as value modeling, AI alignment techniques, and governance strategies.

Technical AI Safety

Information-Theoretic Boxing of Superintelligences

Boxing an agent more intelligent than ourselves is daunting, but information theory, thermodynamics, and control theory provide us with tools that can fundamentally constrain agents independent of their intelligence. In particular, we may be able to contain an AI by limiting its access to information

Justin Shovelain, Elliot McKernon

Nov 30, 2023

minute read

Value Modeling

Aligning AI by optimizing for "wisdom"

In this post, we’ll introduce wisdom as a measure of the benevolence and internal coherence of an arbitrary agent. We’ll define several factors, such as the agent’s values, plans, evidence, and alignment with human values, and then define wisdom as consistency within and between these factors.

Justin Shovelain, Elliot McKernon

Jun 27, 2023

minute read

Mitigating Existential Risks

Improving the safety of AI evals

Many organizations are developing and using AI evaluations, “evals”, to assess the capability, alignment, and safety of AI. However, evals are not entirely innocuous, and we believe the risks they pose are neglected. In this article, we’ll outline some of the risks posed by doing AI evals, and suggest a strategy to improve their safety.

Justin Shovelain, Elliot McKernon

Jun 18, 2023

minute read

Technical AI Safety

The risk-reward tradeoff of interpretability research

Interpretability research is conducted to improve our understanding of AI. Many see interpretability as essential for AI safety, but recently some have argued that it can also increase the risk posed by AI by facilitating improved AI capabilities. We agree, and in this post, we’ll explain why, as well as how risks can be reduced.

Justin Shovelain, Elliot McKernon

Jun 5, 2023

minute read

Mitigating Existential Risks

Keep humans in the loop

In this post, we’ll explore how systems that are populated by people have some protection against emergent coordination failures in society. We’ll argue that if people are removed from the system or disempowered, technological development will drift away from improving human welfare, and coordination failures will worsen. We’ll then focus on AI development, exploring some practical techniques for AI development that keeps humans in the loop.

Justin Shovelain, Elliot McKernon

Apr 19, 2023

minute read

Value Modeling

Updating Utility Functions

This post will be about AIs that “refine” their utility function over time, and how it might be possible to construct such systems without giving them undesirable properties. The discussion relates to corrigibility, value learning, and (to a lesser extent) wireheading.

Justin Shovelain, Joar Skalse

May 9, 2022

minute read

Technical AI Safety

Goodhart's Law Causal Diagrams

Goodhart's law is closely related to both inner and outer alignment problems and the principal agent problem. Understanding it better should help us solve these problems. This post is an attempt to work out a more precise, unified way of thinking about part of Goodhart’s law.

Justin Shovelain, Jeremy Gillen

Apr 11, 2022

minute read

Value Modeling

Using vector fields to visualise preferences and make them consistent

This post outlines what vector fields are, how they can be used to visualise preferences, how utility functions can be generated from “preference vector fields” (PVFs), how PVFs can be extrapolated from limited data on preferences, how to visualise inconsistent preferences (as “curl”), and a rough idea for how to “remove curl” to generate consistent utility functions.

Michael Aird, Justin Shovelain

Jan 29, 2020

minute read

Mitigating Existential Risks

AI alignment concepts: philosophical breakers, stoppers, and distorters

When thinking about philosophy one may encounter philosophical breakers, philosophical stoppers, and philosophical distorters; thoughts or ideas that cause an agent (such as an AI) to break, get stuck, or take a random action. They are philosophical crises for that agent (and can in theory sometimes be information hazards).

Justin Shovelain

Jan 25, 2020

minute read

Mitigating Existential Risks

Safety regulators: A tool for mitigating technological risk

As technology improves, our capacity to do both harm and good increases, and each additional capacity unlocks new capacities that can be implemented. For example, the invention of engines unlocked railroads, which in turn unlocked more efficient trade networks. However, the invention of engines also enabled the construction of mobile war vehicles. How, in an ideal world, could we implement capacities so we get the outcomes we want while creating minimal harm and risks in the process?

Justin Shovelain

Jan 21, 2020

minute read

Mitigating Existential Risks