Convergence aims to build foundational theory for existential risk strategy.
A Case For Strategy Research: What it is and why we need more of it (20th Jun 2019)
Siebe Rozendal, Justin Shovelain, David Kristoffersson
To achieve any ambitious goal, some strategic analysis is necessary. Effective altruism has ambitious goals and focuses heavily on doing research. To understand how to best allocate our time and resources, we need to clarify what our options in research are. In this article, we describe strategy research and relate it to values research, tactics research, informing research, and improvement research. We then apply the lens of strategy research to existential risk reduction, a major cause area of effective altruism. We propose a model in which the marginal value of a research type depends strongly on the maturity of the research field. Finally, we argue that strategy research should currently be given higher priority than other research in existential risk reduction because of the significant amount of strategic uncertainty, and we provide specific recommendations for different actors.
Causal diagrams of the paths to existential catastrophe (1st March 2020)
Michael Aird, Justin Shovelain, David Kristoffersson
We present causal diagrams capturing some key paths to existential catastrophe, and key types of interventions to prevent such catastrophes. This post aims to:
- Explicitly, visually represent such paths and interventions alongside each other, in order to facilitate thought and communication
- Highlight certain paths and interventions which some readers may have previously neglected
- Serve as a starting point or inspiration for extending these causal diagrams, adapting the diagrams for specific existential risks and interventions, and building entirely new diagrams
Using vector fields to visualise preferences and make them consistent (28th Jan 2020)
Michael Aird, Justin Shovelain
This post outlines:
- What vector fields are
- How they can be used to visualise preferences
- How utility functions can be generated from “preference vector fields” (PVFs)
- How PVFs can be extrapolated from limited data on preferences
- How to visualise inconsistent preferences (as “curl”)
- A rough idea for how to “remove curl” to generate consistent utility functions
- Possible areas for future research
We expect this to provide useful tools and insights for various purposes, most notably AI alignment, existential risk strategy, and rationality.
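As a rough illustration of the ideas listed above, the Python sketch below builds a small two-dimensional preference vector field, measures its curl numerically, and derives a consistent utility function from it. Everything specific here is an assumption made for illustration and is not taken from the post: the linear field defined by the matrix `A`, the names `pvf`, `consistent_pvf`, and `utility`, and the choice to remove curl by keeping only the symmetric part of `A`. That choice works for linear fields; the post's own proposal for removing curl may differ.

```python
def curl_2d(field, x, y, h=1e-5):
    """Central-difference estimate of the scalar curl dFy/dx - dFx/dy at (x, y)."""
    dfy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dfx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dfy_dx - dfx_dy


# Hypothetical linear preference vector field F(x, y) = A * (x, y).
# Each vector points in the direction the agent would prefer to move.
A = [[-1.0, 0.5],
     [-0.5, -2.0]]


def pvf(x, y):
    """Original (possibly inconsistent) preference vector field."""
    return (A[0][0] * x + A[0][1] * y,
            A[1][0] * x + A[1][1] * y)


# Symmetric part of A. For a linear field, S * (x, y) is curl-free,
# while the antisymmetric remainder is a pure rotation carrying all the curl.
S = [[(A[i][j] + A[j][i]) / 2 for j in range(2)] for i in range(2)]


def consistent_pvf(x, y):
    """Curl-free part of the field: preferences with the rotation removed."""
    return (S[0][0] * x + S[0][1] * y,
            S[1][0] * x + S[1][1] * y)


def utility(x, y):
    """Quadratic utility whose gradient equals consistent_pvf."""
    return 0.5 * (S[0][0] * x * x + 2 * S[0][1] * x * y + S[1][1] * y * y)
```

Calling `curl_2d(pvf, 0.3, -0.7)` gives a non-zero value, signalling cyclic (inconsistent) preferences, while `curl_2d(consistent_pvf, 0.3, -0.7)` is zero up to floating-point error. In this toy case `utility` peaks at the origin, the point the consistent field flows toward.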
Research project highlights
Convergence has a range of different research projects. We will highlight a few here.
1. Information hazard policy
Potent research and development, such as research into AI, biotech, or x-risks, carries risks of negative direct and indirect effects. For example, the genetic code of a dangerous virus could lead to harm if it is publicly available. There is no commonly agreed-upon standard or baseline for handling technological info hazards. How are people supposed to handle information that could be harmful? With the right thinking tools and policies, we could make research less risky and more beneficial. We’d like to have clear guidance for researchers on how to think about information that could be harmful. We want to facilitate the development of actionable information hazard policies in collaboration with researchers at other existential risk research groups.
2. Shaping projects for long-term good
Existential risk reduction is a new field. There is currently no general methodology available to help determine the long-term effects of projects. How can you know whether research into technology like AI or biotech, or into existential risk, will lead to positive or negative effects on humanity in the future? How do you understand the consequences? The side effects? We often use informal, intuitive, and incomplete reasoning to grapple with this kind of question. With the right set of practical and systematic heuristics, we could better shape our projects to do more good in the long term. We want to develop a set of decision guidance tools to help shape projects so that they do more long-term good.
3. AGI trajectories
Choosing the right path for the development of AGI hinges on understanding where we are and where we want to be heading. We currently reason and form strategies and policies for AGI using many key concepts informally. A well-defined formalization and graphical language would let us pin down more exactly what we think and let us more easily see relations and differences between different paths and choices. We construct a common state space for futures, trajectories, and interventions, and show how these interact. The space gives us a precise language for reasoning and communicating about the trajectory of humanity and how different decisions may affect it. We simulate AGI R&D as a stochastic problem-solving process to determine the speed of trajectories, and propose to use the framework to find points of leverage for interventions.
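To give a feel for what simulating R&D as a stochastic problem-solving process can look like, here is a deliberately simple Python sketch. It is not Convergence's model. The setup is an assumption chosen for illustration: a project is a fixed sequence of subproblems, each solved with the same probability `solve_prob` in any given time step. The function names are also hypothetical.

```python
import random


def simulate_time_to_completion(num_subproblems, solve_prob, rng):
    """Return the number of time steps needed to solve all subproblems in sequence."""
    if not 0.0 < solve_prob <= 1.0:
        raise ValueError("solve_prob must be in (0, 1]")
    steps = 0
    for _ in range(num_subproblems):
        # Keep attempting the current subproblem until one attempt succeeds.
        while True:
            steps += 1
            if rng.random() < solve_prob:
                break
    return steps


def mean_time(num_subproblems, solve_prob, runs=10_000, seed=0):
    """Monte Carlo estimate of the expected completion time over many runs."""
    rng = random.Random(seed)  # fixed seed so results are reproducible
    total = sum(
        simulate_time_to_completion(num_subproblems, solve_prob, rng)
        for _ in range(runs)
    )
    return total / runs
```

Under these assumptions the expected completion time is `num_subproblems / solve_prob`, so comparing `mean_time(10, 0.25)` with `mean_time(10, 0.5)` shows how an intervention that changes the per-step success probability would shift the speed of a trajectory.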
Some other projects of ours include: the AI timelines simulator, a “technical definition of wisdom”, a systematization of Goodhart’s law, the “Three filter model of AI development”, “strategy variables”, and many other concepts and models in different stages of development.