Our mission is to ensure that artificial general intelligence (AI systems that are generally smarter than humans) benefits all of humanity.
If AGI is successfully created, this technology could help us uplift humanity by increasing abundance, boosting the global economy, and aiding in the discovery of new scientific insights that shift the boundaries of possibility.
AGI has the potential to give everyone incredible new capabilities; we can imagine a world in which all of us have access to help with almost any cognitive task, providing a huge force multiplier for human ingenuity and creativity.
On the other hand, AGI would also come with serious risk of misuse, drastic accidents, and societal disruption. Because the upside of AGI is so great, we do not believe it is possible or desirable for society to halt its development forever; instead, society and the developers of AGI have to figure out how to get it right.
While we can’t predict exactly what will happen, and of course our current progress could hit a wall, we can articulate the principles that matter most to us:
- We want AGI to empower humanity to flourish to its fullest in the universe. We don’t expect the future to be an absolute utopia, but we do want to maximize the good and minimize the bad, and for AGI to be an amplifier of humanity.
- We want the benefits of, access to, and governance of AGI to be shared widely and fairly.
- We want to successfully navigate massive risks. In confronting these risks, we acknowledge that what seems right in theory often plays out more strangely than expected in practice. We believe we have to continually learn and adapt by deploying less powerful versions of the technology to minimize “one chance to get it right” scenarios.
The short term
There are several things we think are important to do now to prepare for AGI.
First, as we build more and more powerful systems, we want to deploy them and gain experience operating them in the real world. We think this is the best way to carefully steward AGI into existence: a gradual transition to a world with AGI is better than a sudden one. We expect powerful AI to make the rate of progress in the world much faster, and we think it’s best to adapt to this incrementally.
A gradual transition gives people, policymakers, and institutions time to understand what is happening, experience firsthand the benefits and drawbacks of these systems, adapt our economy, and establish regulations. It also allows society and AI to co-evolve, and for people to collectively figure out what they want while keeping the stakes relatively low.
We currently believe that the best way to successfully navigate AI deployment challenges is with a tight feedback loop of rapid learning and careful iteration. Society will face major questions about what AI systems are allowed to do, how to combat bias, how to deal with job displacement, and more. The optimal decisions will depend on the path the technology takes, and like any new field, most expert predictions have been wrong so far. This makes planning in a vacuum very difficult.
Generally speaking, we believe that more use of AI in the world will lead to good, and we want to promote it (by putting models in our API, open-sourcing them, etc.). We believe that democratized access will also lead to more and better research, decentralized power, more benefits, and a broader set of people contributing new ideas.
As our systems get closer to AGI, we are becoming increasingly cautious with the creation and deployment of our models. Our decisions will require much more caution than society usually applies to new technologies, and more caution than many users would like. Some people in the AI field think the risks of AGI (and successor systems) are fictitious; we would be delighted if they turn out to be right, but we are going to operate as if these risks are existential.
At some point, the balance between the upsides and downsides of deployment (such as empowering malicious actors, creating social and economic disruption, and accelerating an unsafe race) could shift, in which case we would significantly change our plans around continuous deployment.
Second, we are working toward creating increasingly aligned and steerable models. Our shift from models like the first version of GPT-3 to InstructGPT and ChatGPT is an early example of this.
In particular, we think it’s important that society agree on extremely wide bounds for how AI can be used, but that within those bounds, individual users have a lot of discretion. Our eventual hope is that the world’s institutions agree on what these wide bounds should be; in the shorter term, we plan to run experiments to gather external input. The world’s institutions will need to be strengthened with additional capabilities and experience to be prepared for complex decisions about AGI.
The “default settings” of our products are likely to be quite restricted, but we plan to make it easy for users to change the behavior of the AI they are using. We believe in empowering people to make their own decisions and in the inherent power of diversity of ideas.
We will need to develop new alignment techniques as our models become more powerful (and tests to understand when our current techniques are failing). In the shorter term, our plan is to use AI to help humans evaluate the outputs of more complex models and monitor complex systems, and in the longer term to use AI to help us come up with new ideas for better alignment techniques.
Importantly, we believe we often have to make progress on AI safety and capabilities together. It is a false dichotomy to talk about them separately; they are correlated in many ways. Our best safety work has come from working with our most capable models. That said, it is important that the ratio of safety progress to capability progress increases.
Third, we look forward to a global conversation on three key questions: how to govern these systems, how to fairly distribute the benefits they generate, and how to fairly share access.
In addition to these three areas, we have tried to set up our structure in a way that aligns our incentives with a good outcome. We have a clause in our Charter about assisting other organizations to advance safety instead of racing with them in late-stage AGI development. We have a cap on the returns our shareholders can earn so that we are not incentivized to attempt to capture value without bound and risk deploying something potentially catastrophically dangerous (and, of course, as a way to share the benefits with society). We have a nonprofit that governs us and lets us operate for the good of humanity (and can override any for-profit interests), including letting us do things like cancel our equity obligations to shareholders if needed for safety and sponsor the world’s most comprehensive UBI experiment.
We think it’s important that efforts like ours submit to independent audits before releasing new systems; we will talk about this in more detail later this year. At some point, it may be important to get independent review before starting to train future systems, and for the most advanced efforts to agree to limit the rate of growth of compute used to create new models. We believe public standards for when an AGI effort should stop a training run, decide whether a model is safe to release, or pull a model from production use are important. Finally, we believe it is important that the world’s major governments have insight into training runs above a certain scale.
The long term
We believe that the future of humanity should be determined by humanity, and that it is important to share information about progress with the public. There should be great scrutiny of all efforts attempting to build AGI, and public consultation for major decisions.
The first AGI will be just a point along the continuum of intelligence. We think progress is likely to continue from there, possibly sustaining the rate of progress we’ve seen over the past decade for a long period of time. If this is true, the world could become vastly different from what it is today, and the risks could be extraordinary. A misaligned superintelligent AGI could cause grievous harm to the world; an autocratic regime with a decisive superintelligence lead could do so as well.
AI that can accelerate science is a special case worth thinking about, and perhaps more impactful than everything else. It’s possible that AGI capable enough to accelerate its own progress could cause major changes to happen surprisingly quickly (and even if the transition starts slowly, we expect it to happen quite quickly in the later stages). We believe that a slower takeoff is easier to make safe, and coordination among AGI efforts to slow down at critical junctures will likely be important (even in a world where we don’t need to do this to solve technical alignment problems, slowing down may be important to give society enough time to adapt).
A successful transition to a world with superintelligence is perhaps the most important, hopeful, and terrifying project in human history. Success is far from guaranteed, and hopefully the stakes (boundless downside and boundless upside) will unite us all.
We can envision a world in which humanity flourishes to a degree that is probably impossible for any of us yet to fully envision. We hope to bring to the world an AGI aligned with such flourishing.