We believe that a practical approach to addressing AI safety concerns is to dedicate more time and resources to researching effective mitigations and alignment techniques, and to testing them against real-world abuse.
Importantly, we also believe that improving safety and improving AI capabilities should go hand in hand. Our best safety work to date has come from working with our most capable models, because they are better at following user instructions and easier to steer or “guide.”
We will be increasingly cautious about building and deploying more capable models, and will continue to strengthen safety precautions as our AI systems evolve.
While we waited more than six months before deploying GPT-4 in order to better understand its capabilities, benefits, and risks, it may sometimes be necessary to take even longer to improve the safety of AI systems. Policymakers and AI providers will therefore need to ensure that AI development and deployment are governed effectively on a global scale, so that no one cuts corners to get ahead. This is a daunting challenge that requires both technical and institutional innovation, but it is one we are eager to contribute to.
Addressing safety issues also requires extensive debate, experimentation, and engagement, including on the limits of AI system behavior. We have fostered, and will continue to foster, collaboration and open dialogue among stakeholders to create a safe AI ecosystem.