We use a multi-tiered safety system to limit DALL·E 3's ability to generate potentially harmful imagery, including violent, adult, or hateful content. Safety checks run over user prompts and the resulting images before they are shown to users. We also worked with early users and expert red teamers to identify and address gaps in the coverage of our safety systems that emerged with the new model's capabilities. For example, their feedback helped us identify edge cases for graphic content generation, such as sexual imagery, and stress test the model's ability to generate convincingly misleading images.
As part of the work done to prepare DALL·E 3 for deployment, we have also taken steps to limit the model's likelihood of generating content in the style of living artists or images of public figures, and to improve demographic representation in generated images. To read more about the work done to prepare DALL·E 3 for wide deployment, see the DALL·E 3 system card.
User feedback will help make sure we continue to improve. ChatGPT users can share feedback with our research team by using the flag icon to inform us of unsafe outputs or outputs that don't accurately reflect the prompt they gave ChatGPT. Listening to a diverse and broad community of users and having real-world understanding is critical to developing and deploying AI responsibly and is core to our mission.
We are researching and evaluating an initial version of a provenance classifier, a new internal tool that can help us identify whether or not an image was generated by DALL·E 3. In early internal evaluations, it is over 99% accurate at identifying whether an image was generated by DALL·E when the image has not been modified. It remains over 95% accurate when the image has been subject to common types of modifications, such as cropping, resizing, JPEG compression, or when text or cutouts from real images are superimposed onto small portions of the generated image. Despite these strong results on internal testing, the classifier can only tell us that an image was likely generated by DALL·E, and does not yet enable us to make definitive judgments. This provenance classifier may become part of a range of techniques to help people understand if audio or visual content is AI-generated. It is a challenge that will require collaboration across the AI value chain, including with the platforms that distribute content to users. We expect to learn a great deal about how this tool works and where it might be most useful, and to improve our approach over time.
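To make the evaluation described above concrete, here is a minimal sketch of how per-modification accuracy might be aggregated for such a classifier. The classifier outputs and the records below are hypothetical placeholders, not our internal data or tooling; only the overall structure (scoring unmodified images separately from cropped, resized, or JPEG-compressed ones) reflects the evaluation described in the text.

```python
# Sketch: aggregate a provenance classifier's accuracy per modification type.
# All data here is hypothetical; real evaluations would run the classifier
# over large labeled sets of generated and non-generated images.
from collections import defaultdict

def accuracy_by_modification(predictions):
    """predictions: iterable of (modification, predicted_generated, is_generated)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for modification, predicted, actual in predictions:
        total[modification] += 1
        correct[modification] += int(predicted == actual)
    return {m: correct[m] / total[m] for m in total}

# Hypothetical evaluation records: unmodified images plus common edits.
records = [
    ("none", True, True),
    ("none", True, True),
    ("crop", True, True),
    ("crop", False, True),   # a miss after cropping
    ("jpeg", True, True),
    ("resize", True, True),
]
print(accuracy_by_modification(records))
```

Reporting accuracy per modification, rather than a single pooled number, is what lets one state separate claims like "over 99% on unmodified images" and "over 95% after common modifications."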