Today, we’re excited to release optimizations to Core ML for Stable Diffusion in macOS 13.1 and iOS 16.2, along with code to get started deploying to Apple Silicon devices.
Since its public debut in August 2022, Stable Diffusion has been embraced by a vibrant community of artists, developers, and hobbyists alike, enabling the creation of unprecedented visual content with as little as a text prompt. In response, the community has built an expansive ecosystem of extensions and tools around this core technology in a matter of weeks. There are already methods that personalize Stable Diffusion, extend it to languages other than English, and more, thanks to open-source projects like Hugging Face diffusers.
Beyond generating images from text prompts, developers are also discovering other creative uses for Stable Diffusion, such as image editing, in-painting, out-painting, super-resolution, style transfer, and even color palette generation. With the growing number of applications of Stable Diffusion, ensuring that developers can leverage this technology effectively is important for building apps that creatives everywhere will be able to use.
One of the key questions for Stable Diffusion in any app is where the model is running. There are a number of reasons why on-device deployment of Stable Diffusion in an app is preferable to a server-based approach. First, the privacy of the end user is protected because any data the user provides as input to the model stays on the user’s device. Second, after the initial download, users don’t require an internet connection to use the model. Finally, deploying this model locally enables developers to reduce or eliminate their server-related costs.
Getting to a compelling result with Stable Diffusion can require a lot of time and iteration, so a core challenge with on-device deployment of the model is making sure it can generate results fast enough on device. This requires executing a complex pipeline comprising 4 different neural networks totaling approximately 1.275 billion parameters. To learn more about how we optimized a model of this size and complexity to run on the Apple Neural Engine, you can refer to our previous article on Deploying Transformers on the Apple Neural Engine. The optimization principles outlined in that article generalize to Stable Diffusion even though it is 19 times larger than the model studied there. Optimizing Core ML for Stable Diffusion and simplifying model conversion makes it easy for developers to incorporate this technology into their apps in a privacy-preserving and cost-effective way, while getting the best performance on Apple Silicon.
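To get a feel for what those 4 networks are, here is a minimal sketch that inspects them with Hugging Face diffusers. It assumes the diffusers package and the CompVis/stable-diffusion-v1-4 weights, neither of which is part of this release; note that the parameter count printed here will come out slightly above 1.275 billion because diffusers loads the full VAE, whereas the pipeline only runs its decoder.

```python
# Sketch: enumerate the four networks in the Stable Diffusion pipeline
# using Hugging Face diffusers (assumed dependency, not part of this release).
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

# The pipeline chains a text encoder, a denoising UNet, a VAE, and a
# safety checker that screens the generated image.
components = {
    "text_encoder": pipe.text_encoder,
    "unet": pipe.unet,
    "vae": pipe.vae,  # full VAE; only the decoder runs at generation time
    "safety_checker": pipe.safety_checker,
}

total = 0
for name, module in components.items():
    count = sum(p.numel() for p in module.parameters())
    total += count
    print(f"{name}: {count / 1e6:.0f}M parameters")
print(f"total: {total / 1e9:.2f}B parameters")
```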
This release includes a Python package for converting Stable Diffusion models from PyTorch to Core ML using diffusers and coremltools, as well as a Swift package for deploying the models. To get started, visit the Core ML Stable Diffusion code repository for detailed instructions on benchmarking and deployment.
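As a rough illustration of the two-step workflow, the invocations below reflect the Python package’s command-line entry points as described in the repository’s README around this release; the exact flags, model versions, and output paths may differ, so treat the repository as authoritative.

```
# Step 1: convert the PyTorch models to Core ML packages
# (flags as documented in the repo README; check the repository for current options)
python -m python_coreml_stable_diffusion.torch2coreml \
    --convert-unet --convert-text-encoder --convert-vae-decoder \
    --convert-safety-checker -o <output-mlpackages-directory>

# Step 2: generate an image with the converted models
python -m python_coreml_stable_diffusion.pipeline \
    --prompt "a photo of an astronaut riding a horse on mars" \
    -i <output-mlpackages-directory> -o <output-image-directory> \
    --compute-unit ALL --seed 93
```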