MobileDiffusion is an efficient latent diffusion model designed for rapid text-to-image generation on mobile devices. It comprises three components: a text encoder, a diffusion UNet, and an image decoder. At inference time it uses DiffusionGAN for one-step sampling, generating high-quality images in roughly half a second on premium iOS and Android devices. At a compact 520 million parameters, MobileDiffusion's low latency and small size make it a promising option for on-device image generation while adhering to responsible AI practices.
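To make the three-component pipeline concrete, here is a minimal Python sketch of the inference flow. Every name, shape, and bit of math in it is an illustrative assumption rather than MobileDiffusion's actual API; the point is the shape of the pipeline: encode the prompt, run the UNet once, decode the latent.

```python
import numpy as np

# Hypothetical stand-ins for the three components described above; all names,
# shapes, and math are placeholders chosen for illustration.

def encode_text(prompt: str, seq_len: int = 77, embed_dim: int = 768) -> np.ndarray:
    """Map a prompt to a (seq_len, embed_dim) embedding (placeholder)."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.standard_normal((seq_len, embed_dim)).astype(np.float32)

def unet_one_step(latent: np.ndarray, text_emb: np.ndarray) -> np.ndarray:
    """A single UNet evaluation. With DiffusionGAN-style one-step sampling,
    one forward pass maps noise directly to a clean latent (placeholder math)."""
    return 0.1 * latent + text_emb.mean() * np.ones_like(latent)

def decode_image(latent: np.ndarray) -> np.ndarray:
    """Upsample a latent into an image-shaped array in [0, 1] (placeholder)."""
    h, w = latent.shape[1] * 8, latent.shape[2] * 8
    img = np.resize(latent.mean(axis=0), (h, w))
    return (img - img.min()) / (img.max() - img.min() + 1e-8)

def generate(prompt: str) -> np.ndarray:
    text_emb = encode_text(prompt)
    noise = np.random.default_rng(0).standard_normal((4, 64, 64)).astype(np.float32)
    latent = unet_one_step(noise, text_emb)  # one step, not an iterative loop
    return decode_image(latent)

print(generate("a photo of a corgi on a skateboard").shape)  # (512, 512)
```

Note that `generate` calls the UNet exactly once; in a conventional diffusion model that single call would instead be a loop of tens of denoising steps.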
MobileDiffusion was developed by a team at Google that includes Zhisheng Xiao, Yanwu Xu, Jiuqiang Tang, Haolin Jia, Lutz Justen, Daniel Fenner, Ronald Wotzlaw, Jianing Wei, Raman Sarokin, Juhyun Lee, Andrei Kulik, Chuo-Ling Chang, and Matthias Grundmann. Their goal was an efficient latent diffusion model designed specifically for mobile deployment, enabling rapid on-device text-to-image generation with a compact 520M-parameter model.
To use MobileDiffusion for sub-second text-to-image generation on mobile devices, follow these steps:
1. Model Components: MobileDiffusion pairs a text encoder, a diffusion UNet, and an image decoder, each sized for mobile deployment.
2. Diffusion UNet: The UNet is the denoising backbone and the main target of the efficiency optimizations that keep per-step compute low on mobile hardware.
3. One-Step Sampling: DiffusionGAN fine-tuning replaces the usual multi-step denoising loop with a single UNet evaluation (see the training sketch after this list).
4. Training Procedure: The latent diffusion model is trained first, then fine-tuned adversarially with DiffusionGAN so that one step suffices at inference time.
5. Image Generation: A prompt is encoded, the UNet maps a noise latent to a clean latent in one step, and the decoder renders the final image.
6. Performance Evaluation: Latency and model size are measured on premium iOS and Android devices (a simple timing sketch also follows the list).
7. Results: High-quality images in roughly half a second, from a model of only 520 million parameters.
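The one-step sampling and training items deserve a word more: DiffusionGAN fine-tuning trains the model adversarially so that a single forward pass produces a clean latent. The toy loop below sketches that idea with tiny linear stand-ins; the module sizes, losses, and hyperparameters are assumptions for illustration, not MobileDiffusion's published recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy DiffusionGAN-style fine-tuning sketch. The real setup uses the full
# diffusion UNet as the generator and large-scale image data; here, tiny
# linear layers and random "clean latents" stand in so the loop runs anywhere.
latent_dim = 4 * 8 * 8

generator = nn.Linear(latent_dim, latent_dim)  # stand-in for the one-step UNet
discriminator = nn.Linear(latent_dim, 1)       # judges real vs. generated latents

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

for step in range(200):
    real = torch.randn(16, latent_dim)   # placeholder "clean" latents
    noise = torch.randn(16, latent_dim)

    # Discriminator update: push real latents toward 1, generations toward 0.
    fake = generator(noise).detach()
    d_loss = (
        F.binary_cross_entropy_with_logits(discriminator(real), torch.ones(16, 1))
        + F.binary_cross_entropy_with_logits(discriminator(fake), torch.zeros(16, 1))
    )
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator update: make one-step generations look real to the discriminator.
    g_loss = F.binary_cross_entropy_with_logits(
        discriminator(generator(noise)), torch.ones(16, 1)
    )
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

# After fine-tuning, inference is a single forward pass instead of a loop.
sample = generator(torch.randn(1, latent_dim))
```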
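For the performance-evaluation step, on-device latency is typically measured as wall-clock time around a single end-to-end generation call. A minimal pattern, with a hypothetical generate() stub standing in for the real on-device pipeline:

```python
import time

def generate(prompt: str) -> None:
    """Hypothetical stand-in for the end-to-end text-to-image call; swap in
    the real on-device pipeline when benchmarking an actual deployment."""
    time.sleep(0.01)  # placeholder work

start = time.perf_counter()
generate("a photo of a corgi on a skateboard")
print(f"end-to-end latency: {time.perf_counter() - start:.3f} s")
```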
By following these steps, users can leverage MobileDiffusion to efficiently generate high-quality images from text prompts on mobile devices.