Flawless image editing with Stable Diffusion 2.0
The Depth2Image model included in Stable Diffusion 2.0 is another step towards flawless image editing.
This model is conditioned on monocular depth estimates inferred via MiDaS (https://github.com/isl-org/MiDaS) and can be used for structure-preserving img2img and shape-conditional synthesis.
- Take a reference image
- Tell the model what you'd like to see instead
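In code, this two-step workflow maps onto the `StableDiffusionDepth2ImgPipeline` shipped with the diffusers library. The sketch below is illustrative, not a definitive recipe: it assumes a hypothetical local reference image `init.png`, a CUDA device, and will download the `stabilityai/stable-diffusion-2-depth` checkpoint on first run.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline

# Load the depth-conditioned Stable Diffusion 2.0 pipeline
pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

# 1. Take a reference image (hypothetical local file)
init_image = Image.open("init.png").convert("RGB")

# 2. Tell the model what you'd like to see instead
result = pipe(
    prompt="a medieval castle on a cliff",  # example prompt
    image=init_image,
    strength=0.7,  # how far the result may depart from the original
).images[0]
result.save("depth2img_result.png")
```

The `strength` parameter controls the trade-off between fidelity to the reference image and adherence to the prompt.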
Instead of simply adding noise to the original image and guiding the denoised result towards your prompt, depth-to-image first estimates the depth of the picture to create a depth map.
This depth map is then fed to the model as additional conditioning, so that the structure of the original image is well preserved.
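To make the conditioning step concrete, here is a minimal sketch of turning a raw depth estimate into a normalized [0, 1] map. The `depth` array is a stand-in assumption for the output of a monocular estimator such as MiDaS; the exact normalization a given pipeline applies may differ.

```python
import numpy as np

def normalize_depth(depth: np.ndarray) -> np.ndarray:
    """Min-max normalize a raw HxW depth estimate into [0, 1].

    `depth` stands in for the relative depths a monocular
    estimator such as MiDaS would produce for the image.
    """
    d_min, d_max = depth.min(), depth.max()
    if d_max - d_min < 1e-8:
        # Degenerate flat depth: fall back to a constant map
        return np.zeros_like(depth, dtype=np.float32)
    return ((depth - d_min) / (d_max - d_min)).astype(np.float32)

# Toy 2x2 "depth estimate" for illustration
depth = np.array([[1.0, 3.0],
                  [5.0, 9.0]])
mask = normalize_depth(depth)
print(mask)  # all values lie in [0, 1]
```

In the actual pipeline this normalized map is resized to the latent resolution and concatenated to the model input as an extra conditioning channel.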
Depth2Image opens up all sorts of new creative applications, delivering transformations that look radically different from the original but still preserve its coherence and depth.