Prompt-to-Prompt Image Editing with Cross-Attention Control
Text-based image synthesis models are appealing because users can describe their intent in words. Editing with these models is challenging, however: an editing technique should inherently preserve most of the original image, yet even a small modification of the text prompt often leads to a completely different outcome. One way to preserve the original is to provide a spatial mask that localizes the edit, but that ignores the original structure and content within the masked region.
The authors present a method for editing images that does not require a mask, and demonstrate how it can be used to edit an image by replacing or adding words in the text prompt.
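To make the "cross-attention control" in the title a little more concrete, here is a toy NumPy sketch of one way a word swap could keep the image layout: compute the attention maps between pixels and prompt tokens for the original prompt, then reuse those maps when attending to the edited prompt. This is an illustration only, not the authors' code. The function names, the toy dimensions, and the shortcut of using the token embeddings directly as both keys and values are all assumptions made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys, values, injected_maps=None):
    """Toy cross-attention layer (hypothetical helper).

    queries: (n_pixels, dim) image features
    keys, values: (n_tokens, dim) prompt token features
    injected_maps: optional (n_pixels, n_tokens) maps that replace
        the maps computed from queries and keys.
    Returns (output, attention_maps).
    """
    dim = queries.shape[-1]
    maps = softmax(queries @ keys.T / np.sqrt(dim), axis=-1)
    if injected_maps is not None:
        # Keep the spatial layout of the source prompt.
        maps = injected_maps
    return maps @ values, maps


rng = np.random.default_rng(0)
n_pixels, n_tokens, dim = 16, 4, 8  # toy sizes, chosen arbitrarily

pixel_queries = rng.normal(size=(n_pixels, dim))
source_tokens = rng.normal(size=(n_tokens, dim))

# "Replace a word": change the embedding of a single token.
edited_tokens = source_tokens.copy()
edited_tokens[2] = rng.normal(size=dim)

# Pass 1: attention maps for the original prompt.
_, source_maps = cross_attention(pixel_queries, source_tokens, source_tokens)

# Pass 2a: edited prompt with no control; the maps are free to change.
free_out, free_maps = cross_attention(pixel_queries, edited_tokens, edited_tokens)

# Pass 2b: edited prompt with the source maps injected.
edited_out, edited_maps = cross_attention(
    pixel_queries, edited_tokens, edited_tokens, injected_maps=source_maps
)
```

In the controlled pass, each pixel attends to the same token positions as before, so only the content carried by the swapped token changes. In a real diffusion model the keys and values would come from separate learned projections, and the maps would be handled at every attention layer and denoising step; none of that is modeled here.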
Research Paper:
GitHub code: