StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation

Yupeng Zhou1*    Daquan Zhou2    Mingming Cheng1    Jiashi Feng2    Qibin Hou1
1VCIP, CS, Nankai University    2ByteDance Inc   
* Interns in ByteDance Inc      Corresponding Authors; Project Lead;
Demo Video

StoryDiffusion can create a magic story, achieving long-range image and video generation!

Comics Generation

StoryDiffusion creates comics in various styles through the proposed consistent self-attention, keeping each character's style and attire consistent across panels for cohesive storytelling.


Long Video Generation
StoryDiffusion can generate high-quality videos with our image semantic motion predictor, conditioned on either our generated consistent images or user-input images.
Long Video Gallery
Using images generated by our consistent self-attention
Using Condition images from SORA
Using User-Input Condition images
Video Clips Gallery
We also create creative video clips to better show our Motion Predictor's performance.
Undersea scene
Using images generated by our consistent self-attention
Using Condition images from SORA
Using User-Input Condition images
Cartoon Character Generation

StoryDiffusion can create amazing consistent cartoon-style characters.



Multiple Characters Generation

StoryDiffusion can maintain the identity of multiple characters at the same time and generate consistent characters in a sequence of images.



More Comic Generation Examples

StoryDiffusion can create impressive comics. We will add more comics here.


"Girl and Squirrel"

Methods
The structure of the Consistent Self-Attention.
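
As a rough illustration of the idea, the sketch below lets each image in a batch attend to its own tokens plus tokens randomly sampled from the other images in the same batch, which is what encourages characters to stay consistent across images. It is a minimal pure-Python sketch under stated assumptions, not the released implementation: the names `attention`, `consistent_self_attention`, and `sample_rate` are hypothetical, and the learned query/key/value projections of a real attention layer are replaced by the identity for brevity.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Plain scaled dot-product attention over lists of token vectors."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def consistent_self_attention(batch, sample_rate=0.5, rng=None):
    """batch: list of images, each a list of token vectors (lists of floats).

    Assumption: Q/K/V projections are the identity; `sample_rate` is a
    hypothetical knob for how many tokens are borrowed from other images.
    """
    if rng is None:
        rng = random.Random(0)
    outputs = []
    for i, tokens in enumerate(batch):
        # Pool the tokens of every other image in the batch.
        others = [t for j, img in enumerate(batch) if j != i for t in img]
        sampled = rng.sample(others, int(len(others) * sample_rate))
        # Queries come from this image only; keys and values also
        # include the tokens sampled from the other images.
        paired = tokens + sampled
        outputs.append(attention(tokens, paired, paired))
    return outputs
```

With `sample_rate=0.0` the function reduces to ordinary per-image self-attention, which makes the sketch easy to compare against a baseline. The output keeps the same number of tokens per image as the input, so a layer of this kind can stand in for a standard self-attention layer without changing tensor shapes.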

The structure of the Motion Predictor.