High-Fidelity Temporally Coherent Video Editing
ByteDance Inc.
  *Equal Contribution
MagicEdit explicitly disentangles the learning of appearance and motion to achieve high-fidelity and temporally coherent video editing.
It supports various editing applications, including video stylization, local editing, video-MagicMix and video outpainting.
Video stylization
Video stylization enables one to (1) transform the source video into a new video with a style-of-
interest (e.g., realistic, cartoon), or (2) creating a new scene with different subject (e.g., dog → cat)
and different background (e.g., living room → beach).
▶ Animals
Source video | “a monkey, playing on the coast, sea” | “a teddy bear, eating apple, green grassland with flowers” |
Source video | “a cute cat, sitting on the ground” | “a labrador dog, sitting on the table” |
Source video | “a husky dog, lying on the sands” | “a sea lion, lying on the ice, winter, snow” |
▶ Objects
Source video | “a red car, moving on the road, autumn, maple leaves” | “a red car, moving on the road, mountain, green grass and trees” |
Source video | “black and white biscuits, falling down sequentially” | “bricks, falling down sequentially” |
Source video | “stools, moving on the railway track” | “white cupcakes, moving on the table” |
▶ Scenes
Source video | “boats and buildings, floating |