Please refer to the project page for full-quality videos and more examples.

Showcase: reference images paired with generated videos, with prompts such as:

- A bearded man, wearing a yellow T-shirt, working at a wooden table...
- A woman, wearing a white shirt and blue jeans, enjoying her daytime activities...
Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers
- 2025-01-01: Paper released!
- 2025-01 to 2025-02: We will release the code and models (we are working on adapting our method to CogVideoX-1.5, HunyuanVideo, etc.). Stay tuned!
In this work, we present Magic Mirror, a zero-shot framework for identity-preserved video generation. Magic Mirror incorporates dual facial embeddings and Conditional Adaptive Normalization (CAN) into a DiT-based architecture. Our approach enables robust identity preservation and stable training convergence. Extensive experiments demonstrate that Magic Mirror generates high-quality personalized videos while maintaining identity consistency from a single reference image, outperforming existing methods across multiple benchmarks and human evaluations.
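To make the two named components concrete, below is a minimal PyTorch sketch of what an adaptive-normalization conditioning path of this kind typically looks like: the normalized DiT activations are modulated by a scale and shift regressed from an identity condition vector. The module names, dimensions, and the concatenation-based fusion of the two facial embeddings are illustrative assumptions for this sketch, not the released Magic Mirror implementation.

```python
# Minimal sketch: adaptive layer norm conditioned on fused facial embeddings.
# All names, dimensions, and the fusion scheme are assumptions, not the
# official code.
import torch
import torch.nn as nn

class ConditionalAdaptiveNorm(nn.Module):
    """LayerNorm whose per-channel scale/shift are predicted from a condition."""
    def __init__(self, hidden_dim: int, cond_dim: int):
        super().__init__()
        # No learned affine here; the affine parameters come from the condition.
        self.norm = nn.LayerNorm(hidden_dim, elementwise_affine=False)
        self.to_scale_shift = nn.Sequential(
            nn.SiLU(),
            nn.Linear(cond_dim, 2 * hidden_dim),
        )

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, hidden_dim); cond: (batch, cond_dim)
        scale, shift = self.to_scale_shift(cond).chunk(2, dim=-1)
        return self.norm(x) * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)

class DualFacialCondition(nn.Module):
    """Fuses two facial embeddings (e.g., a high-level ID embedding and a
    structural face embedding) into one condition vector (assumed fusion)."""
    def __init__(self, id_dim: int, struct_dim: int, cond_dim: int):
        super().__init__()
        self.proj = nn.Linear(id_dim + struct_dim, cond_dim)

    def forward(self, id_emb: torch.Tensor, struct_emb: torch.Tensor) -> torch.Tensor:
        return self.proj(torch.cat([id_emb, struct_emb], dim=-1))

# Usage with toy shapes:
cond_net = DualFacialCondition(id_dim=512, struct_dim=768, cond_dim=1024)
can = ConditionalAdaptiveNorm(hidden_dim=1024, cond_dim=1024)
tokens = torch.randn(2, 16, 1024)                        # video latent tokens
cond = cond_net(torch.randn(2, 512), torch.randn(2, 768))
print(can(tokens, cond).shape)                           # torch.Size([2, 16, 1024])
```

The design point this illustrates is that identity conditioning enters through normalization modulation rather than new attention layers, so the pretrained backbone weights stay intact, which is consistent with the stable training convergence claimed above.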
If you find this repo useful for your research, please consider citing the paper:
@article{zhang2025magicmirror,
  title={Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers},
  author={Yuechen Zhang and Yaoyang Liu and Bin Xia and Bohao Peng and Zexin Yan and Eric Lo and Jiaya Jia},
  journal={arXiv preprint arXiv:2501.03931},
  year={2025}
}