ByteDance launches DreamActor-M1: AI that turns your photos into full-body films

ByteDance has introduced DreamActor-M1, an AI animation model that can transform a single image into a video with fluid movement. The model uses a diffusion transformer (DiT) with hybrid guidance to capture facial expressions, control head movement, and maintain consistent body motion. Despite its impressive performance, it has drawbacks: it does not yet support physical contact with objects, and it raises potential ethical issues.
ByteDance has unveiled DreamActor-M1, a new AI animation model that ranks among the most sophisticated human image animation frameworks seen to date. In contrast to other systems that struggle with longer videos or lose facial expression detail, this one appears to have figured it out: a single image can be animated into a lengthy video with remarkably fluid movements, such as a smile, a head turn, or even intricate dance moves.
Introduced on April 2, 2025, DreamActor-M1 aims to change how digital content producers, game developers, filmmakers, and even ad makers use human animation. The main attraction? Even when you start with a single static image, it maintains realistic body poses, expressive faces, and a seamless finished video.
A DiT (Diffusion Transformer) model, supported by what the team refers to as “hybrid guidance,” is at the core of DreamActor-M1. This is simply a fancy way of saying that it combines several control signals to deal with the challenging aspects of human animation. It attempts to achieve three main goals: capturing facial expressions, controlling head movement, and maintaining consistent body motion.
The researchers claim that this configuration makes the animation process “more expressive and identity-preserving,” meaning it can maintain the subject’s appearance even as they move.
Most image animation tools are only effective on faces. DreamActor-M1 breaks that restriction: the model adapts whether it is given a close-up portrait or a full-body dance video. The team trained it on thousands of videos spanning a range of scales and situations, from speeches to sports.
For instance, if you give it a photo and a dancing video, the system can make the person in the photo mimic the dance motions, including how their garments flow and how their feet move in time.
Maintaining consistency across longer videos is one of the hardest problems in this field: you don’t want the subject’s clothing or facial features changing at random partway through. ByteDance’s model tackles this by generating additional frames from the original image to cover perspectives that weren’t included in the input.
The system uses these “pseudo-reference” frames to fill in the gaps, such as how a person’s back looks when they turn around, and to tie everything together seamlessly.
On practically every metric, DreamActor-M1 outperforms other leading models such as Animate Anyone, Champ, and MimicMotion:
| Method | FID ↓ | SSIM ↑ | PSNR ↑ | LPIPS ↓ | FVD ↓ |
| --- | --- | --- | --- | --- | --- |
| Animate Anyone | 36.72 | 0.791 | 21.74 | 0.266 | 158.3 |
| MimicMotion | 35.90 | 0.799 | 22.25 | 0.253 | 149.9 |
| DisPose | 33.01 | 0.804 | 21.99 | 0.248 | 144.7 |
| DreamActor-M1 | 27.27 | 0.821 | 23.93 | 0.206 | 122.0 |
According to the DreamActor-M1 research paper, these figures show that the new model produces higher-fidelity individual frames (FID, SSIM, PSNR, LPIPS) and more temporally coherent videos (FVD).
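The per-frame metrics in the table follow standard definitions; PSNR, for example, measures how closely a generated frame matches a reference frame on a logarithmic scale, where higher is better. The following is a minimal NumPy sketch of that formula on synthetic data — it is an illustration of the metric only, not ByteDance's evaluation code:

```python
import numpy as np

def psnr(reference: np.ndarray, generated: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two frames, in dB (higher is better)."""
    mse = np.mean((reference.astype(np.float64) - generated.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10((max_val ** 2) / mse)

# Toy example: a random "frame" and a slightly noisier copy of it.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float64)
noisy = np.clip(frame + rng.normal(0.0, 5.0, size=frame.shape), 0, 255)

score = psnr(frame, noisy)
```

A gap of roughly 2 dB, as between MimicMotion (22.25) and DreamActor-M1 (23.93), corresponds to a noticeable reduction in per-pixel error.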
ByteDance acknowledges that the model still has certain drawbacks. Physical contact with objects in the video is not yet supported, the model has trouble with dynamic camera movement, and bone correction sometimes goes wrong and needs to be fixed by hand.
Because this type of technology can be used to produce deepfakes, ethical issues are also present. The team says it is limiting access to the core model and will remove any generated content flagged as potentially offensive.
DreamActor-M1 is undoubtedly a significant advancement in AI animation. Whether it is used to power game avatars or to help content creators bring still photos to life, this might become the new norm. It will be interesting to see how other companies, such as OpenAI, Runway, or Meta, respond to the bar ByteDance has quietly raised.
This post was last modified on April 4, 2025 6:33 am