FAQ: Character ID Drift and Subtitle Optimization
FAQ: Character ID Drift
Typical Symptoms
The generated character's appearance doesn't match the reference image, or a "face swap" (ID drift) occurs mid-video. In some cases the drifted character resembles a celebrity, and the video is then rejected by moderation.
Root Cause Analysis
1. Ineffective Face Reference Images
- Mixed reference images: Merging the face reference image with full-body/half-body pose images, clothing reference images, detail images, etc. into a single picture
- Face area too small: The face occupies too small a proportion of the overall image, so the model doesn't assign enough weight to it during feature extraction
2. Using Multi-View Images of the Character
Multi-view material shows the same person from different angles, so the model can easily mistake the views for multiple distinct subjects, which worsens ID drift instead of reducing it.
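To check the "face area too small" cause before submitting a reference, you can measure how much of the image the face occupies. The following is a minimal sketch, assuming you already have a face bounding box from a detector of your choice. The function names and the 0.25 cutoff are hypothetical and are not values published by any model provider.

```python
def face_area_ratio(image_size, face_box):
    """Return the fraction of the image area covered by the face box.

    image_size: (width, height) in pixels.
    face_box: (left, top, right, bottom) in pixels, from any face detector.
    """
    width, height = image_size
    left, top, right, bottom = face_box
    # Clamp the box to the image so a detector overshoot cannot inflate the ratio.
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    face_area = max(0, right - left) * max(0, bottom - top)
    return face_area / (width * height)


# Hypothetical cutoff: tune it against your own generation results.
MIN_FACE_RATIO = 0.25


def is_usable_face_reference(image_size, face_box, min_ratio=MIN_FACE_RATIO):
    """True if the face fills at least min_ratio of the image."""
    return face_area_ratio(image_size, face_box) >= min_ratio
```

A full-body shot of 1000 x 2000 pixels with a 200 x 200 pixel face yields a ratio of 0.02, which suggests the image is a poor face reference on its own and a separate close-up is worth preparing.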
Solutions
Strengthen the Independence and Weight of the Face Reference
- Prepare a close-up headshot: Add an extra image that contains only the character's head (face only, ideally with a neutral expression), and minimize distracting elements such as shoulders, neck, and background
- Define the subject clearly in the prompt:
` `
- Place important material first: Position material that requires more precise reference closer to the beginning of the prompt
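The "place important material first" advice can be applied mechanically when prompts are assembled in code. The sketch below is illustrative only: the `Reference` class, the `build_prompt` function, and the "Reference N: ..." wording are hypothetical and do not reproduce any model's required prompt syntax. It shows only the ordering step, where a lower priority number marks material that needs more precise reference.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    label: str     # what the image is, e.g. "close-up headshot"
    role: str      # what the model should take from it
    priority: int  # lower number = needs more precise reference


def build_prompt(references, scene):
    """Describe references in priority order, then append the scene text."""
    # sorted() is stable, so references with equal priority keep their input order.
    ordered = sorted(references, key=lambda ref: ref.priority)
    lines = [
        f"Reference {index}: {ref.label}, used for {ref.role}."
        for index, ref in enumerate(ordered, start=1)
    ]
    lines.append(scene)
    return "\n".join(lines)


if __name__ == "__main__":
    refs = [
        Reference("full-body shot", "outfit and proportions", priority=2),
        Reference("close-up headshot", "the character's face", priority=1),
    ]
    print(build_prompt(refs, "The character walks through a night market."))
```

Here the headshot is listed first even though it was supplied second, so the face description sits closest to the beginning of the prompt.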
Comparison Results
Before optimization: character undergoes a "face swap" mid-video, resembling a celebrity
Recommended Approach
- Use close-up headshot + full-body shot as character references
- Avoid using multi-view images of the character
FAQ: Unwanted Subtitles in the Video
Typical Symptoms
The prompt does not request subtitles, but the generated video contains them anyway.
Solutions
There is currently no way to guarantee that subtitles will not be generated. The following methods only reduce the probability that they appear:
Method 1: Add Constraint Words
Add explicit constraint instructions to the prompt:
Keep it subtitle-free
Avoid generating any text or subtitles
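If prompts are assembled programmatically, the two constraint phrases above can be appended automatically. This is a minimal sketch, and `add_subtitle_constraints` is a hypothetical helper rather than part of any model's API. It appends a phrase only when the prompt does not already contain it, so calling it twice does not duplicate the constraints.

```python
CONSTRAINTS = (
    "Keep it subtitle-free",
    "Avoid generating any text or subtitles",
)


def add_subtitle_constraints(prompt, constraints=CONSTRAINTS):
    """Append each constraint phrase that the prompt does not already contain."""
    result = prompt.strip()
    for phrase in constraints:
        if phrase.lower() in result.lower():
            continue  # already present, do not repeat it
        if not result:
            separator = ""
        elif result.endswith((".", "!", "?")):
            separator = " "
        else:
            separator = ". "
        result = f"{result}{separator}{phrase}."
    return result


if __name__ == "__main__":
    print(add_subtitle_constraints("A cat walks across a street"))
```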
Method 2: Clean Up Reference Material
If the text in the reference image/video is not essential information, first remove the text using external tools (e.g., the image/video editing capabilities of Seedream/Seedance models), then use the text-free material as input.
Comparison Results
Before optimization (top): unwanted subtitles appear in the video
After optimization (bottom): no subtitles
Method 3: Switch to Landscape
If the use case allows, generate the video in landscape orientation, which has a significantly lower probability of producing subtitles than portrait. You can crop it to portrait afterward in editing software.
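For the cropping step, the portrait region of a landscape frame is simple arithmetic. The sketch below computes a centered crop box. The function name, the 9:16 default, and the centered placement are assumptions, and you may want to shift the box to follow the subject. The resulting coordinates can be passed to whatever editing tool you use.

```python
def portrait_crop_box(width, height, aspect_w=9, aspect_h=16):
    """Return (left, top, right, bottom) of the largest centered crop
    with an aspect ratio of roughly aspect_w:aspect_h."""
    # Start from the full frame height and derive the matching width.
    crop_w = height * aspect_w // aspect_h
    crop_h = height
    if crop_w > width:
        # The frame is too narrow, so use the full width instead.
        crop_w = width
        crop_h = width * aspect_h // aspect_w
    # Round down to even numbers, which many video encoders expect.
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


if __name__ == "__main__":
    print(portrait_crop_box(1920, 1080))  # (657, 0, 1263, 1080)
```

For a 1920 x 1080 frame this gives a 606 x 1080 region. Because of the rounding, the ratio is only approximately 9:16.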