The complete guide to creating stunning AI videos with ByteDance's Seedance 2.0 (also known as Seeddance, Seed Dance). Whether you're a beginner or experienced creator, this tutorial covers everything from access to advanced prompt techniques.
Last updated: March 2026
Seedance 2.0 is ByteDance's next-generation professional-grade multi-modal video creation model. It supports images, videos, audio, and text as reference inputs to generate cinematic-quality videos — breaking the limitations of single-material creation. With built-in video editing and extension capabilities, it delivers high-precision reproduction of object details, materials, visual effects, and camera movements while maintaining stable character features across shots.
Unlike traditional text-to-video models, Seedance 2.0 accepts multiple input types at once: reference images, existing video clips, audio tracks, and text prompts can all be combined in a single generation. One prompt can use up to 7 reference images together with audio and text, which gives creators fine-grained, director-style control over the output.
Seedance 2.0 aims for footage that is hard to tell apart from real camera work. It handles complex scenes, from subtle micro-expressions and intense physical action to dynamic song-and-dance sequences, and it includes built-in professional camera movements, multi-shot storytelling, and native synchronized audio-video output with multi-language speech support.
Purpose-built for commercial advertising, film production, and social media marketing. With industrial-grade generation quality and significantly higher usable output rates, Seedance 2.0 lowers the cost and barrier of professional video production while streamlining the pipeline from concept to final cut.
Generate videos from any combination of text, images, video clips, and audio references. Combine multiple reference materials in a single prompt for precise creative control.
Upload reference images and bring them to life while preserving visual style, composition, and character details. Ideal for product showcases and storyboard-to-video workflows.
Edit existing videos with AI precision — replace subjects, add or remove objects, and repaint specific areas while maintaining original motion and camera movement.
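Extend videos seamlessly in either direction. Forward extension generates content that comes before the original clip (the generated footage ends by connecting to it), backward extension continues after the clip, and track completion fills in gaps for continuous storytelling.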
Define the start and end frames of your video and let Seedance 2.0 generate the motion in between. Perfect for precise transitions and controlled narrative sequences.
Native audio-video generation produces rich, matched sound effects alongside visuals. Supports multi-language speech, various accents, and dialect rendering.
Resolution: 480P / 720P
Frame Rate: 24 fps
Input Types: Text, Image, Video, Audio
Output: Video with synchronized audio
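As a rough illustration of what these specs imply for a clip, the sketch below converts a duration into a frame count at 24 fps and checks a resolution label against the two listed options. The function and constant names are hypothetical helpers for planning purposes; they are not part of any Seedance or Vibbit API.

```python
# Values copied from the spec list above; the names are illustrative only.
SUPPORTED_RESOLUTIONS = {"480P", "720P"}
FRAME_RATE_FPS = 24


def frame_count(duration_seconds: float) -> int:
    """Number of frames in a clip of the given length at 24 fps."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return round(duration_seconds * FRAME_RATE_FPS)


def check_resolution(resolution: str) -> str:
    """Return the normalized label, or raise if it is not a listed resolution."""
    label = resolution.strip().upper()
    if label not in SUPPORTED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    return label
```

For example, an 8-second clip works out to 192 frames and a 15-second clip to 360 frames.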
With Vibbit AI, you can use Seedance 2.0 directly from your browser — no installation, no API keys, no complex setup. Simply type your prompt, click generate, and download your video in minutes. Plus, Vibbit offers additional tools like video editing, clip splitting, and multi-platform distribution to help you go from prompt to published content seamlessly.




Reference the character movements and camera language from @Video 1, generate a fighting scene between @Image 1 and @Image 2, with @Image 3 as the fighting background. The fighting process mimics a pixel game style, with @Audio 1 as the background music. Include fighting sound effects synchronized with the combat actions.
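Prompts like the one above refer to uploaded materials with tags such as @Video 1 or @Image 3. Before submitting a prompt, it can help to confirm that every tag has a matching upload. The sketch below is one way to do that check locally; the tag pattern is inferred from the example prompts on this page, the 7-image cap comes from the overview, and all function names are hypothetical rather than part of any official tool.

```python
import re

# Tag shape as it appears in the example prompts: "@Video 1", "@Image 2", "@Audio 1".
TAG_PATTERN = re.compile(r"@(Video|Image|Audio) (\d+)")
MAX_REFERENCE_IMAGES = 7  # "up to 7 reference images", per the overview


def extract_references(prompt: str) -> list[tuple[str, int]]:
    """Return unique (kind, index) tags in order of first appearance."""
    seen: list[tuple[str, int]] = []
    for kind, index in TAG_PATTERN.findall(prompt):
        tag = (kind, int(index))
        if tag not in seen:
            seen.append(tag)
    return seen


def check_references(prompt: str, uploaded: dict[str, int]) -> list[str]:
    """List problems found in a prompt.

    `uploaded` maps a kind ("Video", "Image", "Audio") to the number of
    files of that kind supplied alongside the prompt.
    """
    problems = []
    for kind, index in extract_references(prompt):
        if index < 1 or index > uploaded.get(kind, 0):
            problems.append(f"@{kind} {index} has no matching upload")
    if uploaded.get("Image", 0) > MAX_REFERENCE_IMAGES:
        problems.append("more than 7 reference images supplied")
    return problems
```

An empty list from `check_references` means every tag in the prompt points at an uploaded file and the image count is within the stated limit.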



Reference the woodcut print style of @Image 1. The camera starts from a close-up detail of the '马上有福' (Good Fortune Horse) bookplate, then slowly pulls back to reveal the full black-and-white woodcut-style horse. As the camera pulls back, the horse begins to stride forward smoothly, with its mane and saddle blanket swaying slightly. The peony patterns in the background become gradually clearer. The entire process maintains the high-contrast black-and-white blocks and dense hatching texture of the woodcut, creating a visual experience of traditional art coming alive in motion. 0-2s: Close-up on the horse's eyes and mane details, high black-and-white contrast with visible dense hatching. 2-5s: Pull-back shot revealing the horse's head and peonies, bold lines and defined blocks. 5-8s: Continue pulling back to show the full horse with wave-shaped border, the horse begins to slowly stride with smooth leg movement animation. Full panorama showing the '马上有福' text, horse continues walking, parallel hatching creates a slight parallax effect, frame gradually darkens to end. Overall camera movement references the pull-back technique of @Video 1. No text other than '马上有福'.




Use @Video 1's first-person POV composition throughout, and use @Audio 1 as background music throughout. First-person perspective fruit tea promotional ad for Seedance brand 'Peaceful Apple' apple fruit tea limited edition. First frame is @Image 1, your hand picks a dew-covered Aksu red apple with a crisp collision sound; 2-4s: Quick cut, your hand drops apple chunks into a shaker cup, adds ice and tea base, shakes vigorously, ice collision and shaking sounds synced to upbeat drumbeats; 4-6s: First-person close-up of the finished product, layered fruit tea poured into a clear cup, your hand squeezes cream topping across the top, applies a pink label to the cup, camera zooms in on the layered texture; 6-8s: First-person hand raises cup, you lift the fruit tea from @Image 2 toward the camera (simulating handing it to the viewer), cup label clearly visible, final frame freezes on @Image 2. All background voices use a female tone.



Static shot, central fisheye lens peering downward through a circular peephole, referencing the fisheye lens from @Video 1. A cute kitten wearing red New Year clothes looks up at the camera with a slight smile. The background is a retro corridor with dark yellow walls, black-and-white polka-dot floor tiles, and warm yellow wall sconces. The lens distortion creates a spatial compression effect like a door peephole. Motion references @Video 2: the kitten looks at the camera and says 'Happy New Year! Open the door, Seedance is here!'



Reference the rotating camera movement from @Video 1, generate an upward-looking shot of a skywell from inside a traditional Chinese building. A deep octagonal wooden dome structure covered in intricate woodcarving details, with layered beams and delicate pendant carvings revealing antique grandeur through chiaroscuro contrast. The central opening of the dome frames the clear blue sky and white clouds, with a flock of birds flying across the sky referencing @Video 2, creating an atmosphere of Chinese aesthetic serenity that connects heaven and earth.



Reference the camera movement from @Video 1, using @Image 1 as the first frame to create a tech park concept video. The tall building in the image serves as the visual center, using the same first-person diving perspective to convey the futuristic tech atmosphere of the park in @Image 1.



8-second intellectual chess-battle anime clip with a revenge theme. 0-3s: The heroine from @Image 1 turns and sits down, camera cuts, she places a chess piece and says 'You lose,' referencing the storyboard in @Image 2, her voice referencing the commanding female tone in @Audio 1, background referencing @Image 3. 3-4s: Quick pan to a close-up of the man's face from @Image 4, he says 'How is this possible!' referencing the storyboard in @Image 5, gritting his teeth in dissatisfaction. 4-6s: Cut to overhead shot, the woman places a chess piece, the onlookers gasp in amazement, referencing the storyboard in @Image 6. 6-8s: Camera rapidly tilts down, screen fades to black then gradually brightens; in a dimly lit room, the woman gazes out the window at the moonlight and quietly says 'We'll see about that,' referencing the storyboard in @Image 7.
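Several of the example prompts, including the one above, break the clip into labeled time segments such as "0-3s:" and "6-8s:". A quick local check that the segments are contiguous and add up to the intended duration can catch gaps before a generation is spent. The sketch below assumes that label format; the function names and the `start_seconds` parameter are hypothetical and not part of any Seedance tooling.

```python
import re

# Matches labels such as "0-3s:" or "13-15s:" as used in the example prompts.
SEGMENT_PATTERN = re.compile(r"(\d+)-(\d+)s:")


def parse_segments(prompt: str) -> list[tuple[int, int]]:
    """Return (start, end) pairs in seconds, in the order they appear."""
    return [(int(start), int(end)) for start, end in SEGMENT_PATTERN.findall(prompt)]


def timeline_problems(
    segments: list[tuple[int, int]],
    total_seconds: int,
    start_seconds: int = 0,
) -> list[str]:
    """Describe gaps, overlaps, or a mismatch with the intended duration."""
    problems = []
    expected_start = start_seconds
    for start, end in segments:
        if end <= start:
            problems.append(f"{start}-{end}s has no duration")
        if start != expected_start:
            problems.append(f"gap or overlap before {start}-{end}s")
        expected_start = end
    if segments and expected_start != total_seconds:
        problems.append(
            f"timeline ends at {expected_start}s, expected {total_seconds}s"
        )
    return problems
```

Prompts that extend an existing clip may legitimately begin after 0 seconds, which is why the starting offset is a parameter rather than a fixed value.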



Fresh cream-style short drama, upbeat guitar rhythm with quick cuts, cream white as the main color with peach pink highlights, soft visuals with no heavy effects, emotions conveyed through expressions. 0-2s: Quick-cut 2 shots: the CEO from @Image 1 accidentally bumps into the heroine from @Image 3 wearing the outfit from @Image 2 (both exchange startled glances) + the CEO pulls off his suit jacket and drapes it over the heroine (hand close-up), background guitar kicks in with gentle sound effects of a coffee cup dropping and fabric rustling. 2-6s: Quick-cut 3 shots: heroine wearing the CEO's jacket looks down with a shy smile (blushing cheek close-up) + CEO watches her walk away with a slight grin, saying 'Let's walk together' referencing @Audio 1 (profile shot) + the two share a black umbrella on a rainy night, fingertips touch and quickly pull back (close shot), rainy backdrop from @Image 4, each shot synced to a light drum beat with rain and umbrella sound effects, slight soft-focus haze. 6-8s: Slow-motion of both exchanging smiling glances, text from @Image 5 appears in the lower right, 'NEW EP DAILY' in small text lower left, faint pink petals drifting in background (minimal), BGM fades to a tender ending note, frame freezes on their side profiles together.



Starting with @Image 1 as the first frame, the view zooms out to the airplane window. Clouds drift gently into frame, one of them dotted with colorful candy sprinkles, staying centered in the shot. It then slowly morphs into the ice cream from @Image 2. The camera pulls back inside the cabin, where the character from @Image 3 sitting by the window reaches out to grab the ice cream through the window, takes a bite, gets cream all over their lips, and breaks into a sweet smile. The voiceover for this video is @Audio 1.



Replace the perfume in the gift box in @Video 1 with the face cream from @Image 1, keeping all motion and camera movement unchanged.



Replace the model in the eyewear e-commerce video @Video 1 with a Western model, referencing @Image 1, change the language to English, keeping the person's actions and camera movement unchanged.


8-second video, change the background of @Video 1 to be surrounded by orange-red pincushion flowers, yellow craspedia, white baby's breath, and green foliage, with half a fresh peach accenting the lower right corner. Soft warm lighting creates a languid, refined atmosphere with rich color layers, delicate details, and a sense of luxurious vintage charm.



Add the guitar-playing man from @Image 1 to the left rear position of @Video 1, and add guitar sounds to the background music.


Remove the passerby walking in front of the camera in @Video 1, keeping the video camera movement and everything else unchanged.



Replace the outfit worn by the model on the runway in @Video 1 with the outfit from @Image 1.



Remove the camera crew reflections visible in the mirror and beside the person in @Video 1, and replace the background audio in @Video 1 with @Audio 1.



Paint the exterior walls of the house in @Video 1 blue, with weather and lighting referencing the snowy conditions in @Image 1.


Change the shooting angle of @Video 1 to an overhead shot from behind the two actors, with the lyrics being 'seedance,' keeping the character features unchanged, and the background is an audience-filled arena.



Extend @Video 1, one-take continuous camera movement, no editing cuts throughout, maximizing the festive New Year atmosphere. Opening with the @Video 1 scene, naturally transitioning with a slow pull-back shot gliding smoothly through the kitchen door into the living room, where a couple is putting up 'Fu' (fortune) characters by the door. The camera seamlessly pans to the living room window decorated with paper-cut window flowers, then slowly pushes through the window to the outside, fluidly connecting to a scene of children setting off fireworks in the outdoor yard. Throughout, the camera movement is silky smooth and steady with no stuttering, incorporating red lanterns and other New Year elements to build a rich festive atmosphere. Background music references @Audio 1, background voiceover says: 'Happy New Year, family happiness, good fortune in the Year of the Horse.' The whole piece maintains the visual continuity and immersion of a single continuous take, maximizing both the festive spirit and atmosphere. Human proportions follow real-world physics.



Extend @Video 1 backward; Duration: 15 seconds; Style: healing and heartwarming, atmosphere-rich, slow motion + soft lighting, warm-tone filter. 3-4s: Wide shot pulling back, a bright art gallery with white exhibition walls covered in oil paintings, soft overhead lighting falling on canvases, visitors walking around and whispering in admiration. 5-7s: Medium tracking shot, the grown-up heroine (graceful temperament) in a simple long dress, gently touching the edge of a canvas, a soft smile in profile, gazing tenderly at her own artwork (close-up). 8-9s: Close shot, heroine receives flowers from a viewer, eyes curving into a grateful smile. 10-15s: Wide overhead shot, heroine standing in the center of the gallery, a wall full of paintings behind her, smiling viewers in front, light bokeh surrounding her, frame freezes. Sound: Background music is @Audio 1, opening transition has subtle shimmering light particle effects, no dialogue. Lighting/tone: Overall warm white + creamy yellow, gallery lighting is soft with no hard shadows, canvas colors moderately saturated, warm yellow soft glow for the scene, bright and airy for the present moment.



Extend @Video 1 backward, 11-second video, the car drives smoothly to a desert oasis, background music uses @Audio 1.



Extend @Video 1 forward. Core: 1980s Northeast China street life + family warmth, raw and authentic yet tender without being sentimental, building emotional resonance through everyday details and subtle family interactions. Visuals: Warm yellow nostalgic film texture, caramel brown / persimmon red / cream white as main colors, slight film grain + vintage VHS soft glow. Sound: Nostalgic Northeast accordion/suona light instrumental + ambient life sounds as BGM. Subtitles: Simple white Song typeface small text, no effects. Lighting: Northeast old house windowsill warm yellow natural light (70% brightness) + 100W tungsten filament fill light (40% brightness), natural soft shadows, softened highlights, shallow depth of field highlighting characters. Camera: Fixed focal 35-70mm, slow push (0.8x) + gentle handheld follow (1x) + beat-synced slow cuts (2-3s per shot), recreating the texture of life in old Northeast houses. Persistent subtitle: Lower-left tiny text 'The warmth of the Northeast hides in the everyday.' 0-1s [Everyday Opening]: Slow push macro close-up 35mm, beside the old wooden kang table a red Double Happiness enamel mug steams, rough hands pinch an iron spoon stirring rock sugar, alongside a large floral porcelain bowl and tin candy box, background wall pasted with New Year paintings, dried chili strings hanging from ceiling beams, warm light washing over everything. 2-3s [Warm Beat 1]: Gentle handheld follow 50mm, Mom in a blue-and-white checkered apron presses cornmeal cakes by the stove, turning to smile at the child running in wearing a floral padded jacket and ushanka hat, background showing an iron stove and frozen pears and persimmons on the windowsill, sunlight casting warm dappled shadows. 4-5s [Warm Beat 2]: Close-up 50mm, the whole family gathered around the kang table, grandpa sips from an enamel liquor cup and places sauerkraut-and-pork on the child's plate, the child offers a bitten cornmeal cake to grandma, the table laid with stewed sauerkraut noodles and corn porridge, folded floral quilts on the kang. Background music uses @Audio 1. 6-8s: finally connects to @Video 1.



Extend @Video 1 forward, 8-second ultra-HD Year of the Horse red envelope product showcase video. Clean and premium composition, cinematic camera movement, warm gold-red Chinese style color palette, soft diffused lighting, no cluttered backgrounds. 0-2s: Overhead slow push, 3 gold-embossed Year of the Horse red envelopes neatly arranged on a light cream textured paper background, the gold horse motif on the envelopes is finely detailed with natural sheen, gilded patterns reflecting light naturally. 2-5s: Hand gently lifts a red envelope, paper texture is crisp and firm, edges perfectly flat, camera follows with subtle motion, light and shadow shift with movement, accompanied by a delicate metallic chime sound effect. 5-7s: Close-up of the gold embossing details, the horse's lines are fluid, surrounded by auspicious cloud patterns in Chinese style, camera gently racks focus. Finally connects to @Video 1, background music is @Audio 1. No people, no extraneous elements, atmosphere maximized, suitable for e-commerce product showcase.


Extend @Video 1 forward, 12-second sci-fi short film, pure dark cyberpunk sci-fi style, raw industrial texture + cyberpunk neon color clashing, cold blue-gray base + crimson/electric purple highlights, dynamic camera with high-speed push-pull + beat-synced quick cuts + macro close-ups, heavy metal electronic score + mechanical rumbling/energy burst practical sound effects, no subtitles, relying on visual intensity alone for sci-fi impact. 0-3s: High-speed slow push wide angle, desolate interstellar wasteland, a massive damaged mech battle armor wreckage juts from scorched earth at an angle, rust-covered metal surfaces reflecting purple-blue nebula light, ground fissures seeping crimson magma glow, a half-destroyed ring space station floating in the distance. 3-7s: Beat-synced quick-cut 4 shots: macro close-up of a mechanical eye's iris contracting; high-speed tracking of a humanoid cyborg leaping from the mech wreckage; overhead wide angle of the wasteland ground splitting open as giant mechanical tentacles burst from the earth; extreme close-up of the cyborg's eyes igniting combat-red light, mechanical arm deploying an energy blade. 7-12s: High-speed push-pull camera, the cyborg leaps with energy blade and violently collides with the mechanical tentacles, energy burst explosion in blue-cyan + crimson flare, finally connecting to @Video 1.




The arched window in @Video 1 opens, entering the interior of an art gallery, connecting to @Video 2, then the camera moves into the painting, connecting to @Video 3.




Core requirement: 15-second cinematic short video, 3 core scenes with seamless narrative transitions, following one adorable orange kitten running throughout as the visual thread, from forest wonderland to festive old town to cozy New Year home, finally leaping into a child's arms. @Video 1 (0-5s), transition: The kitten runs to the edge of the forest onto a stone path, the instant its front paws touch the stone, the camera pushes in close on the paws, the lighting shifts from cool cyan to warm tones, the stone path stretches into the distance, naturally transitioning to @Video 2 (5-10s). Transition: The kitten runs to the entrance of an old town alley beside a vermilion wooden door, nuzzling a bronze door knocker with its head, camera close-up on the moment the knocker meets the kitten's head, a light ring sounds, the camera follows the kitten's motion pushing the door open and cuts inward, the frame instantly flooded with warm interior light. Connect to @Video 3 (10-13s). Ending highlight (13-15s): The kitten runs to the child at the center of the frame, camera slow-motion push-in, the kitten leaps, the child smiles and raises a hand, the frame freezes at the instant their fingertips touch.




Duration: 15s; Background music: Soft Chinese-style instrumental, guzheng + pipa gently playing, rhythm gradually building with the visuals, no narration. @Video 1 lantern light and shadows gradually dissolve, @Video 2 paper-cut imagery fades in with a gradual transition, the mural and paper-cut horse silhouettes perfectly overlap, connecting to @Video 3. Detail requirements: Transitions: all element switches use gradual dissolve transitions, no hard cuts, scene connections flow smooth as water with no stuttering. Motion: all animation is slow-paced and gentle, no rapid visuals, matching a premium atmospheric feel.
There are several ways to try Seedance 2.0. Here are the main platforms where you can access the model:
Volcano Engine: ByteDance's official cloud platform. Access Seedance 2.0 via API for developers and enterprise users. Requires an account and API key.
Vibbit: Use Seedance 2.0 directly in your browser with an intuitive interface. No API setup is needed; just write your prompt and generate. The easiest way to get started.
ByteDance consumer apps: ByteDance's consumer-facing creative platforms also integrate Seedance capabilities for video creation and editing.
Follow these 5 steps to create your first AI video with Seedance 2.0:
Seedance 2.0 supports multiple creation modes. Select the one that matches your goal:
The prompt is the most important input. A good Seedance prompt should include: subject description, action/movement, camera movement, scene/environment, and visual style. Be specific and descriptive for best results.
Example: "A young woman in a flowing red dress walks along a moonlit beach, gentle waves lapping at her feet. Camera slowly dollies forward. Cinematic lighting, film grain, 4K quality."
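The five components listed above can be kept as separate fields and joined into a prompt at the last step, which makes it harder to forget one of them. The sketch below is only an illustration of that habit: the `PromptParts` class and its field names are hypothetical, and the sentence layout is one reasonable ordering rather than a format Seedance requires.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptParts:
    subject: str  # who or what is on screen
    action: str   # what the subject does
    scene: str    # where it happens
    camera: str   # how the camera moves
    style: str    # overall look of the footage

    def render(self) -> str:
        """Join the parts into a three-sentence prompt, rejecting empty parts."""
        missing = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if missing:
            raise ValueError(f"missing prompt parts: {', '.join(missing)}")
        opening = f"{self.subject.strip()} {self.action.strip()} {self.scene.strip()}"
        sentences = [opening, self.camera.strip(), self.style.strip()]
        # Normalize trailing periods so each sentence ends with exactly one.
        return " ".join(s.rstrip(".") + "." for s in sentences)
```

Filling the fields with "A young woman in a flowing red dress", "walks", "along a moonlit beach", "Camera slowly dollies forward", and "Cinematic lighting, film grain" yields a prompt close to the example above.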
See our full prompt writing guide with 700+ examples.
For multi-modal generation, you can upload reference materials to guide the output:
Configure the generation settings based on your needs:
Click generate and wait for Seedance 2.0 to process your request. Generation typically takes 1-3 minutes depending on complexity. Once complete, preview the result and download in MP4 format. If the result isn't perfect, try adjusting your prompt or reference materials and regenerate.
One of the most common questions: Is Seedance 2.0 free? Here's the breakdown:
Try Seedance 2.0 for free on Vibbit with a generous free tier. No credit card required. Perfect for testing and personal projects.
For professional and high-volume usage, paid plans offer faster generation, priority queue, higher resolution, and API access.
For developers and enterprise users, access Seedance 2.0 via Volcano Engine API or upcoming Vibbit API integration.
View API documentation & code examples.
Yes, you can try Seedance 2.0 for free on platforms like Vibbit, which offers free trial credits. The official API through Volcano Engine also has a free tier for developers.
Typically 1-3 minutes per video clip, depending on resolution, duration, and server load. Higher resolution and longer videos take more time.
Yes, image-to-video is one of Seedance 2.0's core features. Upload a reference image and provide a text prompt describing the desired motion and scene.
Yes, most platforms require a free account. On Vibbit, you can sign up in seconds with Google or email and start generating immediately.
Seedance 2.0 outputs 480P or 720P video at 24 fps. Files are delivered in MP4 format with native synchronized audio.
Yes, Seedance 2.0 has native audio-video synchronization. It can generate lip-synced dialogue, sound effects, and background music that match the video content.
Seedance 2.0 is not currently open source. It is available through ByteDance's Volcano Engine API, ByteDance's own apps such as CapCut, and third-party platforms like Vibbit.
ByteDance has integrated Seedance capabilities into its consumer apps including CapCut and Jimeng (即梦). The exact feature availability may vary by region and platform version.
Start using Seedance 2.0 today — no setup required