AI-Powered Video Creation Platform
🚀 Sign up to claim 100 free credits
Use images, videos, and audio as multi-modal references to generate new videos with precise control over style, content and composition

Use the first-person POV composition from @Video 1 throughout, with @Audio 1 as background music throughout. First-person perspective fruit tea promotional ad for Seedance brand "Peace & Apple" limited edition apple fruit tea; opening frame is @Image 1, your hand picks a morning dew-covered Aksu red apple, crisp apple collision sound; 2-4 seconds: quick cuts, your hand drops apple chunks into a shaker cup, adds ice and tea base, shakes vigorously, ice clinking and shaking sounds sync with upbeat drum beats, background voiceover: "Fresh-cut and hand-shaken"; 4-6 seconds: first-person close-up of the finished product, layered fruit tea poured into a transparent cup, your hand gently squeezes cream cap to spread across the top, a pink-red label is applied to the cup body, camera zooms in to show the layered texture of the cream cap and fruit tea; 6-8 seconds: first-person handheld raising the cup, you lift the fruit tea from @Image 2 toward the camera (simulating the perspective of handing it to the viewer), the cup label is clearly visible, background voiceover: "Take a fresh sip", ending frame freezes on @Image 2. All background voices should be in a female tone.

Reference the camera movement in @Video 1, use @Image 1 as the opening frame to create a concept video for a technology park, with the tall building in the image as the visual center, using the same first-person diving perspective to showcase the tech atmosphere of the park shown in @Image 1.

First-person kitchen baking vlog, slight wide-angle, immersive handheld feel, no face visible throughout, only hands and arms shown, perspective referencing @Image 1. Shot 1: overhead angle, both hands open a recipe book on the kitchen island, book page content is @Image 2, morning light streaming in, slight camera shake for documentary feel. Shot 2: camera pointing straight down, both hands kneading, pressing, and folding dough on a flour-dusted surface, flour scattering, close-up texture details. Shot 3: both hands hold up the prepared raw apple pie toward the camera for display, apple pie referencing @Image 3. Shot 4: black screen with subtitle countdown, "200°C 30mins". Shot 5: return to first-person perspective, both hands carry the freshly baked steaming apple pie, golden crispy crust, warm light atmosphere.

Starting with @Image 1 as the opening frame, the camera zooms out through an airplane window, clouds drift slowly into the frame one after another, one of which is a cloud dotted with colorful candy beads, always centered in the frame, then slowly morphs into the ice cream from @Image 2, the camera pulls back into the cabin interior, the character from @Image 3 sitting by the window reaches out and takes the ice cream from outside the window, takes a bite, mouth covered in cream, face beaming with a sweet smile, at this point the video dubbing is @Audio 1.

Reference the character actions and camera language from @Video 1, generate a fight scene between the characters in @Image 1 and @Image 2, with the fight background being @Image 3, the fighting process mimics the Contra pixel game style, background music is from @Audio 1, accompanied by fighting sound effects synchronized with the combat actions.
Reference the rotating camera movement in @Video 1, generate a skywell viewed from inside a Chinese ancient building looking upward. A deep octagonal wooden dome structure filled with intricate wood carving details, layered beams and delicate hanging flower carvings showcase an ancient, heavy, and textured quality in the interplay of light and shadow. The opening at the center of the dome frames a clear blue sky with white clouds, a flock of birds flies across the sky, referencing @Video 2, creating a Chinese aesthetic atmosphere that connects heaven and earth with a serene, distant tranquility.
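The prompts above point at uploaded assets with tags such as @Image 1, @Video 1, and @Audio 1. How the platform resolves these tags is not described here, so the following is only a minimal sketch for checking your own prompt before submitting it: it collects the tags a prompt uses and reports any that have no matching upload. The function names and the `uploaded` mapping (media type to number of uploaded assets) are hypothetical and not part of any platform API.

```python
import re
from collections import defaultdict

# Matches tags written the way the example prompts write them: "@Image 1", "@Video 2", "@Audio 1".
REFERENCE_TAG = re.compile(r"@(Image|Video|Audio) (\d+)")


def extract_references(prompt: str) -> dict[str, list[int]]:
    """Return the sorted, de-duplicated indices referenced for each media type."""
    found: dict[str, set[int]] = defaultdict(set)
    for kind, index in REFERENCE_TAG.findall(prompt):
        found[kind].add(int(index))
    return {kind: sorted(indices) for kind, indices in found.items()}


def missing_references(prompt: str, uploaded: dict[str, int]) -> list[str]:
    """List tags whose index has no matching upload.

    `uploaded` is an assumed input shape: media type -> number of assets supplied.
    """
    missing = []
    for kind, indices in extract_references(prompt).items():
        for index in indices:
            if index < 1 or index > uploaded.get(kind, 0):
                missing.append(f"@{kind} {index}")
    return sorted(missing)


prompt = (
    "Reference the camera movement in @Video 1, use @Image 1 as the opening frame "
    "to create a concept video for a technology park."
)
print(extract_references(prompt))
print(missing_references(prompt, {"Image": 1}))
```

In this usage example only one image is supplied, so the check flags @Video 1 as a tag with no corresponding upload.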
Extend videos forward or backward in time, connect multiple clips, and fill in missing sequences with coherent storytelling
Continue the story from @Video 1, showing what happens next. If there are spoken lines, the characters should speak Japanese with corresponding subtitles displayed at the bottom.
Duration: 15s; Background music: gentle Chinese traditional instrumental, guzheng + pipa playing softly, rhythm gradually rising with the visuals, no narration; @Video 1 lantern light and shadows gradually dissolve, @Video 2 paper-cut imagery fades in, the silhouette of the mural and paper-cut horse outline perfectly overlap, then cut to @Video 3. Detail requirements: Transitions: all element changes use dissolve/fusion transitions, no hard cuts, visuals flow seamlessly like water with no stuttering; Motion dynamics: all movements are slow-paced and gentle, no fast footage, matching a premium atmospheric feel.
Using @Video 1 as the opening and @Video 2 as the ending, generate the full process of this child drawing a dinosaur; multiple shots/storyboards are acceptable.
Extend @Video 1 forward, 12-second sci-fi short film in authentic "Love, Death & Robots" dark cyberpunk sci-fi style, gritty industrial texture + cyberpunk neon color clash, cool-toned blue-gray base + scarlet/electric purple highlights, dynamic camera with high-speed push-pull + beat-synced quick cuts + macro close-ups, heavy metal electronic music + mechanical roaring/energy blast original sound effects, no subtitles, relying purely on visual tension to create sci-fi impact, 3D modeling with sharp angular edges + fine textures, strong light-dark contrast, blending wasteland sci-fi, mechanical punk, and Lovecraftian sense of the unknown, suitable for sci-fi animation short trailers / sci-fi content promotion.
Extend @Video 1, 10-second one-take continuous shot, no editing cuts throughout, max out the festive New Year atmosphere; opening with @Video 1 footage, naturally transitioning as a slow pull-back camera smoothly passes through the kitchen door, seamlessly moving into the living room where a couple is putting up Fu character decorations at the doorway, the camera seamlessly pans to the living room window where window paper cuttings are being applied, then a slow push camera moves outward through the window, smoothly connecting to children setting off fireworks on the outdoor open ground; the entire camera movement is silky smooth and continuous, with even speed and no stuttering, the footage incorporates red lanterns and other New Year elements to enhance the strong festive atmosphere; background music references @Audio 1, background voiceover: "Happy New Year, happiness to the whole family, auspicious Year of the Horse", ensuring the overall visual continuity and immersion of the one-take shot, maximizing both the New Year flavor and atmospheric feel, character proportions should follow real-world physics.
Extend @Video 1 backward; Duration: 15 seconds; Style: heartwarming and tender, max atmosphere, slow motion + soft lighting, warm-toned filter; 3-4 seconds: wide pull-back, bright art gallery, white exhibition walls covered with oil paintings, soft overhead light spilling onto the canvases, visitors in the gallery quietly admiring and murmuring appreciation. 5-7 seconds: medium shot tracking, the grown-up female lead (gentle temperament) wearing a simple long dress, reaches up to gently touch the edge of a canvas, slight smile on her profile, gazing tenderly at her own work (close-up). 8-9 seconds: close-up, the female lead receives flowers from an audience member, eyes curving into a smile as she thanks them. 10-15 seconds: overhead wide shot, the female lead stands in the center of the gallery, walls of paintings behind her, smiling audience in front, light spots surrounding the scene, frame freezes. Sound: background music is @Audio 1, with subtle light sparkles sound effect during the opening transition, no dialogue; Lighting/Color tone: overall warm white + cream yellow tones, soft gallery lighting with no harsh shadows, moderate color saturation on the canvases, memory shots in warm yellow soft light, present-day shots bright and airy.
Edit existing videos by replacing elements, modifying backgrounds, adjusting styles, and applying targeted transformations

Repaint the exterior walls of the house in @Video 1 blue, with weather and lighting referencing the snowy scene in @Image 1.
8-second video, change the background of @Video 1 to be surrounded by orange-red pincushion flowers, yellow golden ball flowers, white baby's breath, and green foliage, with half a fresh peach in the lower right corner, soft warm lighting creating a languid yet refined atmosphere, rich color layers, fine details, full of premium quality and vintage charm.

Replace all dog food packaging bags appearing in @Video 1 uniformly with the new packaging design shown in @Image 1. Automatically detect and track every dog food bag in each frame of the video (including foreground, midground, background, different angles, motion states, and occlusion states), and perform a complete replacement. The new packaging must strictly match the original position, size, perspective angle, lighting direction, motion trajectory, and occlusion relationships, achieving natural fit and realistic blending. Only replace the packaging appearance (pattern/color/brand/shape), keeping all original actions, people, environment, camera movement, rhythm, and all other elements completely unchanged. Do not change the background, composition, lighting, depth of field, camera movement, or add/remove any objects. The overall effect must be realistic and natural, with no visible flaws, no drift, no flickering, and no style changes.
@Video 1 is an anime clip of a Japanese fireworks festival. Please restore the video and correct its color.
Reference the camera in @Video 1, replicate the identical exterior design, size proportions, colors, and materials to generate the exact same but brand-new camera. The body is clean and smooth, without any scratches, wear marks, or signs of use. Apart from making the body appear new, do not change the lens position, composition, lighting, background, or shooting angle.

Replace the cat in @Video 1 with the lion from @Image 1, lying on its side on the girl's lap, gently interacting with the girl.