Creative Tech·Jan 22, 2026·13 min read·York Sims

The Animation Engine: Turning Product Images into Scroll-Driven Websites

engine.py: an automated pipeline using fal.ai Nano Banana 2 + Kling 3.0 + ffmpeg. From a product photo to a full Apple-style launch page — with code.

Every year Apple ships a launch page for a new product, and every year it looks like a $2 million project. It is not. It is a scroll-driven animation: each frame is a pre-rendered image, and scroll position decides which frame is shown. The technique is 10 years old. The expensive part has always been the frames: you had to shoot the product, render thousands of variations, and stitch them into a sequence. I built an automated pipeline that does the whole thing from a single product photo. I call it the Animation Engine. Here is how it works.

The Stack

  • fal.ai Nano Banana 2 — fast image-to-image model for generating variations from a single source image
  • Kling 3.0 — state-of-the-art image-to-video model that generates smooth motion between keyframes
  • ffmpeg — the Swiss army knife for slicing, reframing, and extracting frames
  • Python — the glue (engine.py, about 600 lines)
  • Scroll animation harness — a small React + Framer Motion component on the frontend

The Pipeline

The engine takes a single product photo and a creative brief. It outputs a folder of 240 numbered frames at 2560×1440, ready to drop into a scroll-driven component. The pipeline has seven stages.

  1. Source prep. Strip the background from the input photo, upscale to 2K, normalize the lighting.
  2. Keyframe generation. Generate 8 keyframes using Nano Banana 2. Each keyframe shows the product from a different angle or in a different state (rotated, exploded, components highlighted).
  3. Motion interpolation. Feed each consecutive pair of keyframes into Kling 3.0 to generate a 30-frame transition video between them.
  4. Video stitching. Concatenate the 7 transition videos into one long video with ffmpeg.
  5. Frame extraction. Extract every frame of the final video as a PNG. ffmpeg again.
  6. Frame cleanup. Run every frame through a consistency pass that fixes color drift (Kling sometimes shifts color over long sequences).
  7. Packaging. Generate a manifest JSON that maps scroll position to frame number.
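
Stages 3 through 5 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual engine.py: the function names, file names, and directory layout are hypothetical, and the Kling call itself is left out. The two ffmpeg invocations use the concat demuxer and the numbered image-sequence output pattern.

```python
from pathlib import Path


def keyframe_pairs(keyframes):
    """Pair each keyframe with the next one: 8 keyframes give 7 transitions."""
    return list(zip(keyframes, keyframes[1:]))


def concat_list_text(clips):
    """Body of the list file that ffmpeg's concat demuxer reads.

    Assumes the clip paths contain no single quotes.
    """
    return "".join(f"file '{Path(c).as_posix()}'\n" for c in clips)


def stitch_cmd(list_file, output):
    """Stage 4: concatenate the transition clips into one video."""
    # -c copy avoids re-encoding; it assumes every clip shares the same
    # codec, resolution, and frame rate.
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", str(list_file), "-c", "copy", str(output)]


def extract_cmd(video, frames_dir):
    """Stage 5: dump every frame as frame_0001.png, frame_0002.png, ..."""
    pattern = (Path(frames_dir) / "frame_%04d.png").as_posix()
    return ["ffmpeg", "-y", "-i", str(video), pattern]
```

Each command list can be passed to subprocess.run(cmd, check=True) after writing concat_list_text(...) to the list file. Returning the commands as lists instead of running them directly keeps the stages easy to log and test.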

Why This Works

The reason traditional scroll-driven animations are expensive is that you need physically accurate renders of a real product. You need a 3D model of the product, a renderer, and an artist to pose it. The Animation Engine replaces all of that with image-to-video AI.

The tradeoff is that the result is not physically accurate. It is perceptually accurate. The viewer sees a product rotating smoothly with believable lighting, and their brain fills in the rest. For 90% of product launches this is good enough.

The 10% where this does not work: precision hardware (watches, cameras, medical devices) where every physical detail matters. For that you still need traditional rendering.

The Color Drift Problem

Kling is amazing but it has a bug: over a long sequence, colors drift. The product starts red and by frame 200 it is rust-orange. For a scroll animation this is unacceptable.

The fix is a two-step process. First, I sample the product color from the first frame and store it. Then every frame goes through a color-correction pass that clamps the dominant hue within a tolerance. This is not fancy image processing — it is a 20-line NumPy function. It works because the product is the dominant object in each frame and the background is already transparent.
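
A hue clamp along these lines might look like the sketch below. It is not the function from engine.py: the names (dominant_hue, clamp_hue), the saturation-weighted circular mean, and the default tolerance are all assumptions made for illustration. It expects uint8 RGBA frames and ignores fully transparent pixels.

```python
import numpy as np


def rgb_to_hsv(rgb):
    """Vectorized RGB -> HSV; input and output are floats in [0, 1]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mx, mn = rgb.max(axis=-1), rgb.min(axis=-1)
    delta = mx - mn
    d = np.where(delta == 0, 1.0, delta)  # avoid dividing by zero on grays
    h = np.where(mx == r, ((g - b) / d) % 6.0,
                 np.where(mx == g, (b - r) / d + 2.0, (r - g) / d + 4.0))
    h = np.where(delta == 0, 0.0, h / 6.0)
    s = np.where(mx == 0, 0.0, delta / np.where(mx == 0, 1.0, mx))
    return np.stack([h, s, mx], axis=-1)


def hsv_to_rgb(hsv):
    """Vectorized HSV -> RGB; input and output are floats in [0, 1]."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]

    def channel(n):
        k = (n + h * 6.0) % 6.0
        return v - v * s * np.clip(np.minimum(k, 4.0 - k), 0.0, 1.0)

    return np.stack([channel(5), channel(3), channel(1)], axis=-1)


def dominant_hue(rgba):
    """Saturation-weighted circular mean hue (0..1) of non-transparent pixels."""
    hsv = rgb_to_hsv(rgba[..., :3] / 255.0)
    weight = hsv[..., 1] * (rgba[..., 3] > 0)
    angle = hsv[..., 0] * 2.0 * np.pi
    mean = np.arctan2((weight * np.sin(angle)).sum(),
                      (weight * np.cos(angle)).sum())
    return (mean / (2.0 * np.pi)) % 1.0


def clamp_hue(rgba, ref_hue, tol=0.02):
    """Shift all hues so the dominant hue lies within tol of ref_hue."""
    # Signed shortest distance around the hue circle, in [-0.5, 0.5).
    diff = (dominant_hue(rgba) - ref_hue + 0.5) % 1.0 - 0.5
    if abs(diff) <= tol:
        return rgba  # already inside the tolerance band
    shift = np.clip(diff, -tol, tol) - diff
    hsv = rgb_to_hsv(rgba[..., :3] / 255.0)
    hsv[..., 0] = (hsv[..., 0] + shift) % 1.0
    out = rgba.copy()
    out[..., :3] = np.round(hsv_to_rgb(hsv) * 255.0).astype(np.uint8)
    return out
```

In use, ref_hue = dominant_hue(first_frame) is computed once and clamp_hue(frame, ref_hue) is applied to every later frame. The mean is taken on the hue circle rather than as a plain average because reds sit on both sides of the 0/1 wrap-around.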

The Frontend Component

The frontend is surprisingly simple. A scroll listener, a frame index, an image element. On scroll, calculate the scroll progress, map it to a frame number, update the image source. The browser caches the images after the first scroll-through.
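
The mapping from scroll progress to frame number is simple enough to show in a few lines. The sketch below is written in Python to match engine.py; the real component would do the same arithmetic in the browser. The manifest layout (frameCount, frames) and the file-name pattern are hypothetical, since the post does not show the actual manifest format.

```python
import json


def frame_for_progress(progress, total_frames):
    """Map scroll progress in [0, 1] to a 1-based frame number."""
    p = min(max(progress, 0.0), 1.0)  # clamp overscroll
    # int() floors here; the min() keeps progress == 1.0 on the last frame.
    return min(int(p * total_frames), total_frames - 1) + 1


def build_manifest(total_frames, pattern="frame_{:04d}.png"):
    """Hypothetical manifest: a frame count plus ordered file names."""
    return {
        "frameCount": total_frames,
        "frames": [pattern.format(i) for i in range(1, total_frames + 1)],
    }


def manifest_json(total_frames):
    """Serialize the manifest for the frontend to fetch."""
    return json.dumps(build_manifest(total_frames), indent=2)
```

With 240 frames, progress 0.0 maps to frame 1, 0.5 to frame 121, and 1.0 to frame 240.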

Key detail: preload the first 60 frames on page load so the initial scroll feels snappy. Lazy-load the rest as the user scrolls past them. A 240-frame sequence at 150KB per frame is 36MB total. Way too much to ship on page load. Progressive preload makes it feel instant.
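
The payload arithmetic and the eager/lazy split can be written down as a small planning helper. This is a sketch with made-up names (preload_plan, lookahead) and an assumed look-ahead window; the 240 frames, 60 eager frames, and 150KB per frame come from the figures above.

```python
def preload_plan(total_frames=240, eager_frames=60, frame_kb=150):
    """Split frames into an eager batch (page load) and a lazy remainder."""
    eager = list(range(1, min(eager_frames, total_frames) + 1))
    lazy = list(range(len(eager) + 1, total_frames + 1))
    return {
        "eager": eager,
        "lazy": lazy,
        "eager_mb": len(eager) * frame_kb / 1000,
        "total_mb": total_frames * frame_kb / 1000,
    }


def lookahead(current_frame, total_frames, window=30):
    """Frames to request next, a fixed window ahead of the current frame."""
    end = min(current_frame + window, total_frames)
    return list(range(current_frame + 1, end + 1))
```

Under these numbers the eager batch is 9MB of the 36MB total, so page load ships a quarter of the sequence and the rest arrives while the user scrolls.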

The Math on Cost

For a 240-frame sequence:

  • 8 keyframes from Nano Banana 2: $0.40
  • 7 transition videos from Kling 3.0: $14.00
  • Frame extraction and cleanup: free (local)
  • Total per animation: about $14.50

The same animation done traditionally with a 3D artist would cost $3,000 to $8,000 and take a week. The engine ships it in 90 minutes for $14.50. The economics are not close.
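
For anyone who wants to rerun the numbers with their own keyframe count, here is the arithmetic as a short script. The per-unit prices are simply the line totals above divided by the counts; they are derived for illustration and are not quoted from any provider's price list.

```python
KEYFRAMES, KEYFRAME_TOTAL = 8, 0.40        # Nano Banana 2 line item
TRANSITIONS, TRANSITION_TOTAL = 7, 14.00   # Kling 3.0 line item
TRADITIONAL_LOW, TRADITIONAL_HIGH = 3000, 8000

# Derived per-unit costs (assumption: cost scales linearly with count).
per_keyframe = KEYFRAME_TOTAL / KEYFRAMES
per_transition = TRANSITION_TOTAL / TRANSITIONS


def animation_cost(keyframes):
    """Cost of one animation: n keyframes and n - 1 transitions."""
    return keyframes * per_keyframe + (keyframes - 1) * per_transition


total = animation_cost(KEYFRAMES)
ratio_low = TRADITIONAL_LOW / total
ratio_high = TRADITIONAL_HIGH / total

print(f"per keyframe:   ${per_keyframe:.2f}")
print(f"per transition: ${per_transition:.2f}")
print(f"total:          ${total:.2f}")
print(f"traditional is roughly {ratio_low:.0f}x to {ratio_high:.0f}x more")
```

The exact sum of the two line items is $14.40, which the post rounds to about $14.50.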

Where It Breaks

The engine breaks on four kinds of products.

  1. Text-heavy products. Kling will scramble small text over a long sequence. Not fixable. I exclude any product where text is part of the visual story.
  2. Translucent or refractive materials. Glass, ice, water. The AI models do not understand refraction well and the frames end up dreamy in a bad way.
  3. Human faces. Nano Banana is good with faces but the video model introduces subtle identity drift over 200 frames. People in the product become uncanny. I exclude faces.
  4. Very simple products with symmetric geometry. Spheres, simple cylinders. The AI has trouble distinguishing keyframes and the transitions collapse into visual mush.

What I Would Build Next

Two things are on the roadmap.

First: a web-based version of the engine that non-engineers can use. Upload a product photo, type a brief, wait 90 minutes, get a scroll animation. I am building this on Next.js with a queue worker and a progress UI.

Second: an interpolation layer that uses optical flow between AI keyframes instead of Kling for the transitions. Kling is expensive and sometimes inconsistent. Classical optical flow is free and deterministic. For small motions it should work beautifully.

Receipts

I have shipped 7 product launch pages with the Animation Engine. Total cost across all of them: under $120. Total traditional cost I avoided: probably $30,000. One of the pages has been featured on design showcase sites. None of the viewers have clocked that the animation is AI-generated.

The full engine.py, the React scroll component, the color correction code, and a sample animation output are in the Pro vault. If you are building a product launch page in the next six months, this pipeline will save you more than 100 times the cost of a Pro membership.
