Talking-Head Footage
→ Viral 9:16
Automatically.
A Python CLI that converts raw talking-head video into vertical 9:16 clips with face tracking, punch zoom, word-level captions, and loudnorm audio, all from a single command.
One Command.
Six Stages.
Every step is deterministic, testable in isolation, and configurable via CLI flags.
Built for Reliability.
Automatically centres the 9:16 crop on the detected face. Falls back to centre crop when no face is present.
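As a rough illustration of the crop logic described above, the sketch below computes a 9:16 window from a frame size and an optional face centre. The function name `crop_window` and its parameters are hypothetical, not the tool's actual API, and it assumes a landscape source whose full height is kept.

```python
def crop_window(frame_w, frame_h, face_cx=None):
    """Return (x, y, w, h) of a 9:16 crop for a landscape frame.

    face_cx is the horizontal centre of the detected face in pixels,
    or None when no face was found (assumed convention for this sketch).
    """
    crop_h = frame_h
    # 9:16 width for the full frame height, never wider than the frame.
    crop_w = min(int(round(frame_h * 9 / 16)), frame_w)
    # Keep the width even; many video encoders require even dimensions.
    crop_w -= crop_w % 2

    # Fallback: centre crop when no face is present.
    if face_cx is None:
        face_cx = frame_w / 2

    # Centre the window on the face, then clamp it inside the frame.
    x = int(round(face_cx - crop_w / 2))
    x = max(0, min(x, frame_w - crop_w))
    return x, 0, crop_w, crop_h
```

For a 1920x1080 frame this gives a 608-pixel-wide window; a face near either edge pins the window to that edge instead of letting it run outside the frame.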
Adds slow-zoom punch-in effects at strategic moments to increase retention without looking artificial.
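One way a slow punch-in could be expressed is as a per-frame scale factor that eases from 1.0 up to a maximum zoom. This is only a sketch: the name `punch_zoom_scale`, the smoothstep easing, and the 1.15 default are assumptions, and choosing the moments at which a zoom starts is left to the caller.

```python
def punch_zoom_scale(t, start, duration, max_zoom=1.15):
    """Scale factor at time t (seconds) for a zoom beginning at `start`.

    Returns 1.0 before the zoom, eases up over `duration` seconds,
    and holds at max_zoom afterwards.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    # Progress through the zoom, clamped to [0, 1].
    p = max(0.0, min(1.0, (t - start) / duration))
    # Smoothstep easing: gentle start and finish, so the move is not abrupt.
    eased = p * p * (3.0 - 2.0 * p)
    return 1.0 + (max_zoom - 1.0) * eased
```

The returned factor would then be applied to the crop window, for example by dividing the window's width and height by it before scaling back to the output size.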
Word-level timestamps from Whisper drive 3–5 word caption groups with an active-word highlight that tracks speech.
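The grouping and highlight behaviour can be sketched as follows. It assumes each word is a dict with `word`, `start`, and `end` keys, which is the shape Whisper returns when word timestamps are enabled; the helper names `group_words` and `active_word` and the rebalancing rule are illustrative, not the tool's own code.

```python
def group_words(words, size=4, min_size=3, max_size=5):
    """Chunk timed words into caption groups of roughly 3 to 5 words."""
    groups = [words[i:i + size] for i in range(0, len(words), size)]
    # Avoid a short trailing group by merging it or borrowing words.
    if len(groups) >= 2 and len(groups[-1]) < min_size:
        tail = groups.pop()
        if len(groups[-1]) + len(tail) <= max_size:
            groups[-1] = groups[-1] + tail
        else:
            need = min_size - len(tail)
            tail = groups[-1][-need:] + tail
            groups[-1] = groups[-1][:-need]
            groups.append(tail)
    return [
        {"start": g[0]["start"], "end": g[-1]["end"], "words": g}
        for g in groups
    ]


def active_word(group, t):
    """Index of the word being spoken at time t, or None between words."""
    for i, w in enumerate(group["words"]):
        if w["start"] <= t < w["end"]:
            return i
    return None
```

A renderer would call `active_word` once per frame and style that word differently from the rest of the group. Clips with fewer than three words still produce a single, shorter group.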
WebRTC VAD detects silent sections, which are then cut. The --silence-gap flag sets the minimum length of silence that gets cut.
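To show how a minimum-gap threshold might work, the sketch below turns per-frame speech flags (the kind of boolean a VAD reports for each short audio frame) into the time ranges to keep. The function name `keep_segments`, the 30 ms frame length, and the 500 ms default are assumptions for illustration; the source does not state the flag's unit or default.

```python
def keep_segments(speech_flags, frame_ms=30, silence_gap_ms=500):
    """Return [(start_ms, end_ms), ...] ranges of audio to keep.

    Silences shorter than silence_gap_ms are bridged (kept);
    longer ones are cut. Leading and trailing silence is dropped.
    """
    # Collect contiguous runs of speech frames.
    runs = []
    start = None
    for i, is_speech in enumerate(speech_flags):
        if is_speech and start is None:
            start = i
        elif not is_speech and start is not None:
            runs.append((start * frame_ms, i * frame_ms))
            start = None
    if start is not None:
        runs.append((start * frame_ms, len(speech_flags) * frame_ms))

    # Merge runs separated by a silence shorter than the threshold.
    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] < silence_gap_ms:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged
```

Raising the threshold keeps natural pauses intact, while lowering it produces tighter, faster-paced cuts.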
EBU R128 loudness normalisation is applied per segment and again on the final output, so audio levels stay consistent across all clips.
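A minimal sketch of how a normalisation pass could be invoked is shown below, using FFmpeg's `loudnorm` filter, which implements EBU R128. The helper name `loudnorm_command` and the target values (-14 LUFS integrated, -1.5 dBTP, LRA 11) are assumptions; the source does not state which targets the tool uses.

```python
def loudnorm_command(src, dst, integrated=-14.0, true_peak=-1.5, lra=11.0):
    """Build an ffmpeg argument list for a single-pass loudnorm run.

    The target values are illustrative defaults, not the tool's settings.
    """
    audio_filter = f"loudnorm=I={integrated}:TP={true_peak}:LRA={lra}"
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-af", audio_filter,  # EBU R128 loudness normalisation
        "-c:v", "copy",       # leave the video stream untouched
        dst,
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)`. A two-pass run that measures the audio first and feeds the measured values back into the filter is generally more accurate than this single pass.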
Unit tests cover every module. Automated CI runs on every push. Zero suppressed warnings.
Built on Proven Tools.
V2 Complete.
All V2 modules are shipped. V3 (smart tracking, AI-guided cuts) is on the roadmap.
Contact
Questions, feedback, or integration requests? Reach out directly.
contact@lpagesapplabs.com