add poem-storyboard
parent acb9330354
commit e3c6408802
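@@ -1,6 +1,6 @@
---
name: poem-storyboard
description: AI video storyboard script generator for classical Chinese poetry. Given a classical poem (shi or ci), it outputs a second-by-second storyboard table in which each line of the poem gets start and end seconds, a detailed scene description in Chinese, a camera movement, an English keyframe image prompt, and an English video motion prompt. The output can be handed straight to agnes-image to generate keyframe images and shot videos. Applies to classical-poem animation, poetry MVs, Chinese-style (guofeng) short videos, AI video storyboards, storyboard, keyframe scripts, shot list, poetry visualization, and pairing classical poems with images and video.
description: AI video storyboard script generator for classical Chinese poetry. Given a classical poem (shi or ci), it outputs a second-by-second storyboard table in which each line of the poem gets start and end seconds, a detailed scene description in Chinese, a camera movement, an English keyframe image prompt, and an English video motion prompt. It uses "chained shared keyframes" so that adjacent shots reuse one image as the last frame of one shot and the first frame of the next, which makes the shots join seamlessly. The output can be handed straight to agnes-image to generate keyframe images and shot videos. Applies to classical-poem animation, poetry MVs, Chinese-style (guofeng) short videos, AI video storyboards, storyboard, keyframe scripts, shot list, poetry visualization, and pairing classical poems with images and video.
category: Creative Generation
---
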
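@@ -8,13 +8,35 @@ category: Creative Generation

Breaks a **classical Chinese poem** into a **second-by-second storyboard script** that is ready for AI video production: each line of the poem maps to one shot, with start and end seconds, a detailed scene description in Chinese, a camera movement, an English **keyframe image prompt**, and an English **video motion prompt**.

This Skill **only produces the storyboard script**. Each shot's image and video prompts are written in directly executable form; an agent then calls [agnes-image](../../linggan/agnes-image/SKILL.md) to generate the images and videos.
The core of this Skill is **"chained shared keyframes"**: treat the whole film as one continuous scroll of imagery. N lines of poetry map to **N+1 keyframe nodes**, and node i is **both** the last frame of shot i and the first frame of shot i+1: **one image and one URL, reused by two shots**. The last frame of one shot's video therefore equals the first frame of the next, so the clips join with no visible jump.

This Skill **only produces the storyboard script**. Each keyframe and video prompt is written in directly executable form; an agent then calls [agnes-image](../../linggan/agnes-image/SKILL.md) to generate the images and videos.

## Why use this Skill?

- **Second-by-second timeline**: not a vague "add a picture", but an explicit "what is shown during seconds 0–5 and how the camera moves", which lines up directly with an editing timeline.
- **One line, one shot**: the line of verse is the smallest storyboard unit; each line gets its own scene, camera movement, keyframes, and transition.
- **Ready for image and video generation**: every shot comes with an English image prompt and video motion prompt that an agent can feed straight to agnes-image.
- **Seamless joins between shots**: chained shared keyframes plus chained image-to-image generation keep adjacent shots visually continuous and keep style and characters consistent, so shots are no longer drawn independently and then fail to line up when joined.
- **Ready for image and video generation**: every keyframe and every shot comes with English prompts that an agent can feed straight to agnes-image.

## Core concept: chained shared keyframes

```
Lines:           line 1        line 2        line 3        line 4
Shots:        [  shot1  ]   [  shot2  ]   [  shot3  ]   [  shot4  ]
Keyframes:   k0 ────────── k1 ────────── k2 ────────── k3 ────────── k4
             ↑ first       ↑ last/first  ↑ last/first  ↑ last/first  ↑ last

· N lines → N+1 keyframe nodes k0..kN
· shot i uses agnes "keyframe animation" to transition from k(i-1) to k(i)
· k(i) is both the last frame of shot i and the first frame of shot (i+1): the same image, reused
· each k(i) is generated image-to-image with k(i-1) as the reference (ref), which keeps scene, characters, and style consistent
```

Key points:

- **Every shot uses 2 frames** (first + last) and always goes through keyframe animation, so each shot has a clear "from where to where".
- **Adjacent shots share the frame between them**: do not draw two separate images for the same point in time.
- **Chained image-to-image**: k0 is laid down with text-to-image; k1..kN each take the previous frame's URL as the `--image` reference, so only lighting, pose, camera position, and season or time of day change gradually while the subject does not drift.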
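
The node and shot layout described above can be sketched in a few lines of Python. This is only an illustration, not part of the Skill: `build_chain` is a hypothetical helper name, and it fills in ids and references only, leaving the prompts to be written separately.

```python
from typing import Dict, List


def build_chain(lines: List[str]) -> Dict[str, list]:
    """Lay out N+1 keyframe nodes and N shots for N lines of a poem."""
    n = len(lines)
    # k0 is text-to-image (ref is None); every later node references its predecessor.
    frames = [
        {"id": f"k{i}", "ref": None if i == 0 else f"k{i - 1}"}
        for i in range(n + 1)
    ]
    # Shot i (1-based) animates from k(i-1) to k(i).
    shots = [
        {"id": i, "line": lines[i - 1], "from_frame": f"k{i - 1}", "to_frame": f"k{i}"}
        for i in range(1, n + 1)
    ]
    return {"frames": frames, "shots": shots}


chain = build_chain(["床前明月光", "疑是地上霜", "举头望明月", "低头思故乡"])
print(len(chain["frames"]), len(chain["shots"]))  # 5 4
```

Because each shot's `to_frame` is built from the same index as the next shot's `from_frame`, the sharing rule holds by construction rather than by later checking.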
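
## Workflow

@@ -22,10 +44,12 @@ category: Creative Generation
```
Original poem
  → ① Parse (line breaks / imagery / mood / season and time of day)
  → ② Second-by-second storyboard (each line = one shot; assign start/end seconds)
  → ③ Detailed description (scene in Chinese + camera movement + transition)
  → ④ Keyframe prompts (English, for agnes-image text-to-image)
  → ⑤ Video motion prompts (English, for agnes-image image-to-video)
  → ⑥ Output: readable storyboard table (Markdown) + storyboard.json
  → ③ Plan the keyframe chain (N lines → N+1 nodes k0..kN; decide what each node depicts)
  → ④ Detailed description (scene in Chinese + camera movement + transition)
  → ⑤ Keyframe prompts (English, for agnes-image chained text-to-image / image-to-image)
  → ⑥ Video motion prompts (English, for agnes-image keyframe animation)
  → ⑦ Output: readable storyboard table (Markdown) + storyboard.json
  → ⑧ Generate images and videos → merge_videos.py merges them into the finished film final.mp4
```

## Global parameters (confirm or default these first, then storyboard line by line)

@@ -43,19 +67,31 @@ category: Creative Generation
- Whole-film estimate: a five-character quatrain (4 lines) ≈ 16–20 s; a seven-character quatrain ≈ 20–24 s; regulated verse (8 lines) ≈ 35–45 s.
- Shots sit **end to end with no gaps** on the timeline; transitions (dissolve, push/pull) go in the `transition` field and are overlaid during editing.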
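
Keeping shots end to end is plain cumulative arithmetic. A minimal sketch, assuming the per-shot durations have already been chosen (the helper name `allocate_timeline` and the sample durations are illustrative, not taken from the Skill):

```python
def allocate_timeline(durations):
    """Turn per-shot durations (seconds) into contiguous (start, end) pairs."""
    spans = []
    t = 0
    for d in durations:
        spans.append((t, t + d))  # each shot starts where the previous one ended
        t += d
    return spans


# A four-line quatrain filling the upper end of the 16–20 s estimate.
print(allocate_timeline([5, 5, 4, 6]))  # [(0, 5), (5, 10), (10, 14), (14, 20)]
```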
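
## Plan the keyframe chain (key step)

Before writing the line-by-line descriptions, work out the film's **keyframe node sequence k0..kN**. It is the "skeleton" of the whole film's imagery:

1. **k0 (opening anchor)**: the first frame of the film. It sets the art style, the subject, and the season and time of day. Generated with text-to-image.
2. **k1..kN (transition nodes)**: each node is the shared image where one line of the poem ends and the next begins. When planning, ask how the mood and imagery of the previous line can **transition smoothly** into the next line; this frame should be the visual pivot between the two.
3. **Consistency constraint**: adjacent nodes must be **gradual variations of the same scene and characters**. Change only lighting, pose, camera position, and shot distance; do not swap characters or settings. When a major scene change is needed, put it on a shot boundary whose `transition` is `match cut` or `fade to black`, and spell out in that last-frame prompt the elements that carry over into the new scene.
4. **Record a `ref` for every node**: which frame serves as the image-to-image reference when generating it (usually the previous node). The `ref` of k0 is `null` (pure text-to-image).

## Every shot must specify

1. **line**: the original line of the poem.
2. **start / end**: start and end seconds on the film timeline (contiguous, no gaps).
3. **scene**: a detailed scene description in Chinese covering subject, environment, lighting, color palette, season and time of day, and mood, specific enough to paint from.
4. **camera**: camera movement (e.g. slow dolly in / pan left / crane up / tilt down / static).
5. **keyframes**: 1–2 keyframes:
   - **1 frame** → the shot uses agnes "image-to-video" (a single image set in motion).
   - **2 frames** (`start` + `end`) → agnes "keyframe animation"; the shot transitions from the first frame to the last.
   - Each frame has an English `image_prompt` that **describes only the still image, with no camera-movement terms** (camera movement is left to the video).
   - The two frames must keep the same scene and characters and vary only lighting, pose, and camera position, so that the transition looks natural.
6. **motion**: English video motion prompt describing how the image moves during these seconds (drifting mist, fluttering sleeves, rippling water, camera push-in), with emphasis on keeping the subject consistent.
7. **transition**: transition to the next shot (e.g. `dissolve`, `fade`, `fade to black`, `match cut`).
5. **from_frame / to_frame**: the ids of the two keyframe nodes this shot references (e.g. `k0` → `k1`). `to_frame` must equal the next shot's `from_frame` (shared).
6. **motion**: English video motion prompt describing how the image moves from the first frame to the last during these seconds (drifting mist, fluttering sleeves, rippling water, camera push-in), with emphasis on keeping the subject consistent.
7. **transition**: transition to the next shot (e.g. `dissolve`, `fade`, `fade to black`, `match cut`). Because first and last frames are already shared, `cut` or `dissolve` is enough for most joins to be continuous.

## Every keyframe node (frame) must specify

1. **id**: node number `k0`, `k1`, …
2. **image_prompt**: English; **describes only the still image, with no camera-movement terms** (camera movement is left to the video).
3. **ref**: id of the node used as the image-to-image reference when generating this frame (for consistency); `null` for `k0`.
4. The `image_prompt` values of adjacent nodes should be **gradual variations of the same image**, differing only in lighting, pose, or camera position, and should explicitly reuse the same wording for subject, clothing, and environment.

## Output format (provide both)
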
@@ -63,16 +99,38 @@ category: Creative Generation

| # | Time | Line | Scene description | Camera | Keyframes | Transition |
|---|------|------|-------------------|--------|-----------|------------|
| 1 | 0–5s | 床前明月光 | 夜深,简朴卧房,月光透过窗棂洒在地上如霜… | slow dolly in | 1 frame | dissolve |
| 1 | 0–5s | 床前明月光 | 夜深,简朴卧房,月光透过窗棂洒在地上如霜… | slow dolly in | k0 → k1 | dissolve |
| 2 | 5–10s | 疑是地上霜 | 同一卧房,视线移向地面霜白光斑… | tilt down | k1 → k2 | dissolve |

> Write the keyframe column as `k(i-1) → k(i)` so that the frame shared by adjacent shots is visible at a glance (the right-hand value of one row = the left-hand value of the next).
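
As an illustration of the table convention, the rows can be rendered straight from the `shots` entries. This is a sketch under the assumption that each shot carries the fields listed above; `storyboard_rows` is a hypothetical helper, not something the Skill ships.

```python
def storyboard_rows(shots):
    """Render one Markdown table row per shot, with keyframes as k(i-1) → k(i)."""
    rows = []
    for s in shots:
        rows.append(
            f"| {s['id']} | {s['start']}–{s['end']}s | {s['line']} | {s['scene']} "
            f"| {s['camera']} | {s['from_frame']} → {s['to_frame']} | {s['transition']} |"
        )
    return rows


shot = {
    "id": 2, "start": 5, "end": 10, "line": "疑是地上霜",
    "scene": "同一卧房,视线移向地面霜白光斑…", "camera": "tilt down",
    "from_frame": "k1", "to_frame": "k2", "transition": "dissolve",
}
print(storyboard_rows([shot])[0])
```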

### B. storyboard.json (for agnes-image): strict JSON

A top-level global `frames` array holds the chained keyframes, and `shots` only reference frame ids, so **the sharing relationships are visible at a glance**:

```json
{
  "title": "静夜思",
  "author": "李白",
  "style": "Chinese ink wash painting (shuimo), cinematic, soft mist, elegant, film grain",
  "aspect": "1152x768",
  "frames": [
    {
      "id": "k0",
      "ref": null,
      "image_prompt": "Quiet ancient Chinese bedroom at midnight, cold moonlight streaming through a latticed window onto the floor like frost, minimal wooden furniture, plain hanging drape, serene"
    },
    {
      "id": "k1",
      "ref": "k0",
      "image_prompt": "Same bedroom and same latticed window, moonlight pooled brighter on the floor, frost-like glow intensified on the ground, deep silent night, identical furniture and drape"
    },
    {
      "id": "k2",
      "ref": "k1",
      "image_prompt": "Same bedroom, camera lowered toward the bright frost-like moonlight patch on the floor, glowing pale silver, same window and furniture in soft focus background"
    }
  ],
  "shots": [
    {
      "id": 1,
@@ -81,32 +139,102 @@ category: Creative Generation
      "end": 5,
      "scene": "夜深人静,简朴卧房内,清冷月光透过窗棂洒在地面,泛着霜白色泽,墙上挂着素色帷幔。",
      "camera": "slow dolly in",
      "keyframes": [
        { "role": "start", "image_prompt": "Quiet ancient Chinese bedroom at midnight, cold moonlight streaming through a latticed window onto the floor like frost, minimal furniture, serene" },
        { "role": "end", "image_prompt": "Same bedroom, moonlight pooled brighter on the floor, frost-like glow intensified, deep silent night" }
      ],
      "motion": "Slow gentle camera push-in, moonlight subtly shimmering on the floor, faint dust motes drifting, calm and still",
      "from_frame": "k0",
      "to_frame": "k1",
      "motion": "Slow gentle camera push-in from k0 to k1, moonlight subtly shimmering on the floor, faint dust motes drifting, calm and still, keep room and window identical",
      "transition": "dissolve"
    },
    {
      "id": 2,
      "line": "疑是地上霜",
      "start": 5,
      "end": 10,
      "scene": "同一卧房,视线下移,地面霜白光斑愈发明亮,宛如薄霜铺地。",
      "camera": "tilt down",
      "from_frame": "k1",
      "to_frame": "k2",
      "motion": "Camera slowly tilts down from k1 to k2 toward the frost-like moonlight on the floor, glow gently intensifying, same room, smooth and quiet",
      "transition": "dissolve"
    }
  ]
}
```

> Prompt conventions: `image_prompt` is in **English** and describes only the still image; `motion` is in **English** and describes only how things move during these seconds. When generating images, append the global `style` to every `image_prompt`.
> Conventions:
> - `frames` are listed in `k0..kN` order and are **chain-dependent** (`ref` points to the previous frame); their count = number of shots + 1.
> - `shots[i].to_frame == shots[i+1].from_frame` (shared first and last frames, **strictly equal**).
> - `image_prompt` / `motion` are in **English**; `image_prompt` describes only the still image, `motion` describes only the movement.
> - When generating images, append the global `style` to every `image_prompt`.
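
The structural conventions above are mechanical enough to check in code before anything is handed to agnes-image. The following is a minimal sketch, not the Skill's own validator: the function name `validate_storyboard` and the wording of the messages are assumptions, and the timeline check simply restates the "contiguous, no gaps" rule for `start` / `end`.

```python
def validate_storyboard(sb):
    """Return a list of problems; an empty list means the chain is consistent."""
    problems = []
    frames, shots = sb["frames"], sb["shots"]

    if len(frames) != len(shots) + 1:
        problems.append(f"expected {len(shots) + 1} frames, found {len(frames)}")

    # Each ref must point at a frame that appears earlier in the list.
    seen = set()
    for frame in frames:
        ref = frame.get("ref")
        if ref is not None and ref not in seen:
            problems.append(f"{frame['id']}: ref {ref} is not generated before it")
        seen.add(frame["id"])

    # Every shot must reference known frames.
    for shot in shots:
        for key in ("from_frame", "to_frame"):
            if shot[key] not in seen:
                problems.append(f"shot {shot['id']}: unknown {key} {shot[key]}")

    # Adjacent shots must share a frame and meet on the timeline.
    for cur, nxt in zip(shots, shots[1:]):
        if cur["to_frame"] != nxt["from_frame"]:
            problems.append(f"shot {cur['id']} -> {nxt['id']}: frames not shared")
        if cur["end"] != nxt["start"]:
            problems.append(f"shot {cur['id']} -> {nxt['id']}: timeline gap or overlap")
    return problems
```

Returning a list instead of raising on the first failure lets an agent report every inconsistency in one pass and then regenerate the storyboard.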
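
## Handoff to agnes-image (run automatically by the agent)

Once storyboard.json is produced, process each shot in order (see the agnes-image Skill for details):
Once storyboard.json is produced, **first generate images for the entire keyframe chain (obtaining a public URL for every frame), then generate the video shot by shot**:

1. **Generate keyframe images**: for each `keyframes[*].image_prompt` (with the global `style` appended), call agnes-image text-to-image with `--size` set to `aspect`, and `--save` the result locally. The `MEDIA_URL` returned by text-to-image is already a **public URL**.
2. **Generate shot videos**: pass the keyframe `MEDIA_URL` from the previous step as `--image`, together with the `motion` prompt, to agnes-image image-to-video; `--duration` is the shot's `end - start`; when there are 2 keyframes, add `--keyframes` to use keyframe animation.
3. **Assemble**: join the shot videos in order in editing software, following the storyboard's `start/end` timeline; add BGM and subtitles separately.
### Step 1: generate all keyframes as a chain (in frames order, never out of order)

> Important: the reference images for agnes-image video **must be public URLs** (local files and Base64 are not supported). Use the `MEDIA_URL` returned by text-to-image directly; there is no need to upload to an image host by hand.
Maintain a `frame_url[id]` map and process `frames` in order:

1. **k0 (ref is null)**: text-to-image.
   ```bash
   python {agnes}/scripts/generate_image.py \
     --prompt "<k0.image_prompt>, <style>" \
     --size "<aspect>" --save ./outputs/k0.png
   ```
   Record the `MEDIA_URL` it outputs in `frame_url["k0"]`.
2. **k1..kN (with ref)**: **image-to-image**, using the public URL of the `ref` frame as the reference so that the frame stays continuous and consistent with the previous one.
   ```bash
   python {agnes}/scripts/generate_image.py \
     --prompt "<ki.image_prompt>, <style>" \
     --image "<frame_url[ki.ref]>" \
     --size "<aspect>" --save ./outputs/ki.png
   ```
   Record the `MEDIA_URL` in `frame_url["ki"]`.
3. Once every frame is generated, `frame_url` holds N+1 public URLs.

> Chained image-to-image is what lets the shots join: every frame "grows out of" the previous one, so style, characters, and composition stay consistent along the whole chain.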
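
The ordering requirement in Step 1 can be made concrete with a small driver loop. This is a sketch under stated assumptions: it uses only the flags shown in the commands above (`--prompt`, `--image`, `--size`, `--save`), and `run` is a hypothetical caller-supplied function that executes the command and returns the `MEDIA_URL`, since how the script reports that URL is not specified here.

```python
def image_command(script, frame, style, aspect, frame_url):
    """Build the argv for one keyframe; k0 has no --image reference."""
    cmd = ["python", script, "--prompt", f"{frame['image_prompt']}, {style}"]
    if frame["ref"] is not None:
        # A KeyError here means the chain is being processed out of order.
        cmd += ["--image", frame_url[frame["ref"]]]
    cmd += ["--size", aspect, "--save", f"./outputs/{frame['id']}.png"]
    return cmd


def generate_chain(storyboard, script, run):
    """Walk frames in order, filling frame_url as each image is produced.

    `run` is an assumption of this sketch: it takes the argv list, runs it,
    and returns the public MEDIA_URL of the generated image.
    """
    frame_url = {}
    for frame in storyboard["frames"]:
        cmd = image_command(
            script, frame, storyboard["style"], storyboard["aspect"], frame_url
        )
        frame_url[frame["id"]] = run(cmd)
    return frame_url
```

Looking the reference up in `frame_url` at command-build time is what enforces the chain: a frame cannot be requested before its `ref` has a URL.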

### Step 2: generate video shot by shot (keyframe animation, reusing the shared URLs)

For each shot, take the public URLs of its `from_frame` and `to_frame` and run keyframe animation:

```bash
python {agnes}/scripts/generate_video.py \
  --prompt "<shot.motion>" \
  --image "<frame_url[shot.from_frame]>" \
  --image "<frame_url[shot.to_frame]>" \
  --keyframes \
  --duration <shot.end - shot.start> \
  --width <aspect width> --height <aspect height> \
  --save ./outputs/shot_<id>.mp4
```

Because `to_frame` and the next shot's `from_frame` are **the same URL**, the last frame of this shot's video = the first frame of the next shot's video, so the join is continuous.
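
The placeholders in the Step 2 command map directly onto storyboard fields. A minimal sketch of that mapping, using only the flags shown above; `parse_aspect` and `video_command` are hypothetical helper names, and `frame_url` is the map built in Step 1:

```python
def parse_aspect(aspect):
    """Split an aspect string such as "1152x768" into (width, height)."""
    width, height = aspect.lower().split("x")
    return int(width), int(height)


def video_command(script, shot, aspect, frame_url):
    """Build the keyframe-animation argv for one shot."""
    width, height = parse_aspect(aspect)
    return [
        "python", script,
        "--prompt", shot["motion"],
        "--image", frame_url[shot["from_frame"]],  # first frame
        "--image", frame_url[shot["to_frame"]],    # last frame, shared with the next shot
        "--keyframes",
        "--duration", str(shot["end"] - shot["start"]),
        "--width", str(width), "--height", str(height),
        "--save", f"./outputs/shot_{shot['id']}.mp4",
    ]
```

Deriving `--width` and `--height` from the same `aspect` string used for the images keeps portrait storyboards (`768x1152`) in sync without a second setting.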
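
### Step 3: merge into the finished film (automated)

After all shot videos are saved as `./outputs/shot_<id>.mp4`, use the `merge_videos.py` script bundled with this Skill to merge them in storyboard order in one step (requires `ffmpeg` on the machine; on macOS install it with `brew install ffmpeg`):

```bash
python {baseDir}/scripts/merge_videos.py \
  --storyboard ./outputs/storyboard.json \
  --dir ./outputs \
  --out ./outputs/final.mp4
```

- The script finds each `shot_<id>.mp4` in the order of `shots` in `storyboard.json`, normalizes resolution and frame rate, joins the clips seamlessly, and prints `SAVED: ./outputs/final.mp4`.
- **The default hard concat (cut) is already seamless**: adjacent shots share first and last frames, so the last frame of one shot = the first frame of the next, and a hard concat shows no seam.
- For soft dissolves add `--crossfade 0.5` (a 0.5-second dissolve between shots). Note that a crossfade overlaps away the shared frame, so it is usually **unnecessary**; keep the default hard concat.
- By default the resolution is derived from `aspect` in `storyboard.json` (e.g. `1152x768`); it can also be overridden with `--width/--height/--fps`.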
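
One practical consequence of `--crossfade` is that each dissolve overlaps two clips for `fade` seconds, so the merged film is shorter than the sum of its shots and the storyboard's `start` / `end` values no longer line up exactly. The sketch below mirrors the offset arithmetic that `crossfade_concat` in `merge_videos.py` passes to ffmpeg's `xfade` filter; the helper names are illustrative.

```python
def xfade_offsets(durations, fade):
    """Offsets (seconds) at which each successive dissolve starts."""
    offsets = []
    offset = durations[0] - fade
    for d in durations[1:]:
        offsets.append(round(offset, 3))
        offset += d - fade  # each later clip adds its length minus the overlap
    return offsets


def merged_length(durations, fade):
    """Total length after N-1 dissolves of `fade` seconds each."""
    return sum(durations) - fade * (len(durations) - 1)


print(xfade_offsets([5, 5, 5, 5], 0.5))  # [4.5, 9.0, 13.5]
print(merged_length([5, 5, 5, 5], 0.5))  # 18.5
```

With a hard concat the same four shots run the full 20 seconds, which is one more reason to keep the default when the timeline has to match the storyboard.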
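
> BGM and subtitles are added separately after merging (audio tracks and subtitles can be added with ffmpeg or editing software).

> Important: the reference images for agnes-image video **must be public URLs** (local files and Base64 are not supported). Use the `MEDIA_URL` returned by text-to-image or image-to-image in Step 1 directly; there is no need to upload to an image host by hand.

## Notes

- **Images first, then video**: video depends on the keyframes' public URLs, so the order cannot be reversed.
- **Consistency**: fix a unified art-style description in `style`; when characters or scenes must be kept strictly, use the previous shot's last-frame URL as the image-to-image reference for the next shot.
- **Aspect alignment**: matching `aspect` to the video resolution reduces cropping; for portrait use `768x1152`.
- **Strict ordering**: generate images as a chain in `frames` order first, then generate video. `ki` depends on the URL of `ki.ref`; out of order, the reference image is not available yet.
- **Generate each shared frame once**: `k(i)` is generated once and shared by shot i (as its last frame) and shot (i+1) (as its first frame). **Do not generate it twice**: this sharing is what makes the shots join.
- **Consistency**: the global `style` fixes a unified art style; chained `ref` image-to-image keeps characters and scenes; put major scene changes on shot boundaries and use `match cut` or `fade` for the transition.
- **Validation**: before output, self-check that `len(frames) == len(shots) + 1` and that every `shots[i].to_frame == shots[i+1].from_frame`.
- **Aspect alignment**: matching `aspect` to the video resolution reduces cropping; for portrait use `768x1152` and set `--width/--height` to `768 1152` to match.
- For classical-poem imagery, generic Chinese-style descriptions are enough; avoid real people's names and specific copyrighted likenesses.
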
173
skills/developing/poem-storyboard/scripts/merge_videos.py
Normal file
@@ -0,0 +1,173 @@
#!/usr/bin/env python3
"""Merge per-shot videos into a single film, ordered by storyboard.json.

Because adjacent shots share keyframes (shot i's last frame == shot i+1's first
frame), a plain hard concat is already seamless. An optional crossfade mode is
provided for cases where soft dissolves are explicitly wanted.

Requires ffmpeg on PATH (crossfade mode also calls ffprobe).

Examples:
  # Hard concat in storyboard order (default, seamless via shared frames)
  python merge_videos.py --storyboard storyboard.json --dir ./outputs --out final.mp4

  # Crossfade between every shot (0.5s dissolve)
  python merge_videos.py --storyboard storyboard.json --dir ./outputs \
      --out final.mp4 --crossfade 0.5
"""

import argparse
import json
import os
import shutil
import subprocess
import sys
import tempfile


def log(msg):
    print(msg, file=sys.stderr)


def ensure_ffmpeg():
    if shutil.which("ffmpeg") is None:
        log("ERROR: ffmpeg not found on PATH. Install it first "
            "(macOS: brew install ffmpeg).")
        sys.exit(1)


def probe_duration(path):
    """Return clip duration in seconds via ffprobe."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    )
    return float(out.stdout.strip())


def resolve_shot_files(storyboard, video_dir):
    """Map each shot to its video file in storyboard order."""
    files = []
    for shot in storyboard["shots"]:
        sid = shot["id"]
        # Accept common naming patterns produced by the agnes step.
        candidates = [f"shot_{sid}.mp4", f"shot{sid}.mp4", f"{sid}.mp4"]
        # The zero-padded form only makes sense for integer ids; formatting a
        # string id with :02d would raise ValueError.
        if isinstance(sid, int):
            candidates.append(f"shot_{sid:02d}.mp4")
        found = None
        for name in candidates:
            p = os.path.join(video_dir, name)
            if os.path.exists(p):
                found = p
                break
        if not found:
            log(f"ERROR: no video file for shot id={sid} in {video_dir}. "
                f"Tried: {candidates}")
            sys.exit(1)
        files.append(found)
    return files


def hard_concat(files, out, width, height, fps):
    """Re-encode every clip to a common format, then join them.

    The join uses ffmpeg's concat demuxer and encodes the result once more.
    """
    tmpdir = tempfile.mkdtemp(prefix="poem_merge_")
    normalized = []
    try:
        for i, f in enumerate(files):
            # Scale to fit, pad to the exact target size, and force a common fps.
            norm_path = os.path.join(tmpdir, f"n_{i:03d}.mp4")
            subprocess.run(
                ["ffmpeg", "-y", "-i", f,
                 "-vf", f"scale={width}:{height}:force_original_aspect_ratio=decrease,"
                        f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2,fps={fps}",
                 "-c:v", "libx264", "-pix_fmt", "yuv420p", "-an", norm_path],
                check=True,
            )
            normalized.append(norm_path)
        listfile = os.path.join(tmpdir, "list.txt")
        with open(listfile, "w") as fh:
            for norm_path in normalized:
                fh.write(f"file '{norm_path}'\n")
        subprocess.run(
            ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", listfile,
             "-c:v", "libx264", "-pix_fmt", "yuv420p", out],
            check=True,
        )
    finally:
        shutil.rmtree(tmpdir, ignore_errors=True)


def crossfade_concat(files, out, width, height, fps, fade):
    """Chain clips with an xfade dissolve of `fade` seconds between each."""
    if len(files) < 2:
        # xfade needs two inputs; with a single clip no [vout] label would be
        # produced, so just normalize it through the hard-concat path.
        hard_concat(files, out, width, height, fps)
        return
    norm_filters = []
    inputs = []
    for f in files:
        inputs += ["-i", f]
    # Normalize each input stream.
    for i in range(len(files)):
        norm_filters.append(
            f"[{i}:v]scale={width}:{height}:force_original_aspect_ratio=decrease,"
            f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2,fps={fps},"
            f"settb=AVTB[v{i}]"
        )
    durations = [probe_duration(f) for f in files]
    chain = []
    prev = "[v0]"
    # Each dissolve starts `fade` seconds before the running output ends.
    offset = durations[0] - fade
    for i in range(1, len(files)):
        out_label = f"[x{i}]" if i < len(files) - 1 else "[vout]"
        chain.append(
            f"{prev}[v{i}]xfade=transition=fade:duration={fade}:"
            f"offset={offset:.3f}{out_label}"
        )
        prev = out_label
        offset += durations[i] - fade
    filter_complex = ";".join(norm_filters + chain)
    subprocess.run(
        ["ffmpeg", "-y", *inputs, "-filter_complex", filter_complex,
         "-map", "[vout]", "-c:v", "libx264", "-pix_fmt", "yuv420p", out],
        check=True,
    )


def main():
    ap = argparse.ArgumentParser(description="Merge per-shot videos by storyboard order.")
    ap.add_argument("--storyboard", required=True, help="Path to storyboard.json")
    ap.add_argument("--dir", required=True, help="Directory holding shot_<id>.mp4 files")
    ap.add_argument("--out", required=True, help="Output merged video path")
    ap.add_argument("--crossfade", type=float, default=0.0,
                    help="Crossfade seconds between shots (0 = hard concat, default)")
    # None means "not given": fall back to the storyboard aspect, then 1152x768.
    ap.add_argument("--width", type=int, default=None)
    ap.add_argument("--height", type=int, default=None)
    ap.add_argument("--fps", type=int, default=24)
    args = ap.parse_args()

    ensure_ffmpeg()
    # storyboard.json contains Chinese text, so read it as UTF-8 explicitly.
    with open(args.storyboard, encoding="utf-8") as fh:
        storyboard = json.load(fh)

    # Derive resolution from aspect if present (e.g. "1152x768"). Explicit
    # --width/--height take precedence, so the flags can override it.
    width, height = 1152, 768
    aspect = storyboard.get("aspect")
    if aspect and "x" in aspect.lower():
        try:
            w, h = aspect.lower().split("x")
            width, height = int(w), int(h)
        except ValueError:
            pass
    if args.width is None:
        args.width = width
    if args.height is None:
        args.height = height

    files = resolve_shot_files(storyboard, args.dir)
    log(f"Merging {len(files)} shots -> {args.out} "
        f"({args.width}x{args.height}@{args.fps}fps, "
        f"{'crossfade ' + str(args.crossfade) + 's' if args.crossfade > 0 else 'hard concat'})")

    if args.crossfade > 0:
        crossfade_concat(files, args.out, args.width, args.height, args.fps, args.crossfade)
    else:
        hard_concat(files, args.out, args.width, args.height, args.fps)

    print(f"SAVED: {args.out}")


if __name__ == "__main__":
    main()