Add two-video drone 3DGS pipeline with Apple Silicon fixes

- main.py: extract frames from two videos, run COLMAP feature extraction
- match_features.py: Python-based within-video SIFT matching via OpenCV
  (replaces colmap exhaustive_matcher which segfaults on ARM64 in COLMAP 4.x)
- match_crossvideo.py: exhaustive cross-video matching (v1×v2) to stitch
  two flights into a single COLMAP model
- run.sh: entry point for frame extraction + feature extraction
- train_splat.sh: ns-process-data → splatfacto → .ply export, with
  correct PATH for Homebrew ffmpeg and MPS device flags for Apple Silicon
- .gitignore: exclude videos, generated scene data, venv, logs
- README.md: full pipeline walkthrough, all known issues and fixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Jon
2026-05-26 15:09:30 +01:00
parent e0db1edbc6
commit 7f4cdd9459
7 changed files with 782 additions and 0 deletions

24
.gitignore vendored Normal file

@@ -0,0 +1,24 @@
# Videos (too large for git)
*.mp4
*.mov
*.avi
# Generated scene data
my_scene/
outputs/
# Python environment
venv/
__pycache__/
*.pyc
*.pyo
# Logs
*.log
my_scene_build.log
# macOS
.DS_Store
# Editor
.claude/

175
README.md

@@ -0,0 +1,175 @@
# Drone-3DGS
Two-flight DJI drone footage → 3D Gaussian Splatting pipeline for **Apple Silicon Macs** (M1/M2/M3).
Takes two `.mp4` videos of the same scene from different angles, runs Structure-from-Motion via COLMAP, and produces a `.ply` Gaussian splat you can view in a browser.

---
## Requirements
```bash
brew install colmap # COLMAP 4.x (SfM)
brew install ffmpeg # full version with all filters
python3 -m venv venv
source venv/bin/activate
pip install torch torchvision
pip install nerfstudio
```
> **Python version**: 3.10 recommended (tested with 3.10.18 via pyenv).
---
## Project structure
```
.
├── 1.mp4                   # first drone flight
├── 2.mp4                   # second drone flight
├── main.py                 # Step 1: extract frames + COLMAP feature extraction
├── match_features.py       # Step 2: within-video SIFT matching (Python, bypasses COLMAP crash)
├── match_crossvideo.py     # Step 3: cross-video exhaustive matching (v1×v2)
├── run.sh                  # Runs main.py (frame extraction + feature extraction)
└── train_splat.sh          # Steps 4–6: ns-process-data → splatfacto → export .ply
```
---
## How to run
### Step 1: Extract frames and COLMAP features
```bash
source venv/bin/activate
bash run.sh
```
This calls `main.py`, which:
1. Extracts frames from `1.mp4` and `2.mp4` at 2 fps into `my_scene/images/` (named `v1_*.jpg` / `v2_*.jpg`)
2. Runs `colmap feature_extractor`, which writes SIFT features to `my_scene/database.db`
3. Runs `match_features.py` for sequential within-video matching (overlap=50) via OpenCV BFMatcher
4. Runs `colmap mapper` once on those matches (Step 3 below re-runs it after cross-video matching)

**Why not `colmap exhaustive_matcher`?**
COLMAP 4.x has a threading bug on Apple Silicon ARM64 that causes a SIGSEGV in all matcher variants. `match_features.py` replaces the matcher entirely: it reads SIFT descriptors from the SQLite database, matches them with OpenCV BFMatcher + Lowe ratio test + RANSAC, and writes `matches` and `two_view_geometries` back to the DB. The mapper only needs the `two_view_geometries` table.
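
As a rough illustration of the Lowe ratio test mentioned above, here is a dependency-free sketch. It is not the code in `match_features.py`, which relies on OpenCV's `BFMatcher.knnMatch`; the function name `ratio_test_matches` and the toy 2-D descriptors are invented for this example (real SIFT descriptors have 128 dimensions).

```python
import math


def ratio_test_matches(desc1, desc2, ratio=0.75):
    """Return (query_idx, train_idx) pairs that pass Lowe's ratio test."""
    matches = []
    for qi, q in enumerate(desc1):
        # Distance from this query descriptor to every candidate, nearest first.
        dists = sorted((math.dist(q, t), ti) for ti, t in enumerate(desc2))
        if len(dists) < 2:
            continue  # two neighbours are needed to form a ratio
        (d_best, ti_best), (d_second, _) = dists[0], dists[1]
        # Keep the match only if the best neighbour is clearly better than the runner-up.
        if d_best < ratio * d_second:
            matches.append((qi, ti_best))
    return matches


# Hypothetical toy descriptors, for illustration only.
toy1 = [(0.0, 0.0), (10.0, 10.0)]
toy2 = [(0.1, 0.0), (5.0, 5.0), (9.0, 9.0)]
print(ratio_test_matches(toy1, toy2))
```

Ambiguous matches, where the two nearest neighbours are about equally close, are dropped before RANSAC ever sees them.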
### Step 2: Cross-video matching
```bash
python3 match_crossvideo.py
```
Matches every `v1_*` frame against every `v2_*` frame (14,900 pairs) so the two flights stitch into a single model. Takes ~70 min on M1 Pro CPU (~0.28 s/pair with OpenCV BFMatcher).
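
The runtime figure above follows from simple arithmetic, sketched below. The per-video frame counts (100 and 149) are hypothetical, chosen only because their product is the 14,900 pairs quoted above; the helper name `estimate_cross_matching_minutes` is likewise made up.

```python
def estimate_cross_matching_minutes(n_v1, n_v2, seconds_per_pair=0.28):
    """Return (pair_count, estimated_minutes) for exhaustive v1 x v2 matching."""
    pairs = n_v1 * n_v2  # every v1 frame is matched against every v2 frame
    return pairs, pairs * seconds_per_pair / 60.0


# Hypothetical frame counts whose product is 14,900.
pairs, minutes = estimate_cross_matching_minutes(100, 149)
print(f"{pairs} pairs, about {minutes:.0f} min")
```

Because the pair count grows with the product of the two frame counts, halving `--fps` for both videos cuts the matching time to roughly a quarter.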
### Step 3: COLMAP mapper
```bash
colmap mapper \
--database_path my_scene/database.db \
--image_path my_scene/images \
--output_path my_scene/sparse
```
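Produces sparse models in `my_scene/sparse/`. The largest one (the model with the most registered images) is the one to use. With overlapping flights you should get ~90–95% of frames in a single model. Because `main.py` already ran the mapper once in Step 1, delete `my_scene/sparse/*` before re-running it here, as `match_crossvideo.py` reminds you when it finishes.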
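
To check by hand which model is largest, the sketch below mirrors what `train_splat.sh` does: it reads the first 8 bytes of each `images.bin` as a little-endian unsigned 64-bit count of registered images. The function names are illustrative and not part of the repository.

```python
import struct
from pathlib import Path


def registered_image_count(model_dir):
    """Read the image count stored in the first 8 bytes of images.bin."""
    images_bin = Path(model_dir) / "images.bin"
    if not images_bin.is_file():
        return 0
    with open(images_bin, "rb") as fh:
        header = fh.read(8)
    if len(header) < 8:
        return 0  # truncated file: treat as empty
    return struct.unpack("<Q", header)[0]


def largest_model(sparse_dir):
    """Return the sub-model directory with the most registered images, or None."""
    root = Path(sparse_dir)
    if not root.is_dir():
        return None
    candidates = [d for d in sorted(root.iterdir()) if d.is_dir()]
    best = max(candidates, key=registered_image_count, default=None)
    if best is None or registered_image_count(best) == 0:
        return None
    return best


print(largest_model("my_scene/sparse"))
```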
### Steps 4–6: Convert, train, export
```bash
bash train_splat.sh
```
This script automatically:
1. Finds the largest COLMAP model in `my_scene/sparse/`
2. Converts it to Nerfstudio format with `ns-process-data`
3. Trains `splatfacto` (**live viewer at http://localhost:7007 during training**)
4. Exports the Gaussian splat to `my_scene/exports/splat.ply`
---
## Known issues and fixes applied
### COLMAP 4.x matcher segfault (Apple Silicon)
All COLMAP matcher variants (`exhaustive_matcher`, `sequential_matcher`, `vocab_tree_matcher`) crash with SIGSEGV on ARM64 due to a bug in the SIFT worker thread initialization. **Fix:** `match_features.py` and `match_crossvideo.py` replace the COLMAP matcher entirely using OpenCV.
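
Both replacement scripts key their database rows by COLMAP's `pair_id`, computed as `KMAX * min(id1, id2) + max(id1, id2)`. The sketch below restates that formula and adds an inverse for debugging; the inverse helper `pair_id_to_image_ids` is an addition for illustration and does not appear in the scripts.

```python
MAX_IMAGE_ID = 2**31 - 1  # 2_147_483_647, the same value as KMAX in the scripts


def image_ids_to_pair_id(id1, id2):
    """Combine two image ids into one order-independent pair id."""
    lo, hi = (id1, id2) if id1 < id2 else (id2, id1)
    return MAX_IMAGE_ID * lo + hi


def pair_id_to_image_ids(pair_id):
    """Recover (lo, hi) from a pair id (assumes both ids are below MAX_IMAGE_ID)."""
    hi = pair_id % MAX_IMAGE_ID
    lo = (pair_id - hi) // MAX_IMAGE_ID
    return lo, hi


pid = image_ids_to_pair_id(7, 3)
print(pid, pair_id_to_image_ids(pid))
```

Sorting the ids first means `(7, 3)` and `(3, 7)` map to the same row, so `INSERT OR REPLACE` never stores a pair twice.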
### ffmpeg `fps` and `split` filters missing
The nerfstudio-bundled ffmpeg is compiled with a minimal filter set. **Fixes:**
- `main.py` uses `-r` output flag instead of `-vf fps=...`
- `train_splat.sh` prepends `/opt/homebrew/opt/ffmpeg/bin` to `PATH` so `ns-process-data` uses the full Homebrew ffmpeg
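
A minimal sketch of the first fix: building the frame-extraction command with the `-r` output option, so that no `-vf fps=...` filter is required. It follows the argument list used in `main.py`; the helper name `build_extract_cmd` is hypothetical.

```python
from pathlib import Path


def build_extract_cmd(video, out_dir, fps, prefix):
    """Build an ffmpeg argv that resamples output frames with -r (no filter graph)."""
    pattern = str(Path(out_dir) / f"{prefix}_%05d.jpg")
    return [
        "ffmpeg", "-i", str(video),
        "-r", str(fps),   # output frame rate, used instead of -vf fps=...
        "-q:v", "2",      # JPEG quality (2 = high)
        "-y",             # overwrite existing files
        pattern,
    ]


cmd = build_extract_cmd("1.mp4", "my_scene/images", 2, "v1")
print(" ".join(cmd))
```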
### nerfstudio splatfacto hardcoded `.cuda()` calls
Two lines in the installed `splatfacto.py` call `.cuda()` unconditionally. Patched in-place:
| Location | Original | Fix |
|----------|----------|-----|
| `populate_modules()` | `shs = torch.zeros(...).float().cuda()` | `shs = torch.zeros(...).float()` |
| `get_outputs_for_camera()` | `K = ....cuda()` | `K = ....to(self.device)` |

If you reinstall nerfstudio, re-apply with:
```bash
F=venv/lib/python3.10/site-packages/nerfstudio/models/splatfacto.py
sed -i '' 's/\.float()\.cuda()/\.float()/g' "$F"
sed -i '' 's/get_intrinsics_matrices()\.cuda()/get_intrinsics_matrices().to(self.device)/g' "$F"
```
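
For readers who prefer not to use `sed`, the same two substitutions can be sketched in Python. This is an illustrative equivalent operating on a string, not a tested patcher for any particular nerfstudio release; the sample lines are stand-ins rather than verbatim nerfstudio source, and `patch_splatfacto_source` is a made-up name.

```python
def patch_splatfacto_source(source):
    """Apply the two textual substitutions from the sed commands above."""
    patched = source.replace(".float().cuda()", ".float()")
    patched = patched.replace(
        "get_intrinsics_matrices().cuda()",
        "get_intrinsics_matrices().to(self.device)",
    )
    return patched


# Stand-in lines for illustration; the real file contents may differ.
sample = (
    "shs = torch.zeros((n, dim, 3)).float().cuda()\n"
    "K = camera.get_intrinsics_matrices().cuda()\n"
)
print(patch_splatfacto_source(sample))
```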
---
## 3DGS on Apple Silicon — current status
`splatfacto` uses **gsplat** as its rasterizer. gsplat 1.x requires CUDA — there is no MPS or CPU fallback. On Apple Silicon the CUDA extension is `None` at load time and crashes at first use.
**Two options for actual Gaussian Splatting:**
### Option A — Brush (recommended, uses Apple Metal natively)
```bash
# Install Rust (one-time)
brew install rustup && rustup-init -y && source ~/.cargo/env
# Build and run
cargo install --git https://github.com/ArthurBrussee/brush brush-cli
brush-cli --source my_scene/sparse/3
```
Outputs a `.ply` and has a built-in web viewer.
### Option B — Google Colab (free GPU)
The scene is already in Nerfstudio format at `my_scene/ns_data/`. Zip it, upload it to a Colab T4 instance, and run:
```python
!pip install nerfstudio
!ns-train splatfacto --data /content/ns_data --vis wandb
```
Download `outputs/*/splatfacto/*/splat.ply` when done.

---
## Viewing results
| What | How |
|------|-----|
| During splatfacto training | `http://localhost:7007` (Nerfstudio Viser viewer) |
| Sparse point cloud (ready now) | Drag `my_scene/exports/sparse_pointcloud.ply` into https://playcanvas.com/supersplat/editor |
| Final Gaussian splat | Drag `my_scene/exports/splat.ply` into https://playcanvas.com/supersplat/editor |

PlayCanvas SuperSplat runs entirely in the browser; the file never leaves your machine.

---
## Re-running from scratch
```bash
rm -rf my_scene outputs
bash run.sh # frames + features (~10 min)
python3 match_crossvideo.py # cross-video matching (~70 min)
colmap mapper \
--database_path my_scene/database.db \
--image_path my_scene/images \
--output_path my_scene/sparse # mapping (~10 min)
bash train_splat.sh # convert + train + export
```

155
main.py Normal file

@@ -0,0 +1,155 @@
#!/usr/bin/env python3
"""
Two-video drone footage -> 3DGS pipeline for Apple Silicon Macs.

Usage:
    python main.py \
        --video1 path/to/flight1.mp4 \
        --video2 path/to/flight2.mp4 \
        --output_dir ./my_scene \
        --fps 2

What it does:
    1. Extracts frames from both videos at the given fps (default 2 fps).
    2. Pools them into one folder with non-colliding names.
    3. Runs COLMAP feature extraction, Python-based matching
       (match_features.py), and COLMAP sparse reconstruction.
    4. Hands you a Nerfstudio-ready folder structure to train splatfacto.

Then run:
    ns-train splatfacto --data ./my_scene
"""
import argparse
import shutil
import subprocess
import sys
from pathlib import Path


def run(cmd, check=True):
    """Run a command (argument list, no shell), streaming its output."""
    print(f"\n>>> {' '.join(str(c) for c in cmd)}\n")
    result = subprocess.run(cmd, check=check)
    return result


def check_dependencies():
    """Verify ffmpeg and colmap are installed."""
    for tool in ("ffmpeg", "colmap"):
        if shutil.which(tool) is None:
            sys.exit(f"ERROR: {tool} not found. Install with: brew install {tool}")


def extract_frames(video_path: Path, out_dir: Path, fps: float, prefix: str):
    """Extract frames from a video using ffmpeg at the given fps."""
    out_dir.mkdir(parents=True, exist_ok=True)
    pattern = str(out_dir / f"{prefix}_%05d.jpg")
    run([
        "ffmpeg", "-i", str(video_path),
        "-r", str(fps),
        "-q:v", "2",  # JPEG quality (2 = high)
        "-y",         # overwrite
        pattern,
    ])
    count = len(list(out_dir.glob(f"{prefix}_*.jpg")))
    print(f"Extracted {count} frames from {video_path.name} -> {out_dir}")
    return count


def run_colmap(workspace: Path, images_dir: Path, use_gpu: bool = False):
    """Run the COLMAP SfM pipeline: feature extraction, matching, mapping."""
    sparse_dir = workspace / "sparse"
    sparse_dir.mkdir(parents=True, exist_ok=True)
    db_path = workspace / "database.db"

    # Step 1: feature extraction (COLMAP 4.x renamed SiftExtraction → FeatureExtraction)
    run([
        "colmap", "feature_extractor",
        "--database_path", str(db_path),
        "--image_path", str(images_dir),
        "--ImageReader.single_camera", "1",
        "--ImageReader.camera_model", "OPENCV",
        "--FeatureExtraction.use_gpu", "1" if use_gpu else "0",
    ])

    # Step 2: Python matcher (COLMAP 4.x exhaustive_matcher segfaults on Apple Silicon ARM64;
    # match_features.py reads SIFT descriptors from the DB and matches via OpenCV BFMatcher)
    script = Path(__file__).parent / "match_features.py"
    run([sys.executable, str(script), "--db", str(db_path)])

    # Step 3: sparse reconstruction (this is the slow one)
    run([
        "colmap", "mapper",
        "--database_path", str(db_path),
        "--image_path", str(images_dir),
        "--output_path", str(sparse_dir),
    ])

    # COLMAP writes to sparse/0/ by default
    print(f"\nCOLMAP done. Reconstruction in {sparse_dir}/0/")


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--video1", type=Path, required=True)
    parser.add_argument("--video2", type=Path, required=True)
    parser.add_argument("--output_dir", type=Path, required=True)
    parser.add_argument("--fps", type=float, default=2.0,
                        help="Frames per second to extract (default: 2). "
                             "Higher = more frames, slower training, better quality.")
    parser.add_argument("--use_gpu", action="store_true",
                        help="Try GPU SIFT in COLMAP (often unreliable on M1; default off).")
    args = parser.parse_args()

    check_dependencies()

    if not args.video1.exists() or not args.video2.exists():
        sys.exit("ERROR: one or both video files not found.")

    workspace = args.output_dir
    images_dir = workspace / "images"
    workspace.mkdir(parents=True, exist_ok=True)

    # Extract frames from both videos into the SAME images folder, with prefixes
    # so filenames don't collide. COLMAP treats them as one set automatically.
    print(f"\n=== Extracting frames from {args.video1.name} ===")
    n1 = extract_frames(args.video1, images_dir, args.fps, prefix="v1")
    print(f"\n=== Extracting frames from {args.video2.name} ===")
    n2 = extract_frames(args.video2, images_dir, args.fps, prefix="v2")

    total = n1 + n2
    print(f"\n=== Total frames: {total} ===")
    if total > 800:
        # The COLMAP matcher binaries are not used here (they crash on ARM64),
        # so the only practical knob for a large frame count is --fps.
        print("WARNING: lots of frames. Consider lowering --fps. "
              "Matching will be slow.")

    # Run COLMAP
    print("\n=== Running COLMAP (this is the slow part, get a coffee) ===")
    run_colmap(workspace, images_dir, use_gpu=args.use_gpu)

    # Print next steps
    print(f"""
========================================================================
DONE with SfM. Your scene is at: {workspace}

Next, train the splat. Two options on Mac:

OPTION 1: Nerfstudio (Python, scriptable)
    pip install nerfstudio
    ns-process-data images --data {images_dir} --output-dir {workspace}/ns_data \\
        --skip-colmap --colmap-model-path {workspace}/sparse/0
    ns-train splatfacto --data {workspace}/ns_data

OPTION 2: Brush (Rust binary, faster on Mac)
    Download from https://github.com/ArthurBrussee/brush
    brush --source {workspace}

View result:
    - Online: https://playcanvas.com/supersplat/editor (drag .ply in)
    - Local: install SuperSplat or use Nerfstudio's viewer
========================================================================
""")


if __name__ == "__main__":
    main()

131
match_crossvideo.py Normal file

@@ -0,0 +1,131 @@
#!/usr/bin/env python3
"""
Cross-video exhaustive matching for two-flight drone footage.

Matches every v1_* frame against every v2_* frame. The within-video
matches from match_features.py are already in the database and are not
touched. After this script, re-run colmap mapper to stitch the scene.
"""
import argparse
import sqlite3
import numpy as np
import cv2
from pathlib import Path

MIN_INLIERS = 15     # reject pairs with fewer verified matches
RATIO_TEST = 0.75    # Lowe's ratio threshold
RANSAC_ERROR = 4.0   # max reprojection error in pixels for RANSAC
KMAX = 2_147_483_647  # same constant as in match_features.py


def pair_id(id1: int, id2: int) -> int:
    lo, hi = (id1, id2) if id1 < id2 else (id2, id1)
    return KMAX * lo + hi


def load_desc_kpts(cur, image_id):
    """Load one image's descriptors (uint8) and keypoint x, y positions (float32)."""
    cur.execute("SELECT rows, cols, data FROM descriptors WHERE image_id=?", (image_id,))
    r = cur.fetchone()
    if r:
        desc = np.frombuffer(r[2], dtype=np.uint8).reshape(r[0], r[1])
    else:
        desc = np.zeros((0, 128), dtype=np.uint8)

    cur.execute("SELECT rows, cols, data FROM keypoints WHERE image_id=?", (image_id,))
    r = cur.fetchone()
    if r:
        kp = np.frombuffer(r[2], dtype=np.float32).reshape(r[0], r[1])
        kpts = kp[:, :2]  # x, y
    else:
        kpts = np.zeros((0, 2), dtype=np.float32)
    return desc, kpts


def match_pair(desc1, desc2, kp1, kp2):
    """Match two images; return (inlier index pairs, F) or (None, None)."""
    if len(desc1) < 8 or len(desc2) < 8:
        return None, None

    bf = cv2.BFMatcher(cv2.NORM_L2)
    raw = bf.knnMatch(desc1.astype(np.float32), desc2.astype(np.float32), k=2)

    # Lowe ratio test: keep a match only if it clearly beats the runner-up.
    good = []
    for m_pair in raw:
        if len(m_pair) == 2:
            m, n = m_pair
            if m.distance < RATIO_TEST * n.distance:
                good.append((m.queryIdx, m.trainIdx))
    if len(good) < MIN_INLIERS:
        return None, None

    arr = np.array(good, dtype=np.uint32)
    pts1 = kp1[arr[:, 0]]
    pts2 = kp2[arr[:, 1]]

    # Geometric verification with a RANSAC-estimated fundamental matrix.
    F, mask = cv2.findFundamentalMat(
        pts1, pts2, cv2.FM_RANSAC,
        ransacReprojThreshold=RANSAC_ERROR,
        confidence=0.9999, maxIters=2000,
    )
    if F is None or mask is None:
        return None, None

    inliers = arr[mask.ravel().astype(bool)]
    return (inliers, F) if len(inliers) >= MIN_INLIERS else (None, None)


def write_pair(cur, pid, inliers, F):
    """Write verified matches to the matches and two_view_geometries tables."""
    blob = inliers.astype(np.uint32).tobytes()
    z9 = np.zeros(9, dtype=np.float64).tobytes()
    z4 = np.zeros(4, dtype=np.float64).tobytes()
    z3 = np.zeros(3, dtype=np.float64).tobytes()
    F_blob = F.flatten().astype(np.float64).tobytes()

    cur.execute(
        "INSERT OR REPLACE INTO matches (pair_id, rows, cols, data) VALUES (?,?,?,?)",
        (pid, len(inliers), 2, blob),
    )
    cur.execute(
        "INSERT OR REPLACE INTO two_view_geometries "
        "(pair_id, rows, cols, data, config, F, E, H, qvec, tvec) VALUES (?,?,?,?,?,?,?,?,?,?)",
        (pid, len(inliers), 2, blob, 3, F_blob, z9, z9, z4, z3),  # config 3 = UNCALIBRATED
    )


def main():
    p = argparse.ArgumentParser()
    p.add_argument("--db", default="my_scene/database.db")
    args = p.parse_args()

    db = sqlite3.connect(args.db)
    db.execute("PRAGMA journal_mode=WAL")
    cur = db.cursor()

    cur.execute("SELECT image_id, name FROM images ORDER BY name")
    rows = cur.fetchall()
    v1 = [(id, name) for id, name in rows if name.startswith("v1_")]
    v2 = [(id, name) for id, name in rows if name.startswith("v2_")]
    total_pairs = len(v1) * len(v2)
    print(f"v1={len(v1)} frames v2={len(v2)} frames cross-pairs={total_pairs}")

    # Preload all v1 and v2 descriptors into RAM
    print("Loading v1 descriptors…")
    v1_data = {id: load_desc_kpts(cur, id) for id, _ in v1}
    print("Loading v2 descriptors…")
    v2_data = {id: load_desc_kpts(cur, id) for id, _ in v2}

    matched = skipped = i = 0
    for id1, _ in v1:
        desc1, kp1 = v1_data[id1]
        for id2, _ in v2:
            desc2, kp2 = v2_data[id2]
            inliers, F = match_pair(desc1, desc2, kp1, kp2)
            if inliers is not None:
                write_pair(cur, pair_id(id1, id2), inliers, F)
                matched += 1
            else:
                skipped += 1
            i += 1
            # Report progress and commit every 500 pairs.
            if i % 500 == 0:
                pct = 100 * i / total_pairs
                print(f" [{i}/{total_pairs} {pct:.0f}%] cross-matched={matched}", flush=True)
                db.commit()

    db.commit()
    db.close()
    print(f"\nDone. {matched} cross-video pairs matched, {skipped} below threshold.")
    print("Now delete my_scene/sparse/* and re-run colmap mapper.")


if __name__ == "__main__":
    main()

172
match_features.py Normal file

@@ -0,0 +1,172 @@
#!/usr/bin/env python3
"""
Python replacement for COLMAP's crashing exhaustive_matcher on Apple Silicon.

Reads SIFT features from the COLMAP SQLite database, matches them with
OpenCV BFMatcher (Lowe ratio test), verifies with RANSAC, and writes
matches + two_view_geometries back to the database.

COLMAP's mapper reads two_view_geometries, so there is no need to re-run
any COLMAP matcher binary after this script.
"""
import argparse
import sqlite3
import numpy as np
import cv2
import sys
from pathlib import Path

MIN_INLIERS = 15     # reject pairs with fewer verified matches
RATIO_TEST = 0.75    # Lowe's ratio threshold
RANSAC_ERROR = 4.0   # max reprojection error in pixels for RANSAC

# COLMAP 4.x pair_id formula: kMaxNumImages * min(id1,id2) + max(id1,id2)
KMAX = 2_147_483_647


def pair_id(id1: int, id2: int) -> int:
    lo, hi = (id1, id2) if id1 < id2 else (id2, id1)
    return KMAX * lo + hi


def read_images(cur):
    cur.execute("SELECT image_id, name FROM images ORDER BY name")
    return cur.fetchall()  # [(image_id, name), ...]


def load_all(cur, image_ids):
    """Load descriptors and keypoint x, y positions for every image into memory."""
    descs, kpts = {}, {}
    for iid in image_ids:
        cur.execute("SELECT rows, cols, data FROM descriptors WHERE image_id=?", (iid,))
        r = cur.fetchone()
        if r:
            descs[iid] = np.frombuffer(r[2], dtype=np.uint8).reshape(r[0], r[1])
        else:
            descs[iid] = np.zeros((0, 128), dtype=np.uint8)

        cur.execute("SELECT rows, cols, data FROM keypoints WHERE image_id=?", (iid,))
        r = cur.fetchone()
        if r:
            kp = np.frombuffer(r[2], dtype=np.float32).reshape(r[0], r[1])
            kpts[iid] = kp[:, :2]  # x, y
        else:
            kpts[iid] = np.zeros((0, 2), dtype=np.float32)
    return descs, kpts


def match_pair(desc1, desc2, kp1, kp2):
    """Match two images; return (inlier index pairs, F) or (None, None)."""
    if len(desc1) < 8 or len(desc2) < 8:
        return None, None

    bf = cv2.BFMatcher(cv2.NORM_L2)
    raw = bf.knnMatch(desc1.astype(np.float32), desc2.astype(np.float32), k=2)

    # Lowe ratio test: keep a match only if it clearly beats the runner-up.
    good = []
    for m_pair in raw:
        if len(m_pair) == 2:
            m, n = m_pair
            if m.distance < RATIO_TEST * n.distance:
                good.append((m.queryIdx, m.trainIdx))
    if len(good) < MIN_INLIERS:
        return None, None

    arr = np.array(good, dtype=np.uint32)
    pts1 = kp1[arr[:, 0]]
    pts2 = kp2[arr[:, 1]]

    # Geometric verification with a RANSAC-estimated fundamental matrix.
    F, mask = cv2.findFundamentalMat(
        pts1, pts2, cv2.FM_RANSAC,
        ransacReprojThreshold=RANSAC_ERROR,
        confidence=0.9999,
        maxIters=2000,
    )
    if F is None or mask is None:
        return None, None

    inliers = arr[mask.ravel().astype(bool)]
    if len(inliers) < MIN_INLIERS:
        return None, None
    return inliers, F


def write_pair(cur, pid, inliers, F):
    """Write verified matches to the matches and two_view_geometries tables."""
    blob = inliers.astype(np.uint32).tobytes()
    zeros9 = np.zeros(9, dtype=np.float64).tobytes()
    zeros4 = np.zeros(4, dtype=np.float64).tobytes()
    zeros3 = np.zeros(3, dtype=np.float64).tobytes()
    F_blob = F.flatten().astype(np.float64).tobytes()

    cur.execute(
        "INSERT OR REPLACE INTO matches (pair_id, rows, cols, data) VALUES (?,?,?,?)",
        (pid, len(inliers), 2, blob),
    )
    cur.execute(
        "INSERT OR REPLACE INTO two_view_geometries "
        "(pair_id, rows, cols, data, config, F, E, H, qvec, tvec) "
        "VALUES (?,?,?,?,?,?,?,?,?,?)",
        (pid, len(inliers), 2, blob,
         3,  # UNCALIBRATED: uses the F matrix
         F_blob, zeros9, zeros9, zeros4, zeros3),
    )


def sequential_pairs(ids, overlap):
    """Pair each image with the next `overlap` images in name order."""
    pairs = []
    n = len(ids)
    for i in range(n):
        for j in range(i + 1, min(i + overlap + 1, n)):
            pairs.append((ids[i], ids[j]))
    return pairs


def main():
    p = argparse.ArgumentParser()
    p.add_argument("--db", default="my_scene/database.db")
    p.add_argument("--overlap", type=int, default=50)
    args = p.parse_args()

    db_path = args.db
    db = sqlite3.connect(db_path)
    db.execute("PRAGMA journal_mode=WAL")
    cur = db.cursor()

    images = read_images(cur)
    ids = [r[0] for r in images]
    print(f"Images: {len(ids)}")

    print("Loading descriptors & keypoints into memory…")
    descs, kpts = load_all(cur, ids)
    total_feats = sum(len(d) for d in descs.values())
    print(f"Loaded {total_feats:,} keypoints total")

    overlap = args.overlap
    pairs = sequential_pairs(ids, overlap)
    print(f"Pairs to match: {len(pairs)} (sequential overlap={overlap})")

    matched = skipped = 0
    for i, (id1, id2) in enumerate(pairs):
        if i % 200 == 0:
            pct = 100 * i / len(pairs)
            print(f" [{i}/{len(pairs)} {pct:.0f}%] matched={matched}", flush=True)

        inliers, F = match_pair(descs[id1], descs[id2], kpts[id1], kpts[id2])
        if inliers is not None:
            write_pair(cur, pair_id(id1, id2), inliers, F)
            matched += 1
        else:
            skipped += 1

        # Commit periodically so an interrupted run keeps its progress.
        if i % 500 == 0:
            db.commit()

    db.commit()
    db.close()
    print(f"\nDone. {matched} pairs matched, {skipped} below threshold.")
    print(f"Now run: colmap mapper --database_path {db_path} "
          f"--image_path my_scene/images --output_path my_scene/sparse")


if __name__ == "__main__":
    main()

5
run.sh Executable file

@@ -0,0 +1,5 @@
python main.py \
    --video1 1.mp4 \
    --video2 2.mp4 \
    --output_dir ./my_scene \
    --fps 2

120
train_splat.sh Executable file

@@ -0,0 +1,120 @@
#!/usr/bin/env bash
# Re-runnable pipeline: COLMAP output → Nerfstudio → splatfacto → .ply
# Skips COLMAP (assumes at least one model already exists under my_scene/sparse/).
set -euo pipefail
cd "$(dirname "$0")"
source venv/bin/activate
# Use the full Homebrew ffmpeg (nerfstudio's bundled one lacks split/fps filters)
export PATH="/opt/homebrew/opt/ffmpeg/bin:$PATH"
SCENE=my_scene
NS_DATA=$SCENE/ns_data
EXPORT_DIR=$SCENE/exports
PLY=$EXPORT_DIR/splat.ply
# ── 1. Verify COLMAP output ────────────────────────────────────────────────
echo ""
echo "=== Step 1: Verifying COLMAP output ==="
if [ ! -d "$SCENE/sparse" ] || [ -z "$(ls -A $SCENE/sparse 2>/dev/null)" ]; then
    echo "ERROR: $SCENE/sparse/ not found or empty. Run main.py + match_crossvideo.py first."
    exit 1
fi

# Pick the model with the most registered images
# (the first 8 bytes of images.bin hold the image count as a little-endian uint64)
BEST_MODEL=$(python3 -c "
import struct, os, sys
best_dir, best_imgs = '', 0
for m in sorted(os.listdir('$SCENE/sparse')):
    d = '$SCENE/sparse/' + m
    f = d + '/images.bin'
    if not os.path.isfile(f): continue
    with open(f,'rb') as fh: n = struct.unpack('<Q', fh.read(8))[0]
    if n > best_imgs: best_imgs, best_dir = n, d
print(best_dir)
")
if [ -z "$BEST_MODEL" ]; then
    echo "ERROR: no valid COLMAP model found in $SCENE/sparse/"
    exit 1
fi

for f in cameras.bin images.bin points3D.bin; do
    if [ ! -f "$BEST_MODEL/$f" ]; then
        echo "ERROR: missing $BEST_MODEL/$f"
        exit 1
    fi
done

NUM_IMGS=$(python3 -c "import struct; f=open('$BEST_MODEL/images.bin','rb'); print(struct.unpack('<Q',f.read(8))[0])")
NUM_PTS=$(python3 -c "import struct; f=open('$BEST_MODEL/points3D.bin','rb'); print(struct.unpack('<Q',f.read(8))[0])")
echo " Best model: $BEST_MODEL (images=$NUM_IMGS points3D=$NUM_PTS)"

num_models=$(find "$SCENE/sparse" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')
if [ "$num_models" -gt 1 ]; then
    echo " WARNING: $num_models disconnected models; using largest ($BEST_MODEL)."
    echo " Run match_crossvideo.py and re-map to attempt a full stitch."
fi
# ── 2. Convert COLMAP → Nerfstudio format ─────────────────────────────────
echo ""
echo "=== Step 2: ns-process-data (COLMAP → Nerfstudio) ==="
ns-process-data images \
    --data "$(pwd)/$SCENE/images" \
    --output-dir "$(pwd)/$NS_DATA" \
    --skip-colmap \
    --colmap-model-path "$(pwd)/$BEST_MODEL"
# ── 3. Train splatfacto with browser viewer ────────────────────────────────
echo ""
echo "=== Step 3: Training splatfacto ==="
echo ""
echo " ┌──────────────────────────────────────────────────────┐"
echo " │ Live viewer (fly around during training): │"
echo " │ http://localhost:7007 │"
echo " └──────────────────────────────────────────────────────┘"
echo ""
ns-train splatfacto \
    --data "$NS_DATA" \
    --vis viewer \
    --viewer.quit-on-train-completion True
# ── 4. Find the latest training output ────────────────────────────────────
TRAIN_OUT=$(ls -td outputs/*/splatfacto/*/ 2>/dev/null | head -1)
if [ -z "$TRAIN_OUT" ]; then
    echo "ERROR: could not find training output folder under outputs/"
    exit 1
fi
CONFIG_PATH="$TRAIN_OUT/config.yml"
echo " Training output: $TRAIN_OUT"

# ── 5. Export .ply ────────────────────────────────────────────────────────
echo ""
echo "=== Step 4: Exporting Gaussian splat to .ply ==="
mkdir -p "$EXPORT_DIR"
ns-export gaussian-splat \
    --load-config "$CONFIG_PATH" \
    --output-dir "$EXPORT_DIR"
# ── 6. Final summary ──────────────────────────────────────────────────────
echo ""
echo "======================================================================"
if [ -f "$PLY" ]; then
    echo " .ply exported: $(pwd)/$PLY"
else
    echo " WARNING: splat.ply not found at $PLY; check $EXPORT_DIR/"
fi
echo ""
echo " View during training : http://localhost:7007"
echo ""
echo " View final .ply (Option A — recommended):"
echo " Drag $(pwd)/$PLY into:"
echo " https://playcanvas.com/supersplat/editor"
echo " Runs 100% in-browser; the file stays on your machine."
echo ""
echo " View final .ply (Option B — fully offline):"
echo " python3 -m http.server 8080 --directory \$(dirname $PLY)"
echo " Then open http://localhost:8080/splat.ply in gsplat viewer"
echo " (requires a separate gsplat.js page — Option A is simpler)"
echo "======================================================================"