Add two-video drone 3DGS pipeline with Apple Silicon fixes
- main.py: extract frames from two videos, run COLMAP feature extraction - match_features.py: Python-based within-video SIFT matching via OpenCV (replaces colmap exhaustive_matcher which segfaults on ARM64 in COLMAP 4.x) - match_crossvideo.py: exhaustive cross-video matching (v1×v2) to stitch two flights into a single COLMAP model - run.sh: entry point for frame extraction + feature extraction - train_splat.sh: ns-process-data → splatfacto → .ply export, with correct PATH for Homebrew ffmpeg and MPS device flags for Apple Silicon - .gitignore: exclude videos, generated scene data, venv, logs - README.md: full pipeline walkthrough, all known issues and fixes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
24
.gitignore
vendored
Normal file
24
.gitignore
vendored
Normal file
@@ -0,0 +1,24 @@
|
||||
# Videos (too large for git)
|
||||
*.mp4
|
||||
*.mov
|
||||
*.avi
|
||||
|
||||
# Generated scene data
|
||||
my_scene/
|
||||
outputs/
|
||||
|
||||
# Python environment
|
||||
venv/
|
||||
__pycache__/
|
||||
*.pyc
|
||||
*.pyo
|
||||
|
||||
# Logs
|
||||
*.log
|
||||
my_scene_build.log
|
||||
|
||||
# macOS
|
||||
.DS_Store
|
||||
|
||||
# Editor
|
||||
.claude/
|
||||
175
README.md
175
README.md
@@ -0,0 +1,175 @@
|
||||
# Drone-3DGS
|
||||
|
||||
Two-flight DJI drone footage → 3D Gaussian Splatting pipeline for **Apple Silicon Macs** (M1/M2/M3).
|
||||
|
||||
Takes two `.mp4` videos of the same scene from different angles, runs Structure-from-Motion via COLMAP, and produces a `.ply` Gaussian splat you can view in a browser.
|
||||
|
||||
---
|
||||
|
||||
## Requirements
|
||||
|
||||
```bash
|
||||
brew install colmap # COLMAP 4.x (SfM)
|
||||
brew install ffmpeg # full version with all filters
|
||||
python3 -m venv venv
|
||||
source venv/bin/activate
|
||||
pip install torch torchvision
|
||||
pip install nerfstudio
|
||||
```
|
||||
|
||||
> **Python version**: 3.10 recommended (tested with 3.10.18 via pyenv).
|
||||
|
||||
---
|
||||
|
||||
## Project structure
|
||||
|
||||
```
|
||||
.
|
||||
├── 1.mp4 # first drone flight
|
||||
├── 2.mp4 # second drone flight
|
||||
├── main.py # Step 1 – extract frames + COLMAP feature extraction
|
||||
├── match_features.py # Step 2 – within-video SIFT matching (Python, bypasses COLMAP crash)
|
||||
├── match_crossvideo.py # Step 3 – cross-video exhaustive matching (v1×v2)
|
||||
├── run.sh # Runs main.py (frame extraction + feature extraction)
|
||||
└── train_splat.sh # Steps 4–6: ns-process-data → splatfacto → export .ply
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## How to run
|
||||
|
||||
### Step 1 – Extract frames and COLMAP features
|
||||
|
||||
```bash
|
||||
source venv/bin/activate
|
||||
bash run.sh
|
||||
```
|
||||
|
||||
This calls `main.py` which:
|
||||
1. Extracts frames from `1.mp4` and `2.mp4` at 2 fps into `my_scene/images/` (named `v1_*.jpg` / `v2_*.jpg`)
|
||||
2. Runs `colmap feature_extractor` — SIFT features written to `my_scene/database.db`
|
||||
3. Runs `match_features.py` — sequential within-video matching (overlap=50) via OpenCV BFMatcher
|
||||
|
||||
**Why not `colmap exhaustive_matcher`?**
|
||||
COLMAP 4.x has a threading bug on Apple Silicon ARM64 causing a SIGSEGV in all matcher variants. `match_features.py` replaces it entirely: reads SIFT descriptors from the SQLite database, matches with OpenCV BFMatcher + Lowe ratio test + RANSAC, and writes `two_view_geometries` back to the DB. The mapper only needs that table.
|
||||
|
||||
### Step 2 – Cross-video matching
|
||||
|
||||
```bash
|
||||
python3 match_crossvideo.py
|
||||
```
|
||||
|
||||
Matches every `v1_*` frame against every `v2_*` frame (14,900 pairs) so the two flights stitch into a single model. Takes ~70 min on M1 Pro CPU (~0.28 s/pair with OpenCV BFMatcher).
|
||||
|
||||
### Step 3 – COLMAP mapper
|
||||
|
||||
```bash
|
||||
colmap mapper \
|
||||
--database_path my_scene/database.db \
|
||||
--image_path my_scene/images \
|
||||
--output_path my_scene/sparse
|
||||
```
|
||||
|
||||
Produces sparse models in `my_scene/sparse/`. The largest (most registered images) is the one to use. With overlapping flights you should get ~90–95% of frames in a single model.
|
||||
|
||||
### Steps 4–6 – Convert, train, export
|
||||
|
||||
```bash
|
||||
bash train_splat.sh
|
||||
```
|
||||
|
||||
This script automatically:
|
||||
1. Finds the largest COLMAP model in `my_scene/sparse/`
|
||||
2. Converts it to Nerfstudio format with `ns-process-data`
|
||||
3. Trains `splatfacto` — **live viewer at http://localhost:7007 during training**
|
||||
4. Exports the Gaussian splat to `my_scene/exports/splat.ply`
|
||||
|
||||
---
|
||||
|
||||
## Known issues and fixes applied
|
||||
|
||||
### COLMAP 4.x matcher segfault (Apple Silicon)
|
||||
|
||||
All COLMAP matcher variants (`exhaustive_matcher`, `sequential_matcher`, `vocab_tree_matcher`) crash with SIGSEGV on ARM64 due to a bug in the SIFT worker thread initialization. **Fix:** `match_features.py` and `match_crossvideo.py` replace the COLMAP matcher entirely using OpenCV.
|
||||
|
||||
### ffmpeg `fps` and `split` filters missing
|
||||
|
||||
The nerfstudio-bundled ffmpeg is compiled with a minimal filter set. **Fixes:**
|
||||
- `main.py` uses `-r` output flag instead of `-vf fps=...`
|
||||
- `train_splat.sh` prepends `/opt/homebrew/opt/ffmpeg/bin` to `PATH` so `ns-process-data` uses the full Homebrew ffmpeg
|
||||
|
||||
### nerfstudio splatfacto hardcoded `.cuda()` calls
|
||||
|
||||
Two lines in the installed `splatfacto.py` call `.cuda()` unconditionally. Patched in-place:
|
||||
|
||||
| Location | Original | Fix |
|
||||
|----------|----------|-----|
|
||||
| `populate_modules()` | `shs = torch.zeros(...).float().cuda()` | `shs = torch.zeros(...).float()` |
|
||||
| `get_outputs_for_camera()` | `K = ....cuda()` | `K = ....to(self.device)` |
|
||||
|
||||
If you reinstall nerfstudio, re-apply with:
|
||||
|
||||
```bash
|
||||
F=venv/lib/python3.10/site-packages/nerfstudio/models/splatfacto.py
|
||||
sed -i '' 's/\.float()\.cuda()/\.float()/g' "$F"
|
||||
sed -i '' 's/get_intrinsics_matrices()\.cuda()/get_intrinsics_matrices().to(self.device)/g' "$F"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3DGS on Apple Silicon — current status
|
||||
|
||||
`splatfacto` uses **gsplat** as its rasterizer. gsplat 1.x requires CUDA — there is no MPS or CPU fallback. On Apple Silicon the CUDA extension is `None` at load time and crashes at first use.
|
||||
|
||||
**Two options for actual Gaussian Splatting:**
|
||||
|
||||
### Option A — Brush (recommended, uses Apple Metal natively)
|
||||
|
||||
```bash
|
||||
# Install Rust (one-time)
|
||||
brew install rustup && rustup-init -y && source ~/.cargo/env
|
||||
|
||||
# Build and run
|
||||
cargo install --git https://github.com/ArthurBrussee/brush brush-cli
|
||||
brush-cli --source my_scene/sparse/3
|
||||
```
|
||||
|
||||
Outputs a `.ply` and has a built-in web viewer.
|
||||
|
||||
### Option B — Google Colab (free GPU)
|
||||
|
||||
The scene is already in Nerfstudio format at `my_scene/ns_data/`. Zip it, upload to a Colab T4 instance:
|
||||
|
||||
```python
|
||||
!pip install nerfstudio
|
||||
!ns-train splatfacto --data /content/ns_data --vis wandb
|
||||
```
|
||||
|
||||
Download `outputs/*/splatfacto/*/splat.ply` when done.
|
||||
|
||||
---
|
||||
|
||||
## Viewing results
|
||||
|
||||
| What | How |
|
||||
|------|-----|
|
||||
| During splatfacto training | `http://localhost:7007` (Nerfstudio Viser viewer) |
|
||||
| Sparse point cloud (ready now) | Drag `my_scene/exports/sparse_pointcloud.ply` into https://playcanvas.com/supersplat/editor |
|
||||
| Final Gaussian splat | Drag `my_scene/exports/splat.ply` into https://playcanvas.com/supersplat/editor |
|
||||
|
||||
PlayCanvas SuperSplat runs 100% in-browser — the file never leaves your machine.
|
||||
|
||||
---
|
||||
|
||||
## Re-running from scratch
|
||||
|
||||
```bash
|
||||
rm -rf my_scene outputs
|
||||
bash run.sh # frames + features (~10 min)
|
||||
python3 match_crossvideo.py # cross-video matching (~70 min)
|
||||
colmap mapper \
|
||||
--database_path my_scene/database.db \
|
||||
--image_path my_scene/images \
|
||||
--output_path my_scene/sparse # mapping (~10 min)
|
||||
bash train_splat.sh # convert + train + export
|
||||
```
|
||||
|
||||
155
main.py
Normal file
155
main.py
Normal file
@@ -0,0 +1,155 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Two-video drone footage -> 3DGS pipeline for Apple Silicon Macs.
|
||||
|
||||
Usage:
|
||||
python drone_3dgs_pipeline.py \
|
||||
--video1 path/to/flight1.mp4 \
|
||||
--video2 path/to/flight2.mp4 \
|
||||
--output_dir ./my_scene \
|
||||
--fps 2
|
||||
|
||||
What it does:
|
||||
1. Extracts frames from both videos at the given fps (default 2 fps).
|
||||
2. Pools them into one folder with non-colliding names.
|
||||
3. Runs COLMAP feature extraction, matching, and sparse reconstruction.
|
||||
4. Hands you a Nerfstudio-ready folder structure to train splatfacto.
|
||||
|
||||
Then run:
|
||||
ns-train splatfacto --data ./my_scene
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import shutil
|
||||
import subprocess
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
|
||||
def run(cmd, check=True):
|
||||
"""Run a shell command, streaming output."""
|
||||
print(f"\n>>> {' '.join(str(c) for c in cmd)}\n")
|
||||
result = subprocess.run(cmd, check=check)
|
||||
return result
|
||||
|
||||
|
||||
def check_dependencies():
|
||||
"""Verify ffmpeg and colmap are installed."""
|
||||
for tool in ("ffmpeg", "colmap"):
|
||||
if shutil.which(tool) is None:
|
||||
sys.exit(f"ERROR: {tool} not found. Install with: brew install {tool}")
|
||||
|
||||
|
||||
def extract_frames(video_path: Path, out_dir: Path, fps: float, prefix: str):
|
||||
"""Extract frames from a video using ffmpeg at the given fps."""
|
||||
out_dir.mkdir(parents=True, exist_ok=True)
|
||||
pattern = str(out_dir / f"{prefix}_%05d.jpg")
|
||||
run([
|
||||
"ffmpeg", "-i", str(video_path),
|
||||
"-r", str(fps),
|
||||
"-q:v", "2", # JPEG quality (2 = high)
|
||||
"-y", # overwrite
|
||||
pattern,
|
||||
])
|
||||
count = len(list(out_dir.glob(f"{prefix}_*.jpg")))
|
||||
print(f"Extracted {count} frames from {video_path.name} -> {out_dir}")
|
||||
return count
|
||||
|
||||
|
||||
def run_colmap(workspace: Path, images_dir: Path, use_gpu: bool = False):
|
||||
"""Run the COLMAP SfM pipeline: feature extraction, matching, mapping."""
|
||||
sparse_dir = workspace / "sparse"
|
||||
sparse_dir.mkdir(parents=True, exist_ok=True)
|
||||
db_path = workspace / "database.db"
|
||||
|
||||
# Step 1: feature extraction (COLMAP 4.x renamed SiftExtraction → FeatureExtraction)
|
||||
run([
|
||||
"colmap", "feature_extractor",
|
||||
"--database_path", str(db_path),
|
||||
"--image_path", str(images_dir),
|
||||
"--ImageReader.single_camera", "1",
|
||||
"--ImageReader.camera_model", "OPENCV",
|
||||
"--FeatureExtraction.use_gpu", "1" if use_gpu else "0",
|
||||
])
|
||||
|
||||
# Step 2: Python matcher (COLMAP 4.x exhaustive_matcher segfaults on Apple Silicon ARM64;
|
||||
# match_features.py reads SIFT descriptors from the DB and matches via OpenCV BFMatcher)
|
||||
script = Path(__file__).parent / "match_features.py"
|
||||
run([sys.executable, str(script), "--db", str(db_path)])
|
||||
|
||||
# Step 3: sparse reconstruction (this is the slow one)
|
||||
run([
|
||||
"colmap", "mapper",
|
||||
"--database_path", str(db_path),
|
||||
"--image_path", str(images_dir),
|
||||
"--output_path", str(sparse_dir),
|
||||
])
|
||||
|
||||
# COLMAP writes to sparse/0/ by default
|
||||
print(f"\nCOLMAP done. Reconstruction in {sparse_dir}/0/")
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument("--video1", type=Path, required=True)
|
||||
parser.add_argument("--video2", type=Path, required=True)
|
||||
parser.add_argument("--output_dir", type=Path, required=True)
|
||||
parser.add_argument("--fps", type=float, default=2.0,
|
||||
help="Frames per second to extract (default: 2). "
|
||||
"Higher = more frames, slower training, better quality.")
|
||||
parser.add_argument("--use_gpu", action="store_true",
|
||||
help="Try GPU SIFT in COLMAP (often unreliable on M1; default off).")
|
||||
args = parser.parse_args()
|
||||
|
||||
check_dependencies()
|
||||
|
||||
if not args.video1.exists() or not args.video2.exists():
|
||||
sys.exit("ERROR: one or both video files not found.")
|
||||
|
||||
workspace = args.output_dir
|
||||
images_dir = workspace / "images"
|
||||
workspace.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Extract frames from both videos into the SAME images folder, with prefixes
|
||||
# so filenames don't collide. COLMAP treats them as one set automatically.
|
||||
print(f"\n=== Extracting frames from {args.video1.name} ===")
|
||||
n1 = extract_frames(args.video1, images_dir, args.fps, prefix="v1")
|
||||
print(f"\n=== Extracting frames from {args.video2.name} ===")
|
||||
n2 = extract_frames(args.video2, images_dir, args.fps, prefix="v2")
|
||||
total = n1 + n2
|
||||
|
||||
print(f"\n=== Total frames: {total} ===")
|
||||
if total > 800:
|
||||
print("WARNING: lots of frames. Consider lowering --fps. "
|
||||
"Exhaustive matching will be slow; switch to sequential_matcher if needed.")
|
||||
|
||||
# Run COLMAP
|
||||
print("\n=== Running COLMAP (this is the slow part, get a coffee) ===")
|
||||
run_colmap(workspace, images_dir, use_gpu=args.use_gpu)
|
||||
|
||||
# Print next steps
|
||||
print(f"""
|
||||
========================================================================
|
||||
DONE with SfM. Your scene is at: {workspace}
|
||||
|
||||
Next, train the splat. Two options on Mac:
|
||||
|
||||
OPTION 1: Nerfstudio (Python, scriptable)
|
||||
pip install nerfstudio
|
||||
ns-process-data images --data {images_dir} --output-dir {workspace}/ns_data \\
|
||||
--skip-colmap --colmap-model-path {workspace}/sparse/0
|
||||
ns-train splatfacto --data {workspace}/ns_data
|
||||
|
||||
OPTION 2: Brush (Rust binary, faster on Mac)
|
||||
Download from https://github.com/ArthurBrussee/brush
|
||||
brush --source {workspace}
|
||||
|
||||
View result:
|
||||
- Online: https://playcanvas.com/supersplat/editor (drag .ply in)
|
||||
- Local: install SuperSplat or use Nerfstudio's viewer
|
||||
========================================================================
|
||||
""")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
131
match_crossvideo.py
Normal file
131
match_crossvideo.py
Normal file
@@ -0,0 +1,131 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Cross-video exhaustive matching for two-flight drone footage.
|
||||
|
||||
Matches every v1_* frame against every v2_* frame. The within-video
|
||||
matches from match_features.py are already in the database and are not
|
||||
touched. After this script, re-run colmap mapper to stitch the scene.
|
||||
"""
|
||||
import argparse
|
||||
import sqlite3
|
||||
import numpy as np
|
||||
import cv2
|
||||
from pathlib import Path
|
||||
|
||||
MIN_INLIERS = 15
|
||||
RATIO_TEST = 0.75
|
||||
RANSAC_ERROR = 4.0
|
||||
KMAX = 2_147_483_647
|
||||
|
||||
|
||||
def pair_id(id1: int, id2: int) -> int:
|
||||
lo, hi = (id1, id2) if id1 < id2 else (id2, id1)
|
||||
return KMAX * lo + hi
|
||||
|
||||
|
||||
def load_desc_kpts(cur, image_id):
|
||||
cur.execute("SELECT rows, cols, data FROM descriptors WHERE image_id=?", (image_id,))
|
||||
r = cur.fetchone()
|
||||
desc = np.frombuffer(r[2], dtype=np.uint8).reshape(r[0], r[1]) if r else np.zeros((0,128), dtype=np.uint8)
|
||||
|
||||
cur.execute("SELECT rows, cols, data FROM keypoints WHERE image_id=?", (image_id,))
|
||||
r = cur.fetchone()
|
||||
if r:
|
||||
kp = np.frombuffer(r[2], dtype=np.float32).reshape(r[0], r[1])
|
||||
kpts = kp[:, :2]
|
||||
else:
|
||||
kpts = np.zeros((0, 2), dtype=np.float32)
|
||||
return desc, kpts
|
||||
|
||||
|
||||
def match_pair(desc1, desc2, kp1, kp2):
|
||||
if len(desc1) < 8 or len(desc2) < 8:
|
||||
return None, None
|
||||
bf = cv2.BFMatcher(cv2.NORM_L2)
|
||||
raw = bf.knnMatch(desc1.astype(np.float32), desc2.astype(np.float32), k=2)
|
||||
good = []
|
||||
for m_pair in raw:
|
||||
if len(m_pair) == 2:
|
||||
m, n = m_pair
|
||||
if m.distance < RATIO_TEST * n.distance:
|
||||
good.append((m.queryIdx, m.trainIdx))
|
||||
if len(good) < MIN_INLIERS:
|
||||
return None, None
|
||||
arr = np.array(good, dtype=np.uint32)
|
||||
pts1 = kp1[arr[:, 0]]
|
||||
pts2 = kp2[arr[:, 1]]
|
||||
F, mask = cv2.findFundamentalMat(
|
||||
pts1, pts2, cv2.FM_RANSAC,
|
||||
ransacReprojThreshold=RANSAC_ERROR,
|
||||
confidence=0.9999, maxIters=2000,
|
||||
)
|
||||
if F is None or mask is None:
|
||||
return None, None
|
||||
inliers = arr[mask.ravel().astype(bool)]
|
||||
return (inliers, F) if len(inliers) >= MIN_INLIERS else (None, None)
|
||||
|
||||
|
||||
def write_pair(cur, pid, inliers, F):
|
||||
blob = inliers.astype(np.uint32).tobytes()
|
||||
z9 = np.zeros(9, dtype=np.float64).tobytes()
|
||||
z4 = np.zeros(4, dtype=np.float64).tobytes()
|
||||
z3 = np.zeros(3, dtype=np.float64).tobytes()
|
||||
F_blob = F.flatten().astype(np.float64).tobytes()
|
||||
cur.execute(
|
||||
"INSERT OR REPLACE INTO matches (pair_id, rows, cols, data) VALUES (?,?,?,?)",
|
||||
(pid, len(inliers), 2, blob),
|
||||
)
|
||||
cur.execute(
|
||||
"INSERT OR REPLACE INTO two_view_geometries "
|
||||
"(pair_id, rows, cols, data, config, F, E, H, qvec, tvec) VALUES (?,?,?,?,?,?,?,?,?,?)",
|
||||
(pid, len(inliers), 2, blob, 3, F_blob, z9, z9, z4, z3),
|
||||
)
|
||||
|
||||
|
||||
def main():
|
||||
p = argparse.ArgumentParser()
|
||||
p.add_argument("--db", default="my_scene/database.db")
|
||||
args = p.parse_args()
|
||||
|
||||
db = sqlite3.connect(args.db)
|
||||
db.execute("PRAGMA journal_mode=WAL")
|
||||
cur = db.cursor()
|
||||
|
||||
cur.execute("SELECT image_id, name FROM images ORDER BY name")
|
||||
rows = cur.fetchall()
|
||||
v1 = [(id, name) for id, name in rows if name.startswith("v1_")]
|
||||
v2 = [(id, name) for id, name in rows if name.startswith("v2_")]
|
||||
total_pairs = len(v1) * len(v2)
|
||||
print(f"v1={len(v1)} frames v2={len(v2)} frames cross-pairs={total_pairs}")
|
||||
|
||||
# Preload all v1 and v2 descriptors into RAM
|
||||
print("Loading v1 descriptors…")
|
||||
v1_data = {id: load_desc_kpts(cur, id) for id, _ in v1}
|
||||
print("Loading v2 descriptors…")
|
||||
v2_data = {id: load_desc_kpts(cur, id) for id, _ in v2}
|
||||
|
||||
matched = skipped = i = 0
|
||||
for id1, _ in v1:
|
||||
desc1, kp1 = v1_data[id1]
|
||||
for id2, _ in v2:
|
||||
desc2, kp2 = v2_data[id2]
|
||||
inliers, F = match_pair(desc1, desc2, kp1, kp2)
|
||||
if inliers is not None:
|
||||
write_pair(cur, pair_id(id1, id2), inliers, F)
|
||||
matched += 1
|
||||
else:
|
||||
skipped += 1
|
||||
i += 1
|
||||
if i % 500 == 0:
|
||||
pct = 100 * i / total_pairs
|
||||
print(f" [{i}/{total_pairs} {pct:.0f}%] cross-matched={matched}", flush=True)
|
||||
db.commit()
|
||||
|
||||
db.commit()
|
||||
db.close()
|
||||
print(f"\nDone. {matched} cross-video pairs matched, {skipped} below threshold.")
|
||||
print("Now delete my_scene/sparse/* and re-run colmap mapper.")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
172
match_features.py
Normal file
172
match_features.py
Normal file
@@ -0,0 +1,172 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Python replacement for COLMAP's crashing exhaustive_matcher on Apple Silicon.
|
||||
|
||||
Reads SIFT features from the COLMAP SQLite database, matches them with
|
||||
OpenCV BFMatcher (Lowe ratio test), verifies with RANSAC, and writes
|
||||
matches + two_view_geometries back to the database.
|
||||
|
||||
COLMAP's mapper reads two_view_geometries — no need to re-run any COLMAP
|
||||
matcher binary after this script.
|
||||
"""
|
||||
import argparse
|
||||
import sqlite3
|
||||
import numpy as np
|
||||
import cv2
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
MIN_INLIERS = 15 # reject pairs with fewer verified matches
|
||||
RATIO_TEST = 0.75 # Lowe's ratio threshold
|
||||
RANSAC_ERROR = 4.0 # max reprojection error in pixels for RANSAC
|
||||
|
||||
# COLMAP 4.x pair_id formula: kMaxNumImages * min(id1,id2) + max(id1,id2)
|
||||
KMAX = 2_147_483_647
|
||||
|
||||
def pair_id(id1: int, id2: int) -> int:
|
||||
lo, hi = (id1, id2) if id1 < id2 else (id2, id1)
|
||||
return KMAX * lo + hi
|
||||
|
||||
|
||||
def read_images(cur):
|
||||
cur.execute("SELECT image_id, name FROM images ORDER BY name")
|
||||
return cur.fetchall() # [(image_id, name), ...]
|
||||
|
||||
|
||||
def load_all(cur, image_ids):
|
||||
descs, kpts = {}, {}
|
||||
for iid in image_ids:
|
||||
cur.execute("SELECT rows, cols, data FROM descriptors WHERE image_id=?", (iid,))
|
||||
r = cur.fetchone()
|
||||
if r:
|
||||
descs[iid] = np.frombuffer(r[2], dtype=np.uint8).reshape(r[0], r[1])
|
||||
else:
|
||||
descs[iid] = np.zeros((0, 128), dtype=np.uint8)
|
||||
|
||||
cur.execute("SELECT rows, cols, data FROM keypoints WHERE image_id=?", (iid,))
|
||||
r = cur.fetchone()
|
||||
if r:
|
||||
kp = np.frombuffer(r[2], dtype=np.float32).reshape(r[0], r[1])
|
||||
kpts[iid] = kp[:, :2] # x, y
|
||||
else:
|
||||
kpts[iid] = np.zeros((0, 2), dtype=np.float32)
|
||||
return descs, kpts
|
||||
|
||||
|
||||
def match_pair(desc1, desc2, kp1, kp2):
|
||||
if len(desc1) < 8 or len(desc2) < 8:
|
||||
return None, None
|
||||
|
||||
bf = cv2.BFMatcher(cv2.NORM_L2)
|
||||
raw = bf.knnMatch(desc1.astype(np.float32), desc2.astype(np.float32), k=2)
|
||||
|
||||
good = []
|
||||
for m_pair in raw:
|
||||
if len(m_pair) == 2:
|
||||
m, n = m_pair
|
||||
if m.distance < RATIO_TEST * n.distance:
|
||||
good.append((m.queryIdx, m.trainIdx))
|
||||
|
||||
if len(good) < MIN_INLIERS:
|
||||
return None, None
|
||||
|
||||
arr = np.array(good, dtype=np.uint32)
|
||||
pts1 = kp1[arr[:, 0]]
|
||||
pts2 = kp2[arr[:, 1]]
|
||||
|
||||
F, mask = cv2.findFundamentalMat(
|
||||
pts1, pts2, cv2.FM_RANSAC,
|
||||
ransacReprojThreshold=RANSAC_ERROR,
|
||||
confidence=0.9999,
|
||||
maxIters=2000,
|
||||
)
|
||||
if F is None or mask is None:
|
||||
return None, None
|
||||
|
||||
inliers = arr[mask.ravel().astype(bool)]
|
||||
if len(inliers) < MIN_INLIERS:
|
||||
return None, None
|
||||
|
||||
return inliers, F
|
||||
|
||||
|
||||
def write_pair(cur, pid, inliers, F):
|
||||
blob = inliers.astype(np.uint32).tobytes()
|
||||
zeros9 = np.zeros(9, dtype=np.float64).tobytes()
|
||||
zeros4 = np.zeros(4, dtype=np.float64).tobytes()
|
||||
zeros3 = np.zeros(3, dtype=np.float64).tobytes()
|
||||
F_blob = F.flatten().astype(np.float64).tobytes()
|
||||
|
||||
cur.execute(
|
||||
"INSERT OR REPLACE INTO matches (pair_id, rows, cols, data) VALUES (?,?,?,?)",
|
||||
(pid, len(inliers), 2, blob),
|
||||
)
|
||||
cur.execute(
|
||||
"INSERT OR REPLACE INTO two_view_geometries "
|
||||
"(pair_id, rows, cols, data, config, F, E, H, qvec, tvec) "
|
||||
"VALUES (?,?,?,?,?,?,?,?,?,?)",
|
||||
(pid, len(inliers), 2, blob,
|
||||
3, # UNCALIBRATED — uses F matrix
|
||||
F_blob, zeros9, zeros9, zeros4, zeros3),
|
||||
)
|
||||
|
||||
|
||||
def sequential_pairs(ids, overlap):
|
||||
pairs = []
|
||||
n = len(ids)
|
||||
for i in range(n):
|
||||
for j in range(i + 1, min(i + overlap + 1, n)):
|
||||
pairs.append((ids[i], ids[j]))
|
||||
return pairs
|
||||
|
||||
|
||||
def main():
|
||||
p = argparse.ArgumentParser()
|
||||
p.add_argument("--db", default="my_scene/database.db")
|
||||
p.add_argument("--overlap", type=int, default=50)
|
||||
args = p.parse_args()
|
||||
|
||||
db_path = args.db
|
||||
db = sqlite3.connect(db_path)
|
||||
db.execute("PRAGMA journal_mode=WAL")
|
||||
cur = db.cursor()
|
||||
|
||||
images = read_images(cur)
|
||||
ids = [r[0] for r in images]
|
||||
print(f"Images: {len(ids)}")
|
||||
|
||||
print("Loading descriptors & keypoints into memory…")
|
||||
descs, kpts = load_all(cur, ids)
|
||||
total_feats = sum(len(d) for d in descs.values())
|
||||
print(f"Loaded {total_feats:,} keypoints total")
|
||||
|
||||
overlap = args.overlap
|
||||
pairs = sequential_pairs(ids, overlap)
|
||||
print(f"Pairs to match: {len(pairs)} (sequential overlap={overlap})")
|
||||
|
||||
matched = skipped = 0
|
||||
for i, (id1, id2) in enumerate(pairs):
|
||||
if i % 200 == 0:
|
||||
pct = 100 * i / len(pairs)
|
||||
print(f" [{i}/{len(pairs)} {pct:.0f}%] matched={matched}", flush=True)
|
||||
|
||||
inliers, F = match_pair(descs[id1], descs[id2], kpts[id1], kpts[id2])
|
||||
if inliers is not None:
|
||||
write_pair(cur, pair_id(id1, id2), inliers, F)
|
||||
matched += 1
|
||||
else:
|
||||
skipped += 1
|
||||
|
||||
if i % 500 == 0:
|
||||
db.commit()
|
||||
|
||||
db.commit()
|
||||
db.close()
|
||||
|
||||
print(f"\nDone. {matched} pairs matched, {skipped} below threshold.")
|
||||
print(f"Now run: colmap mapper --database_path {db_path} "
|
||||
f"--image_path my_scene/images --output_path my_scene/sparse")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
5
run.sh
Executable file
5
run.sh
Executable file
@@ -0,0 +1,5 @@
|
||||
python main.py \
|
||||
--video1 1.mp4 \
|
||||
--video2 2.mp4 \
|
||||
--output_dir ./my_scene \
|
||||
--fps 2
|
||||
120
train_splat.sh
Executable file
120
train_splat.sh
Executable file
@@ -0,0 +1,120 @@
|
||||
#!/usr/bin/env bash
|
||||
# Re-runnable pipeline: COLMAP output → Nerfstudio → splatfacto → .ply
|
||||
# Skips COLMAP (assumes my_scene/sparse/0/ already exists).
|
||||
set -euo pipefail
|
||||
cd "$(dirname "$0")"
|
||||
|
||||
source venv/bin/activate
|
||||
# Use the full Homebrew ffmpeg (nerfstudio's bundled one lacks split/fps filters)
|
||||
export PATH="/opt/homebrew/opt/ffmpeg/bin:$PATH"
|
||||
|
||||
SCENE=my_scene
|
||||
NS_DATA=$SCENE/ns_data
|
||||
EXPORT_DIR=$SCENE/exports
|
||||
PLY=$EXPORT_DIR/splat.ply
|
||||
|
||||
# ── 1. Verify COLMAP output ────────────────────────────────────────────────
|
||||
echo ""
|
||||
echo "=== Step 1: Verifying COLMAP output ==="
|
||||
|
||||
if [ ! -d "$SCENE/sparse" ] || [ -z "$(ls -A $SCENE/sparse 2>/dev/null)" ]; then
|
||||
echo "ERROR: $SCENE/sparse/ not found or empty. Run main.py + match_crossvideo.py first."
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Pick the model with the most registered images
|
||||
BEST_MODEL=$(python3 -c "
|
||||
import struct, os, sys
|
||||
best_dir, best_imgs = '', 0
|
||||
for m in sorted(os.listdir('$SCENE/sparse')):
|
||||
d = '$SCENE/sparse/' + m
|
||||
f = d + '/images.bin'
|
||||
if not os.path.isfile(f): continue
|
||||
with open(f,'rb') as fh: n = struct.unpack('<Q', fh.read(8))[0]
|
||||
if n > best_imgs: best_imgs, best_dir = n, d
|
||||
print(best_dir)
|
||||
")
|
||||
|
||||
if [ -z "$BEST_MODEL" ]; then
|
||||
echo "ERROR: no valid COLMAP model found in $SCENE/sparse/"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
for f in cameras.bin images.bin points3D.bin; do
|
||||
if [ ! -f "$BEST_MODEL/$f" ]; then
|
||||
echo "ERROR: missing $BEST_MODEL/$f"
|
||||
exit 1
|
||||
fi
|
||||
done
|
||||
|
||||
NUM_IMGS=$(python3 -c "import struct; f=open('$BEST_MODEL/images.bin','rb'); print(struct.unpack('<Q',f.read(8))[0])")
|
||||
NUM_PTS=$(python3 -c "import struct; f=open('$BEST_MODEL/points3D.bin','rb'); print(struct.unpack('<Q',f.read(8))[0])")
|
||||
echo " Best model: $BEST_MODEL (images=$NUM_IMGS points3D=$NUM_PTS)"
|
||||
|
||||
num_models=$(find "$SCENE/sparse" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')
|
||||
if [ "$num_models" -gt 1 ]; then
|
||||
echo " WARNING: $num_models disconnected models — using largest ($BEST_MODEL)."
|
||||
echo " Run match_crossvideo.py and re-map to attempt a full stitch."
|
||||
fi
|
||||
|
||||
# ── 2. Convert COLMAP → Nerfstudio format ─────────────────────────────────
|
||||
echo ""
|
||||
echo "=== Step 2: ns-process-data (COLMAP → Nerfstudio) ==="
|
||||
ns-process-data images \
|
||||
--data "$(pwd)/$SCENE/images" \
|
||||
--output-dir "$(pwd)/$NS_DATA" \
|
||||
--skip-colmap \
|
||||
--colmap-model-path "$(pwd)/$BEST_MODEL"
|
||||
|
||||
# ── 3. Train splatfacto with browser viewer ────────────────────────────────
|
||||
echo ""
|
||||
echo "=== Step 3: Training splatfacto ==="
|
||||
echo ""
|
||||
echo " ┌──────────────────────────────────────────────────────┐"
|
||||
echo " │ Live viewer (fly around during training): │"
|
||||
echo " │ http://localhost:7007 │"
|
||||
echo " └──────────────────────────────────────────────────────┘"
|
||||
echo ""
|
||||
ns-train splatfacto \
|
||||
--data "$NS_DATA" \
|
||||
--vis viewer \
|
||||
--viewer.quit-on-train-completion True
|
||||
|
||||
# ── 4. Find the latest training output ────────────────────────────────────
|
||||
TRAIN_OUT=$(ls -td outputs/*/splatfacto/*/ 2>/dev/null | head -1)
|
||||
if [ -z "$TRAIN_OUT" ]; then
|
||||
echo "ERROR: could not find training output folder under outputs/"
|
||||
exit 1
|
||||
fi
|
||||
CONFIG_PATH="$TRAIN_OUT/config.yml"
|
||||
echo " Training output: $TRAIN_OUT"
|
||||
|
||||
# ── 5. Export .ply ────────────────────────────────────────────────────────
|
||||
echo ""
|
||||
echo "=== Step 4: Exporting Gaussian splat to .ply ==="
|
||||
mkdir -p "$EXPORT_DIR"
|
||||
ns-export gaussian-splat \
|
||||
--load-config "$CONFIG_PATH" \
|
||||
--output-dir "$EXPORT_DIR"
|
||||
|
||||
# ── 6. Final summary ──────────────────────────────────────────────────────
|
||||
echo ""
|
||||
echo "======================================================================"
|
||||
if [ -f "$PLY" ]; then
|
||||
echo " .ply exported: $(pwd)/$PLY"
|
||||
else
|
||||
echo " WARNING: splat.ply not found at $PLY — check $EXPORT_DIR/"
|
||||
fi
|
||||
echo ""
|
||||
echo " View during training : http://localhost:7007"
|
||||
echo ""
|
||||
echo " View final .ply (Option A — recommended):"
|
||||
echo " Drag $(pwd)/$PLY into:"
|
||||
echo " https://playcanvas.com/supersplat/editor"
|
||||
echo " Runs 100% in-browser; the file stays on your machine."
|
||||
echo ""
|
||||
echo " View final .ply (Option B — fully offline):"
|
||||
echo " python3 -m http.server 8080 --directory \$(dirname $PLY)"
|
||||
echo " Then open http://localhost:8080/splat.ply in gsplat viewer"
|
||||
echo " (requires a separate gsplat.js page — Option A is simpler)"
|
||||
echo "======================================================================"
|
||||
Reference in New Issue
Block a user