Add two-video drone 3DGS pipeline with Apple Silicon fixes

- main.py: extract frames from two videos, run COLMAP feature extraction
- match_features.py: Python-based within-video SIFT matching via OpenCV
  (replaces colmap exhaustive_matcher which segfaults on ARM64 in COLMAP 4.x)
- match_crossvideo.py: exhaustive cross-video matching (v1×v2) to stitch
  two flights into a single COLMAP model
- run.sh: entry point for frame extraction + feature extraction
- train_splat.sh: ns-process-data → splatfacto → .ply export, with
  correct PATH for Homebrew ffmpeg and MPS device flags for Apple Silicon
- .gitignore: exclude videos, generated scene data, venv, logs
- README.md: full pipeline walkthrough, all known issues and fixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Jon
2026-05-26 15:09:30 +01:00
parent e0db1edbc6
commit 7f4cdd9459
7 changed files with 782 additions and 0 deletions

README.md

@@ -0,0 +1,175 @@
# Drone-3DGS
Two-flight DJI drone footage → 3D Gaussian Splatting pipeline for **Apple Silicon Macs** (M1/M2/M3).
Takes two `.mp4` videos of the same scene from different angles, runs Structure-from-Motion via COLMAP, and produces a `.ply` Gaussian splat you can view in a browser.
---
## Requirements
```bash
brew install colmap # COLMAP 4.x (SfM)
brew install ffmpeg # full version with all filters
python3 -m venv venv
source venv/bin/activate
pip install torch torchvision
pip install nerfstudio
```
> **Python version**: 3.10 recommended (tested with 3.10.18 via pyenv).
---
## Project structure
```
.
├── 1.mp4 # first drone flight
├── 2.mp4 # second drone flight
├── main.py # Step 1: extract frames + COLMAP feature extraction
├── match_features.py # Step 1: within-video SIFT matching, called by main.py (Python, bypasses COLMAP crash)
├── match_crossvideo.py # Step 2: cross-video exhaustive matching (v1×v2)
├── run.sh # Runs main.py (frame extraction + feature extraction)
└── train_splat.sh # Steps 4-6: ns-process-data → splatfacto → export .ply
```
---
## How to run
### Step 1: Extract frames and COLMAP features
```bash
source venv/bin/activate
bash run.sh
```
This calls `main.py` which:
1. Extracts frames from `1.mp4` and `2.mp4` at 2 fps into `my_scene/images/` (named `v1_*.jpg` / `v2_*.jpg`)
2. Runs `colmap feature_extractor` — SIFT features written to `my_scene/database.db`
3. Runs `match_features.py` — sequential within-video matching (overlap=50) via OpenCV BFMatcher
**Why not `colmap exhaustive_matcher`?**
COLMAP 4.x has a threading bug on Apple Silicon ARM64 causing a SIGSEGV in all matcher variants. `match_features.py` replaces it entirely: reads SIFT descriptors from the SQLite database, matches with OpenCV BFMatcher + Lowe ratio test + RANSAC, and writes `two_view_geometries` back to the DB. The mapper only needs that table.
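The pairing and bookkeeping side of this replacement can be sketched as follows. This is a minimal illustration, not the contents of `match_features.py`: it assumes `overlap=50` means "match each frame against the next 50 frames of the same video", and it assumes the `pair_id` encoding used by the `database.py` helper script shipped with COLMAP (worth verifying against your COLMAP version). The function names are hypothetical.

```python
from typing import Iterator, List, Tuple

# Assumption: same constant as COLMAP's database.py helper script.
MAX_IMAGE_ID = 2**31 - 1


def sequential_pairs(image_ids: List[int], overlap: int = 50) -> Iterator[Tuple[int, int]]:
    """Yield (a, b) for every frame a and each of the next `overlap` frames b."""
    for i, a in enumerate(image_ids):
        for b in image_ids[i + 1 : i + 1 + overlap]:
            yield a, b


def image_ids_to_pair_id(id1: int, id2: int) -> int:
    """Encode an unordered image pair as the single integer key used in the DB."""
    if id1 > id2:
        id1, id2 = id2, id1
    return id1 * MAX_IMAGE_ID + id2


def pair_id_to_image_ids(pair_id: int) -> Tuple[int, int]:
    """Inverse of image_ids_to_pair_id."""
    id2 = pair_id % MAX_IMAGE_ID
    id1 = (pair_id - id2) // MAX_IMAGE_ID
    return id1, id2
```

Each pair produced this way would then be matched (BFMatcher, ratio test, RANSAC) and the verified inliers stored under its `pair_id` in `two_view_geometries`.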
### Step 2: Cross-video matching
```bash
python3 match_crossvideo.py
```
Matches every `v1_*` frame against every `v2_*` frame (14,900 pairs) so the two flights stitch into a single model. This takes ~70 min on an M1 Pro CPU (~0.28 s/pair with OpenCV BFMatcher).
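To see where the pair count and runtime come from, here is a small sketch that enumerates the v1×v2 pairs from a list of frame names and estimates the wall-clock time. It is illustrative only: the function names are hypothetical, and the 0.28 s/pair default is simply the figure quoted above.

```python
from itertools import product
from typing import List, Tuple


def cross_video_pairs(names: List[str]) -> List[Tuple[str, str]]:
    """Pair every v1_* frame with every v2_* frame (Cartesian product)."""
    v1 = sorted(n for n in names if n.startswith("v1_"))
    v2 = sorted(n for n in names if n.startswith("v2_"))
    return list(product(v1, v2))


def estimated_minutes(num_pairs: int, seconds_per_pair: float = 0.28) -> float:
    """Rough runtime estimate, assuming a constant per-pair matching cost."""
    return num_pairs * seconds_per_pair / 60.0


frames = ["v1_0001.jpg", "v1_0002.jpg", "v2_0001.jpg", "v2_0002.jpg", "v2_0003.jpg"]
pairs = cross_video_pairs(frames)          # 2 x 3 = 6 pairs
minutes = estimated_minutes(14_900)        # roughly 70 minutes
```

Because the pair count grows with the product of the two frame counts, lowering the extraction frame rate is the most direct way to shorten this step.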
### Step 3: COLMAP mapper
```bash
colmap mapper \
--database_path my_scene/database.db \
--image_path my_scene/images \
--output_path my_scene/sparse
```
Produces sparse models in `my_scene/sparse/`. The largest (most registered images) is the one to use. With overlapping flights you should get ~90-95% of frames in a single model.
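One way to pick the largest model programmatically is sketched below. It assumes the mapper wrote binary models and that `images.bin` begins with a little-endian `uint64` holding the number of registered images, as in COLMAP's `read_write_model.py` helper. The function names are hypothetical, and `train_splat.sh` may select the model differently.

```python
import struct
from pathlib import Path
from typing import Optional, Tuple


def registered_image_count(model_dir: Path) -> int:
    """Read the leading uint64 of images.bin (assumed: number of registered images)."""
    with open(model_dir / "images.bin", "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        return 0  # truncated or empty file
    return struct.unpack("<Q", header)[0]


def largest_model(sparse_dir: Path) -> Optional[Tuple[Path, int]]:
    """Return (model_dir, count) for the sub-model with the most registered images."""
    best: Optional[Tuple[Path, int]] = None
    for sub in sorted(sparse_dir.iterdir()):
        if not (sub / "images.bin").is_file():
            continue  # skip anything that is not a binary COLMAP model
        count = registered_image_count(sub)
        if best is None or count > best[1]:
            best = (sub, count)
    return best
```

Calling `largest_model(Path("my_scene/sparse"))` returns `None` when no model is present, which is a useful signal that the mapper failed to register any images.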
### Steps 4-6: Convert, train, export
```bash
bash train_splat.sh
```
This script automatically:
1. Finds the largest COLMAP model in `my_scene/sparse/`
2. Converts it to Nerfstudio format with `ns-process-data`
3. Trains `splatfacto` (**live viewer at http://localhost:7007 during training**)
4. Exports the Gaussian splat to `my_scene/exports/splat.ply`
---
## Known issues and fixes applied
### COLMAP 4.x matcher segfault (Apple Silicon)
All COLMAP matcher variants (`exhaustive_matcher`, `sequential_matcher`, `vocab_tree_matcher`) crash with SIGSEGV on ARM64 due to a bug in the SIFT worker thread initialization. **Fix:** `match_features.py` and `match_crossvideo.py` replace the COLMAP matcher entirely using OpenCV.
### ffmpeg `fps` and `split` filters missing
The nerfstudio-bundled ffmpeg is compiled with a minimal filter set. **Fixes:**
- `main.py` uses `-r` output flag instead of `-vf fps=...`
- `train_splat.sh` prepends `/opt/homebrew/opt/ffmpeg/bin` to `PATH` so `ns-process-data` uses the full Homebrew ffmpeg
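The first workaround can be illustrated with a small command builder. This is a sketch rather than the code in `main.py`: the function name, the `%04d` numbering pattern, and the `-q:v 2` JPEG quality setting are assumptions made for the example.

```python
from typing import List


def ffmpeg_extract_cmd(video: str, out_dir: str, prefix: str, fps: int = 2) -> List[str]:
    """Build an ffmpeg command that samples frames without the `fps` filter."""
    return [
        "ffmpeg",
        "-i", video,
        "-r", str(fps),   # output frame rate; stands in for `-vf fps=...`
        "-q:v", "2",      # assumed JPEG quality setting
        f"{out_dir}/{prefix}_%04d.jpg",  # assumed numbering pattern
    ]


cmd = ffmpeg_extract_cmd("1.mp4", "my_scene/images", "v1")
```

The resulting list can be passed directly to `subprocess.run(cmd, check=True)`.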
### nerfstudio splatfacto hardcoded `.cuda()` calls
Two lines in the installed `splatfacto.py` call `.cuda()` unconditionally. Patched in-place:
| Location | Original | Fix |
|----------|----------|-----|
| `populate_modules()` | `shs = torch.zeros(...).float().cuda()` | `shs = torch.zeros(...).float()` |
| `get_outputs_for_camera()` | `K = ....cuda()` | `K = ....to(self.device)` |
If you reinstall nerfstudio, re-apply with:
```bash
F=venv/lib/python3.10/site-packages/nerfstudio/models/splatfacto.py
sed -i '' 's/\.float()\.cuda()/\.float()/g' "$F"
sed -i '' 's/get_intrinsics_matrices()\.cuda()/get_intrinsics_matrices().to(self.device)/g' "$F"
```
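If you prefer to see how many lines were changed, the same two substitutions can be sketched in Python. The function name is hypothetical; the patterns mirror the `sed` expressions above, and the function works on the file's text so it can be checked before anything is written back.

```python
import re
from typing import Tuple


def patch_splatfacto_source(text: str) -> Tuple[str, int]:
    """Apply both .cuda() fixes and return (patched_text, number_of_replacements)."""
    # Fix 1: drop the unconditional .cuda() after .float()
    text, n1 = re.subn(r"\.float\(\)\.cuda\(\)", ".float()", text)
    # Fix 2: move the intrinsics matrix to the model's device instead of CUDA
    text, n2 = re.subn(
        r"get_intrinsics_matrices\(\)\.cuda\(\)",
        "get_intrinsics_matrices().to(self.device)",
        text,
    )
    return text, n1 + n2
```

To apply it, read `splatfacto.py` with `Path.read_text()`, pass the text through `patch_splatfacto_source`, and write the result back only if the count is non-zero. Running it a second time reports zero replacements, so it is safe to re-run.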
---
## 3DGS on Apple Silicon — current status
`splatfacto` uses **gsplat** as its rasterizer. gsplat 1.x requires CUDA; there is no MPS or CPU fallback. On Apple Silicon the CUDA extension loads as `None`, so training crashes the first time the rasterizer is called.
**Two options for actual Gaussian Splatting:**
### Option A — Brush (recommended, uses Apple Metal natively)
```bash
# Install Rust (one-time)
brew install rustup && rustup-init -y && source ~/.cargo/env
# Build and run
cargo install --git https://github.com/ArthurBrussee/brush brush-cli
brush-cli --source my_scene/sparse/3
```
Outputs a `.ply` and has a built-in web viewer.
### Option B — Google Colab (free GPU)
The scene is already in Nerfstudio format at `my_scene/ns_data/`. Zip it, upload it to a Colab T4 instance, and run:
```python
!pip install nerfstudio
!ns-train splatfacto --data /content/ns_data --vis wandb
```
Download `outputs/*/splatfacto/*/splat.ply` when done.
---
## Viewing results
| What | How |
|------|-----|
| During splatfacto training | `http://localhost:7007` (Nerfstudio Viser viewer) |
| Sparse point cloud (ready now) | Drag `my_scene/exports/sparse_pointcloud.ply` into https://playcanvas.com/supersplat/editor |
| Final Gaussian splat | Drag `my_scene/exports/splat.ply` into https://playcanvas.com/supersplat/editor |
PlayCanvas SuperSplat runs 100% in-browser — the file never leaves your machine.
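Before dragging a `.ply` into the viewer, it can help to confirm the file is a valid PLY and see how many points or Gaussians it holds. The sketch below reads only the ASCII header that every PLY file starts with; the function name is hypothetical.

```python
from typing import BinaryIO


def ply_vertex_count(stream: BinaryIO) -> int:
    """Return the `element vertex` count declared in a PLY header."""
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file (missing 'ply' magic line)")
    count = None
    for raw in stream:
        line = raw.strip()
        if line.startswith(b"element vertex"):
            count = int(line.split()[2])
        elif line == b"end_header":
            break  # stop before the binary payload
    if count is None:
        raise ValueError("PLY header has no 'element vertex' line")
    return count
```

For example, `with open("my_scene/exports/splat.ply", "rb") as f: print(ply_vertex_count(f))` prints the declared vertex count without loading the payload.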
---
## Re-running from scratch
```bash
rm -rf my_scene outputs
bash run.sh # frames + features (~10 min)
python3 match_crossvideo.py # cross-video matching (~70 min)
colmap mapper \
--database_path my_scene/database.db \
--image_path my_scene/images \
--output_path my_scene/sparse # mapping (~10 min)
bash train_splat.sh # convert + train + export
```