Jon 7f4cdd9459 Add two-video drone 3DGS pipeline with Apple Silicon fixes
- main.py: extract frames from two videos, run COLMAP feature extraction
- match_features.py: Python-based within-video SIFT matching via OpenCV
  (replaces colmap exhaustive_matcher which segfaults on ARM64 in COLMAP 4.x)
- match_crossvideo.py: exhaustive cross-video matching (v1×v2) to stitch
  two flights into a single COLMAP model
- run.sh: entry point for frame extraction + feature extraction
- train_splat.sh: ns-process-data → splatfacto → .ply export, with
  correct PATH for Homebrew ffmpeg and MPS device flags for Apple Silicon
- .gitignore: exclude videos, generated scene data, venv, logs
- README.md: full pipeline walkthrough, all known issues and fixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 15:09:30 +01:00

# Drone-3DGS
Two-flight DJI drone footage → 3D Gaussian Splatting pipeline for **Apple Silicon Macs** (M1/M2/M3).
Takes two `.mp4` videos of the same scene from different angles, runs Structure-from-Motion via COLMAP, and produces a `.ply` Gaussian splat you can view in a browser.
---
## Requirements
```bash
brew install colmap # COLMAP 4.x (SfM)
brew install ffmpeg # full version with all filters
python3 -m venv venv
source venv/bin/activate
pip install torch torchvision
pip install nerfstudio
```
> **Python version**: 3.10 recommended (tested with 3.10.18 via pyenv).
---
## Project structure
```
.
├── 1.mp4 # first drone flight
├── 2.mp4 # second drone flight
├── main.py # Step 1: extract frames + COLMAP feature extraction
├── match_features.py # Step 1: within-video SIFT matching, called by main.py (Python, bypasses COLMAP crash)
├── match_crossvideo.py # Step 2: cross-video exhaustive matching (v1×v2)
├── run.sh # Runs main.py (frame extraction + feature extraction)
└── train_splat.sh # Steps 4-6: ns-process-data → splatfacto → export .ply
```
---
## How to run
### Step 1: Extract frames and COLMAP features
```bash
source venv/bin/activate
bash run.sh
```
This calls `main.py`, which:
1. Extracts frames from `1.mp4` and `2.mp4` at 2 fps into `my_scene/images/` (named `v1_*.jpg` / `v2_*.jpg`)
2. Runs `colmap feature_extractor` — SIFT features written to `my_scene/database.db`
3. Runs `match_features.py` — sequential within-video matching (overlap=50) via OpenCV BFMatcher
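
The sequential pairing in item 3 can be sketched as follows. This is a minimal illustration, not the code in `match_features.py`: the function name `sequential_pairs` is hypothetical, and it assumes `overlap` means "match each frame against the next `overlap` frames of the same video", which is how COLMAP's own sequential matcher interprets the setting.

```python
from typing import Iterator


def sequential_pairs(frames: list[str], overlap: int = 50) -> Iterator[tuple[str, str]]:
    """Yield (frame_i, frame_j) for every frame_j among the `overlap` frames after frame_i."""
    for i, first in enumerate(frames):
        for second in frames[i + 1 : i + 1 + overlap]:
            yield first, second


# Hypothetical frame names following the v1_*.jpg pattern; small overlap for readability.
v1_frames = [f"v1_{n:05d}.jpg" for n in range(1, 6)]
pairs = list(sequential_pairs(v1_frames, overlap=2))
```

With `overlap=50` and 2 fps extraction, each frame would be compared with roughly the following 25 seconds of the same flight.
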
**Why not `colmap exhaustive_matcher`?**
COLMAP 4.x has a threading bug on Apple Silicon ARM64 causing a SIGSEGV in all matcher variants. `match_features.py` replaces it entirely: reads SIFT descriptors from the SQLite database, matches with OpenCV BFMatcher + Lowe ratio test + RANSAC, and writes `two_view_geometries` back to the DB. The mapper only needs that table.
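
Two pieces of that replacement are easy to show in isolation: Lowe's ratio test and the `pair_id` key under which COLMAP stores a pair in `two_view_geometries`. The sketch below is dependency-free and only illustrative. `ratio_test_matches` is a hypothetical pure-Python stand-in for `cv2.BFMatcher().knnMatch(..., k=2)`, the ratio of 0.8 is an assumed value (the README does not state the one used), and the `pair_id` formula is taken from the helper script shipped with COLMAP 3.x, so it should be checked against the 4.x schema. RANSAC verification is omitted.

```python
import math

MAX_IMAGE_ID = 2**31 - 1  # constant from COLMAP's Python database helper (3.x)


def image_ids_to_pair_id(image_id1: int, image_id2: int) -> int:
    """Encode an unordered image pair as a single integer key."""
    if image_id1 > image_id2:
        image_id1, image_id2 = image_id2, image_id1
    return image_id1 * MAX_IMAGE_ID + image_id2


def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Brute-force nearest-neighbour matching followed by Lowe's ratio test.

    Returns (index_in_a, index_in_b) for every descriptor in desc_a whose best
    match is clearly closer than its second-best match.
    """
    matches = []
    for i, a in enumerate(desc_a):
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs two candidates
        (best, j), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches


# Toy 2-D "descriptors"; real SIFT descriptors have 128 dimensions.
desc_a = [(0.0, 0.0), (10.0, 10.0), (2.5, 3.0)]
desc_b = [(0.0, 1.0), (10.0, 9.0), (5.0, 5.0)]
matches = ratio_test_matches(desc_a, desc_b)
```

The third descriptor in `desc_a` is equally close to two candidates, so the ratio test rejects it as ambiguous.
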
### Step 2: Cross-video matching
```bash
python3 match_crossvideo.py
```
Matches every `v1_*` frame against every `v2_*` frame (14,900 pairs) so the two flights stitch into a single model. Takes ~70 min on an M1 Pro CPU (~0.28 s/pair with OpenCV BFMatcher).
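
The pair count and runtime follow directly from the frame counts. A minimal sketch of the arithmetic, assuming hypothetical counts of 100 and 149 frames (the README only gives the product, 14,900):

```python
from itertools import product

# Hypothetical frame counts; any split whose product is 14,900 gives the same total.
v1_frames = [f"v1_{n:05d}.jpg" for n in range(1, 101)]  # 100 frames
v2_frames = [f"v2_{n:05d}.jpg" for n in range(1, 150)]  # 149 frames

# Exhaustive cross-video matching visits every (v1, v2) combination once.
cross_pairs = list(product(v1_frames, v2_frames))

seconds_per_pair = 0.28  # per-pair cost quoted for an M1 Pro
estimated_minutes = len(cross_pairs) * seconds_per_pair / 60
print(f"{len(cross_pairs)} pairs, about {estimated_minutes:.0f} min")
```

Because the cost grows with the product of the two frame counts, halving the extraction rate for both videos would cut the matching time to roughly a quarter.
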
### Step 3: COLMAP mapper
```bash
colmap mapper \
--database_path my_scene/database.db \
--image_path my_scene/images \
--output_path my_scene/sparse
```
Produces one or more sparse models in numbered subdirectories of `my_scene/sparse/`. Use the one with the most registered images. With overlapping flights you should get ~90-95% of frames in a single model.
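
One way to pick the largest model programmatically is to read the image count from each model's `images.bin`. This is a sketch under assumptions, not the logic in `train_splat.sh`: it relies on the COLMAP 3.x binary layout, where `images.bin` begins with a little-endian `uint64` holding the number of registered images, and the function names are hypothetical.

```python
import struct
from pathlib import Path


def registered_image_count(model_dir: Path) -> int:
    """Read the registered-image count from the header of images.bin."""
    with open(model_dir / "images.bin", "rb") as fh:
        # Assumed layout: the first 8 bytes are a little-endian uint64 count.
        (count,) = struct.unpack("<Q", fh.read(8))
    return count


def largest_model(sparse_dir: Path) -> Path:
    """Return the sub-model directory with the most registered images."""
    models = [d for d in sparse_dir.iterdir() if (d / "images.bin").is_file()]
    if not models:
        raise FileNotFoundError(f"no COLMAP models found in {sparse_dir}")
    return max(models, key=registered_image_count)
```

Calling `largest_model(Path("my_scene/sparse"))` would then return the directory to pass on to the conversion step.
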
### Steps 4-6: Convert, train, export
```bash
bash train_splat.sh
```
This script automatically:
1. Finds the largest COLMAP model in `my_scene/sparse/`
2. Converts it to Nerfstudio format with `ns-process-data`
3. Trains `splatfacto` (**live viewer at http://localhost:7007 during training**)
4. Exports the Gaussian splat to `my_scene/exports/splat.ply`
---
## Known issues and fixes applied
### COLMAP 4.x matcher segfault (Apple Silicon)
All COLMAP matcher variants (`exhaustive_matcher`, `sequential_matcher`, `vocab_tree_matcher`) crash with SIGSEGV on ARM64 due to a bug in the SIFT worker thread initialization. **Fix:** `match_features.py` and `match_crossvideo.py` replace the COLMAP matcher entirely using OpenCV.
### ffmpeg `fps` and `split` filters missing
The nerfstudio-bundled ffmpeg is compiled with a minimal filter set. **Fixes:**
- `main.py` uses `-r` output flag instead of `-vf fps=...`
- `train_splat.sh` prepends `/opt/homebrew/opt/ffmpeg/bin` to `PATH` so `ns-process-data` uses the full Homebrew ffmpeg
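
The first fix can be illustrated with a small command builder. This is a hedged sketch rather than the code in `main.py`: the function name and the `%05d` numbering width are assumptions. What it shows is `-r` placed after the input, where ffmpeg treats it as an output option and drops frames to reach the requested rate without needing the `fps` filter.

```python
import shlex


def ffmpeg_extract_cmd(video: str, out_dir: str, prefix: str, fps: int = 2) -> list[str]:
    """Build an ffmpeg command that samples `fps` frames per second without -vf."""
    return [
        "ffmpeg",
        "-i", video,
        "-r", str(fps),  # after -i, so it sets the output frame rate
        f"{out_dir}/{prefix}_%05d.jpg",  # assumed numbering pattern
    ]


cmd = ffmpeg_extract_cmd("1.mp4", "my_scene/images", "v1")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run the extraction; building it as a list avoids shell quoting problems with unusual file names.
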
### nerfstudio splatfacto hardcoded `.cuda()` calls
Two lines in the installed `splatfacto.py` call `.cuda()` unconditionally. Patched in-place:
| Location | Original | Fix |
|----------|----------|-----|
| `populate_modules()` | `shs = torch.zeros(...).float().cuda()` | `shs = torch.zeros(...).float()` |
| `get_outputs_for_camera()` | `K = ....cuda()` | `K = ....to(self.device)` |
If you reinstall nerfstudio, re-apply with:
```bash
F=venv/lib/python3.10/site-packages/nerfstudio/models/splatfacto.py
sed -i '' 's/\.float()\.cuda()/\.float()/g' "$F"
sed -i '' 's/get_intrinsics_matrices()\.cuda()/get_intrinsics_matrices().to(self.device)/g' "$F"
```
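
The same two substitutions can be applied from Python, which makes it easy to check whether the file is already patched. This is a sketch that mirrors the `sed` patterns above; the function names are hypothetical, and it assumes the installed `splatfacto.py` still contains those exact strings.

```python
import re
from pathlib import Path

# The two sed substitutions, expressed as (pattern, replacement).
PATCHES = [
    (re.compile(r"\.float\(\)\.cuda\(\)"), ".float()"),
    (
        re.compile(r"get_intrinsics_matrices\(\)\.cuda\(\)"),
        "get_intrinsics_matrices().to(self.device)",
    ),
]


def patch_source(text: str) -> str:
    """Apply both substitutions; already-patched text comes back unchanged."""
    for pattern, replacement in PATCHES:
        text = pattern.sub(replacement, text)
    return text


def patch_file(path: Path) -> bool:
    """Rewrite `path` in place and report whether anything changed."""
    original = path.read_text()
    patched = patch_source(original)
    if patched != original:
        path.write_text(patched)
    return patched != original
```

`patch_file` returns `False` on a second run, so it can be called unconditionally after every reinstall.
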
---
## 3DGS on Apple Silicon — current status
`splatfacto` uses **gsplat** as its rasterizer. gsplat 1.x requires CUDA; there is no MPS or CPU fallback. On Apple Silicon the CUDA extension loads as `None`, so training crashes the first time the rasterizer is called.
**Two options for actual Gaussian Splatting:**
### Option A — Brush (recommended, uses Apple Metal natively)
```bash
# Install Rust (one-time)
brew install rustup && rustup-init -y && source ~/.cargo/env
# Build and run
cargo install --git https://github.com/ArthurBrussee/brush brush-cli
brush-cli --source my_scene/sparse/3
```
Outputs a `.ply` and has a built-in web viewer.
### Option B — Google Colab (free GPU)
The scene is already in Nerfstudio format at `my_scene/ns_data/`. Zip it, upload it to a Colab T4 instance, and run:
```python
!pip install nerfstudio
!ns-train splatfacto --data /content/ns_data --vis wandb
```
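`ns-train` writes checkpoints and a `config.yml` under `outputs/*/splatfacto/*/`, not a `.ply`. When training is done, export the splat with `ns-export gaussian-splat --load-config outputs/*/splatfacto/*/config.yml --output-dir exports` and download the resulting `splat.ply` from `exports/`.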
---
## Viewing results
| What | How |
|------|-----|
| During splatfacto training | `http://localhost:7007` (Nerfstudio Viser viewer) |
| Sparse point cloud (ready now) | Drag `my_scene/exports/sparse_pointcloud.ply` into https://playcanvas.com/supersplat/editor |
| Final Gaussian splat | Drag `my_scene/exports/splat.ply` into https://playcanvas.com/supersplat/editor |
PlayCanvas SuperSplat runs 100% in-browser — the file never leaves your machine.
---
## Re-running from scratch
```bash
rm -rf my_scene outputs
bash run.sh # frames + features (~10 min)
python3 match_crossvideo.py # cross-video matching (~70 min)
colmap mapper \
--database_path my_scene/database.db \
--image_path my_scene/images \
--output_path my_scene/sparse # mapping (~10 min)
bash train_splat.sh # convert + train + export
```