# Drone-3DGS

Two-flight DJI drone footage → 3D Gaussian Splatting pipeline for **Apple Silicon Macs** (M1/M2/M3).

Takes two `.mp4` videos of the same scene from different angles, runs Structure-from-Motion via COLMAP, and produces a `.ply` Gaussian splat you can view in a browser.

---

## Requirements

```bash
brew install colmap        # COLMAP 4.x (SfM)
brew install ffmpeg        # full version with all filters

python3 -m venv venv
source venv/bin/activate
pip install torch torchvision
pip install nerfstudio
```

> **Python version**: 3.10 recommended (tested with 3.10.18 via pyenv).

---

## Project structure

```
.
├── 1.mp4                  # first drone flight
├── 2.mp4                  # second drone flight
├── main.py                # Step 1 – extract frames + COLMAP feature extraction
├── match_features.py      # Step 1 – within-video SIFT matching (Python, bypasses COLMAP crash)
├── match_crossvideo.py    # Step 2 – cross-video exhaustive matching (v1×v2)
├── run.sh                 # Runs main.py (frame extraction + feature extraction + within-video matching)
└── train_splat.sh         # Steps 4–6: ns-process-data → splatfacto → export .ply
```

---

## How to run

### Step 1 – Extract frames and COLMAP features

```bash
source venv/bin/activate
bash run.sh
```

This calls `main.py`, which:

1. Extracts frames from `1.mp4` and `2.mp4` at 2 fps into `my_scene/images/` (named `v1_*.jpg` / `v2_*.jpg`)
2. Runs `colmap feature_extractor`, which writes SIFT features to `my_scene/database.db`
3. Runs `match_features.py`: sequential within-video matching (overlap=50) via OpenCV BFMatcher

**Why not `colmap exhaustive_matcher`?**
COLMAP 4.x has a threading bug on Apple Silicon ARM64 that causes a SIGSEGV in all matcher variants. `match_features.py` replaces the matcher entirely: it reads SIFT descriptors from the SQLite database, matches them with OpenCV BFMatcher + Lowe ratio test + RANSAC, and writes `two_view_geometries` back to the DB. The mapper only needs that table.

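
To illustrate the ratio-test step described above, here is a minimal pure-Python sketch. It is not the code in `match_features.py`, which uses OpenCV's BFMatcher; the function name `ratio_test_matches`, the 0.75 threshold, and the toy 2-D descriptors are assumptions chosen for illustration.

```python
import math


def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Brute-force nearest-neighbour matching with Lowe's ratio test.

    desc_a, desc_b: lists of equal-length numeric vectors (toy descriptors).
    Returns (index_in_a, index_in_b) pairs whose best distance is clearly
    smaller than the second-best distance.
    """
    matches = []
    if len(desc_b) < 2:
        return matches  # the ratio test needs at least two candidates
    for i, da in enumerate(desc_a):
        # Distance from this descriptor to every candidate in the other image.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        (best, j_best), (second, _) = dists[0], dists[1]
        # Keep the match only if the best candidate is unambiguous.
        if best < ratio * second:
            matches.append((i, j_best))
    return matches


# Toy data: a[0] has one clear partner, a[1] is equally close to two candidates.
a = [[0.0, 0.0], [5.0, 5.0]]
b = [[0.1, 0.0], [4.0, 5.0], [6.0, 5.0]]
matches = ratio_test_matches(a, b)
print(matches)  # [(0, 0)]
```

The ambiguous descriptor is dropped because its two nearest candidates are equally distant. In the pipeline described above, surviving matches would still go through RANSAC before being written to `two_view_geometries`.
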
### Step 2 – Cross-video matching

```bash
python3 match_crossvideo.py
```

Matches every `v1_*` frame against every `v2_*` frame (14,900 pairs) so the two flights stitch into a single model. Takes ~70 min on an M1 Pro CPU (~0.28 s/pair with OpenCV BFMatcher).

### Step 3 – COLMAP mapper

```bash
mkdir -p my_scene/sparse   # mapper expects the output directory to exist
colmap mapper \
  --database_path my_scene/database.db \
  --image_path my_scene/images \
  --output_path my_scene/sparse
```

Produces sparse models in `my_scene/sparse/`. Use the largest one (the model with the most registered images). With overlapping flights you should get ~90–95% of frames in a single model.

### Steps 4–6 – Convert, train, export

```bash
bash train_splat.sh
```

This script automatically:

1. Finds the largest COLMAP model in `my_scene/sparse/`
2. Converts it to Nerfstudio format with `ns-process-data`
3. Trains `splatfacto` (**live viewer at http://localhost:7007 during training**)
4. Exports the Gaussian splat to `my_scene/exports/splat.ply`

---

## Known issues and fixes applied

### COLMAP 4.x matcher segfault (Apple Silicon)

All COLMAP matcher variants (`exhaustive_matcher`, `sequential_matcher`, `vocab_tree_matcher`) crash with SIGSEGV on ARM64 due to a bug in the SIFT worker thread initialization.

**Fix:** `match_features.py` and `match_crossvideo.py` replace the COLMAP matcher entirely using OpenCV.

### ffmpeg `fps` and `split` filters missing

The nerfstudio-bundled ffmpeg is compiled with a minimal filter set.

**Fixes:**

- `main.py` uses the `-r` output flag instead of `-vf fps=...`
- `train_splat.sh` prepends `/opt/homebrew/opt/ffmpeg/bin` to `PATH` so `ns-process-data` uses the full Homebrew ffmpeg

### nerfstudio splatfacto hardcoded `.cuda()` calls

Two lines in the installed `splatfacto.py` call `.cuda()` unconditionally.
Patched in place:

| Location | Original | Fix |
|----------|----------|-----|
| `populate_modules()` | `shs = torch.zeros(...).float().cuda()` | `shs = torch.zeros(...).float()` |
| `get_outputs_for_camera()` | `K = ....cuda()` | `K = ....to(self.device)` |

If you reinstall nerfstudio, re-apply the patch with:

```bash
F=venv/lib/python3.10/site-packages/nerfstudio/models/splatfacto.py
sed -i '' 's/\.float()\.cuda()/\.float()/g' "$F"
sed -i '' 's/get_intrinsics_matrices()\.cuda()/get_intrinsics_matrices().to(self.device)/g' "$F"
```

---

## 3DGS on Apple Silicon: current status

`splatfacto` uses **gsplat** as its rasterizer. gsplat 1.x requires CUDA; there is no MPS or CPU fallback. On Apple Silicon the CUDA extension loads as `None`, so training crashes the first time the rasterizer is called.

**Two options for actual Gaussian Splatting:**

### Option A: Brush (recommended, uses Apple Metal natively)

```bash
# Install Rust (one-time)
brew install rustup && rustup-init -y && source ~/.cargo/env

# Build and run
cargo install --git https://github.com/ArthurBrussee/brush brush-cli
brush-cli --source my_scene/sparse/3
```

Outputs a `.ply` and has a built-in web viewer.

### Option B: Google Colab (free GPU)

The scene is already in Nerfstudio format at `my_scene/ns_data/`. Zip it, upload it to a Colab T4 instance, and run:

```python
!pip install nerfstudio
!ns-train splatfacto --data /content/ns_data --vis wandb
```

Download `outputs/*/splatfacto/*/splat.ply` when done.

---

## Viewing results

| What | How |
|------|-----|
| During splatfacto training | `http://localhost:7007` (Nerfstudio Viser viewer) |
| Sparse point cloud (ready now) | Drag `my_scene/exports/sparse_pointcloud.ply` into https://playcanvas.com/supersplat/editor |
| Final Gaussian splat | Drag `my_scene/exports/splat.ply` into https://playcanvas.com/supersplat/editor |

PlayCanvas SuperSplat runs 100% in-browser, so the file never leaves your machine.

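
Before dragging a file into the viewer, it can help to confirm that it is a valid PLY and see how many points it holds. The sketch below reads only the header, using the standard library. `read_ply_header` is a hypothetical helper, not part of this repo, and it assumes a standard PLY header terminated by `end_header`.

```python
import io


def read_ply_header(stream):
    """Parse a PLY header from a binary stream.

    Returns (format_name, vertex_count, vertex_property_names).
    """
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    fmt, vertex_count, props = None, 0, []
    in_vertex = False
    while True:
        line = stream.readline()
        if not line:
            raise ValueError("header ended without end_header")
        tokens = line.decode("ascii").split()
        if not tokens:
            continue
        if tokens[0] == "end_header":
            break
        if tokens[0] == "format":
            fmt = tokens[1]
        elif tokens[0] == "element":
            # Only properties of the vertex element are collected.
            in_vertex = tokens[1] == "vertex"
            if in_vertex:
                vertex_count = int(tokens[2])
        elif tokens[0] == "property" and in_vertex:
            props.append(tokens[-1])
    return fmt, vertex_count, props


# Synthetic three-point header, used here instead of a real export.
sample = io.BytesIO(
    b"ply\nformat binary_little_endian 1.0\nelement vertex 3\n"
    b"property float x\nproperty float y\nproperty float z\nend_header\n"
)
header = read_ply_header(sample)
print(header)

# On a real file (path taken from the table above):
# with open("my_scene/exports/splat.ply", "rb") as f:
#     print(read_ply_header(f))
```

A sparse point cloud usually lists little more than positions and colours, while a Gaussian splat export typically carries extra per-point properties such as opacity and scale. The property list is therefore a quick way to tell the two files apart.
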

---

## Re-running from scratch

```bash
rm -rf my_scene outputs
bash run.sh                            # frames + features (~10 min)
python3 match_crossvideo.py            # cross-video matching (~70 min)
mkdir -p my_scene/sparse               # mapper expects the output directory to exist
colmap mapper \
  --database_path my_scene/database.db \
  --image_path my_scene/images \
  --output_path my_scene/sparse        # mapping (~10 min)
bash train_splat.sh                    # convert + train + export
```
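
The ~70 min figure for cross-video matching follows from the pair count and per-pair cost quoted in Step 2. Here is a small sketch for re-estimating it when your frame counts differ. `estimate_cross_matching_minutes` is a hypothetical helper, and the 100 × 149 split is only an illustrative pair of counts whose product is 14,900, not the actual frame counts of this scene.

```python
def estimate_cross_matching_minutes(n_v1, n_v2, sec_per_pair=0.28):
    """Estimate exhaustive v1×v2 matching time.

    n_v1, n_v2: number of extracted frames per video.
    sec_per_pair: measured cost of one pair (0.28 s is the M1 Pro figure above).
    Returns (pair_count, minutes).
    """
    pairs = n_v1 * n_v2  # every v1 frame is matched against every v2 frame
    return pairs, pairs * sec_per_pair / 60.0


# Illustrative frame counts only (assumption): 100 * 149 = 14,900 pairs.
pairs, minutes = estimate_cross_matching_minutes(100, 149)
print(f"{pairs} pairs, about {minutes:.0f} min")  # 14900 pairs, about 70 min
```

Because the cost grows with the product of the two frame counts, halving the number of frames extracted from both videos would cut the matching time to roughly a quarter.
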