Multi-View Fusion and Surface Reconstruction
This page explains the final stage of the AquaMVS pipeline: fusing depth maps from multiple cameras into a unified point cloud, filtering geometric inconsistencies, and reconstructing surface meshes.
Multi-View Depth Fusion
Each camera produces an independent depth map (see Dense Stereo Matching). These depth maps overlap spatially but may contain inconsistencies due to occlusions, noise, or matching errors. Fusion combines multi-view information to produce a single, cleaned point cloud.
- Problem Statement
Given \(N\) cameras with depth maps \(\{D_i(u, v)\}_{i=1}^N\) and confidence maps \(\{C_i(u, v)\}_{i=1}^N\), produce a fused point cloud \(\mathcal{P} = \{\mathbf{p}_j\}_{j=1}^M\) where each point is geometrically consistent across multiple views.
- Geometric Consistency Filtering
For each pixel in reference camera \(R\) with depth \(d_R(u, v)\):
Back-project to 3D: Compute the 3D point using the refractive ray model (see Refractive Geometry):
\[\mathbf{p} = \mathbf{O}_R + d_R(u, v) \, \mathbf{d}_R(u, v)\]
where \(\mathbf{O}_R\) is the ray origin and \(\mathbf{d}_R\) is the ray direction for pixel \((u, v)\).
Project into source cameras: For each source camera \(S_j\), project \(\mathbf{p}\) to get pixel \((u_j, v_j)\).
Compare depths: Retrieve the depth \(d_j(u_j, v_j)\) from the source depth map. Back-project the source pixel to 3D point \(\mathbf{p}_j\).
Compute 3D distance:
\[\Delta_j = \|\mathbf{p} - \mathbf{p}_j\|\]
Consistency check: Mark the view as consistent if:
\[\Delta_j < \tau_{\text{dist}}\]
where \(\tau_{\text{dist}}\) is a distance threshold (e.g., 0.01 m = 1 cm).
Count consistent views: If the point is consistent with at least \(N_{\min}\) source cameras (e.g., \(N_{\min} = 2\)), retain it. Otherwise, discard as an outlier.
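The consistency test and view count above can be sketched in a few lines. This is a minimal illustration, not the AquaMVS implementation: the function names `is_consistent` and `keep_point` are hypothetical, and using `None` for a source view with no valid depth is an assumption made for the example.

```python
import math

def is_consistent(p_ref, p_src, tau_dist=0.01):
    """True if the two 3D points are closer than tau_dist (meters)."""
    return math.dist(p_ref, p_src) < tau_dist

def keep_point(p_ref, source_points, tau_dist=0.01, n_min=2):
    """Count consistent source views and decide whether to retain p_ref.

    source_points holds one back-projected point per source camera;
    None (an assumption of this sketch) marks a view without valid depth.
    """
    n_consistent = sum(
        1
        for p_src in source_points
        if p_src is not None and is_consistent(p_ref, p_src, tau_dist)
    )
    return n_consistent >= n_min, n_consistent

p = (0.10, 0.20, 0.50)
sources = [
    (0.101, 0.200, 0.502),  # about 2.2 mm away: consistent
    (0.10, 0.20, 0.53),     # 3 cm away: inconsistent
    None,                   # no valid depth in this view
    (0.10, 0.204, 0.50),    # 4 mm away: consistent
]
keep, count = keep_point(p, sources)
```

With the example thresholds (\(\tau_{\text{dist}} = 0.01\) m, \(N_{\min} = 2\)), two of the four source views agree, so the point is retained.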
- Consistency Score
For visualization and analysis, each point is tagged with a consistency score:
\[\text{score}(\mathbf{p}) = \frac{\text{\# consistent views}}{\text{total \# source cameras}}\]
High scores indicate strong multi-view agreement.
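As a small worked example of the score, the sketch below takes the per-view distances \(\Delta_j\) for one point and returns the fraction that pass the threshold. The function name `consistency_score` and the choice to return 0.0 when there are no source cameras are assumptions of this sketch.

```python
def consistency_score(distances, tau_dist=0.01):
    """Fraction of source cameras whose 3D distance is below tau_dist."""
    if not distances:
        return 0.0  # assumption: no source views means no agreement
    n_consistent = sum(1 for d in distances if d < tau_dist)
    return n_consistent / len(distances)

# Four source cameras, distances in meters; three fall below 1 cm.
score = consistency_score([0.002, 0.030, 0.004, 0.008])
```

Here three of four source cameras agree, giving a score of 0.75.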
Fusion Pipeline Diagram
flowchart LR
DM1["Depth Map 1<br/>(Camera A)"] --> GC["Geometric<br/>Consistency<br/>Filter"]
DM2["Depth Map 2<br/>(Camera B)"] --> GC
DMN["Depth Map N<br/>(Camera C)"] --> GC
GC --> FPC["Fused<br/>Point Cloud"]
FPC --> OR["Outlier<br/>Removal"]
OR --> SR["Surface<br/>Reconstruction"]
style GC fill:#ffb74d,stroke:#f57c00,stroke-width:2px
style FPC fill:#81c784,stroke:#388e3c,stroke-width:2px
Point Cloud Generation
After geometric consistency filtering, valid depth pixels are converted to 3D points with color.
- Back-Projection
For each valid pixel \((u, v)\) in reference camera \(R\) with depth \(d(u, v)\):
\[\mathbf{p} = \mathbf{O}_R(u, v) + d(u, v) \, \mathbf{d}_R(u, v)\]
where \((\mathbf{O}_R, \mathbf{d}_R)\) is the ray from the refractive projection model.
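A minimal sketch of this back-projection over a whole depth map follows. It assumes per-pixel rays are already available as `(origin, direction)` pairs with unit-length directions, that invalid pixels are stored as `None`, and that the image is a nested list of RGB tuples; these data layouts and the function names are illustrative, not the AquaMVS API.

```python
def back_project(origin, direction, depth):
    """Return p = O + d * dir for one pixel (direction assumed unit length)."""
    return tuple(o + depth * d for o, d in zip(origin, direction))

def depth_map_to_points(depth, rays, image):
    """Convert valid depth pixels to (point, color) pairs.

    depth[v][u] is a depth along the ray or None for invalid pixels,
    rays[v][u] is an (origin, direction) pair, image[v][u] is an RGB tuple.
    """
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d is None:
                continue  # pixel was rejected by the consistency filter
            origin, direction = rays[v][u]
            points.append((back_project(origin, direction, d), image[v][u]))
    return points

# One image row with two pixels; the second pixel has no valid depth.
depth = [[0.5, None]]
rays = [[((0.0, 0.0, 0.1), (0.6, 0.0, 0.8)), ((0.0, 0.0, 0.1), (0.0, 0.0, 1.0))]]
image = [[(255, 0, 0), (0, 255, 0)]]
cloud = depth_map_to_points(depth, rays, image)
```

The single valid pixel lands at approximately (0.3, 0.0, 0.5) and carries the color of its reference pixel.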
- Color Assignment
Point color is taken from the reference image at pixel \((u, v)\):
\[\mathbf{c} = I_R(u, v)\]
For better color fidelity, colors can be averaged across consistent views (not currently implemented in AquaMVS).
- Statistical Outlier Removal
Even after consistency filtering, some outliers may remain. Statistical outlier removal cleans the point cloud:
For each point \(\mathbf{p}_i\), find its \(k\) nearest neighbors.
Compute mean distance to neighbors: \(\bar{d}_i\).
Compute global statistics across all points:
\[\mu = \text{mean}(\{\bar{d}_i\}), \quad \sigma = \text{std}(\{\bar{d}_i\})\]
Remove outliers where:
\[\bar{d}_i > \mu + \lambda \sigma\]
(e.g., \(k = 20\), \(\lambda = 2.0\)).
This filters isolated points far from the main surface.
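The procedure above can be sketched with a brute-force neighbor search. This is an illustration only: it runs in \(O(N^2)\) time, uses the population standard deviation for \(\sigma\) (an assumption, since the text does not say which estimator is used), and the function name `statistical_outlier_mask` is hypothetical. A production pipeline would typically use a spatial index instead.

```python
import math
import statistics

def statistical_outlier_mask(points, k=20, lam=2.0):
    """Return one boolean per point: True means keep, False means outlier."""
    mean_dists = []
    for i, p in enumerate(points):
        # Distances to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        mean_dists.append(sum(nearest) / len(nearest))
    mu = statistics.fmean(mean_dists)
    sigma = statistics.pstdev(mean_dists)
    return [d <= mu + lam * sigma for d in mean_dists]

# A 5 x 5 patch of points at 1 cm spacing, plus one point floating 1 m above it.
patch = [(0.01 * i, 0.01 * j, 0.0) for i in range(5) for j in range(5)]
points = patch + [(0.02, 0.02, 1.0)]
mask = statistical_outlier_mask(points, k=4, lam=2.0)
```

Only the floating point exceeds \(\mu + \lambda \sigma\); all 25 patch points are kept.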
- Merging Overlapping Regions
When multiple cameras view the same region, their point clouds overlap. AquaMVS currently retains all points (no explicit deduplication). For very dense clouds, voxel downsampling can reduce redundancy:
\[\text{voxel\_size} = 0.001 \text{ m (1 mm)}\]
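One common way to do voxel downsampling is to bucket points by integer voxel index and replace each bucket with its centroid. The sketch below assumes that strategy; the function name `voxel_downsample` is hypothetical, and AquaMVS may use a different reduction (for example, keeping one representative point per voxel).

```python
import math
from collections import defaultdict

def voxel_downsample(points, voxel_size=0.001):
    """Average all points that fall into the same cubic voxel."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return [
        tuple(sum(axis) / len(group) for axis in zip(*group))
        for group in buckets.values()
    ]

points = [
    (0.0002, 0.0002, 0.0002),  # voxel (0, 0, 0)
    (0.0004, 0.0006, 0.0008),  # voxel (0, 0, 0)
    (0.0052, 0.0002, 0.0002),  # voxel (5, 0, 0)
]
reduced = voxel_downsample(points, voxel_size=0.001)
```

The two points sharing a 1 mm voxel collapse into their centroid, so three input points become two.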
Surface Reconstruction
Point clouds are useful, but a continuous surface mesh is often preferred for visualization, physics simulation, or further processing. AquaMVS supports three surface reconstruction methods.
Poisson Surface Reconstruction
- Overview
Poisson reconstruction solves for a smooth, watertight surface that best fits an oriented point cloud (points + normals). It’s robust and produces high-quality meshes but may hallucinate geometry in regions without point coverage.
- Algorithm
Given point cloud \(\mathcal{P} = \{\mathbf{p}_i, \mathbf{n}_i\}_{i=1}^N\) (points and normals):
Indicator Function: Compute a volumetric indicator function \(\chi(\mathbf{x})\) that is 1 inside the surface and 0 outside. Near the surface, the gradient of \(\chi\) points from outside to inside, so with this sign convention it aligns with inward-pointing normals:
\[\nabla \chi(\mathbf{x}) \approx \mathbf{n}_i \text{ near } \mathbf{p}_i\]
Poisson Equation: Solve the Poisson equation:
\[\Delta \chi = \nabla \cdot \mathbf{V}\]
where \(\mathbf{V}\) is a vector field constructed from point normals.
Isosurface Extraction: Extract the \(\chi = 0.5\) isosurface using marching cubes to get the triangle mesh.
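For readers wondering where the Poisson equation comes from, the standard variational argument is sketched here. It is a general derivation, not something taken from the AquaMVS code. Since a vector field built from normals is generally not the exact gradient of any function, one looks for the \(\chi\) whose gradient matches \(\mathbf{V}\) best in the least-squares sense:

\[E(\chi) = \int \|\nabla \chi(\mathbf{x}) - \mathbf{V}(\mathbf{x})\|^2 \, d\mathbf{x}\]

Setting the first variation of \(E\) to zero gives the Euler-Lagrange condition:

\[\nabla \cdot \left(\nabla \chi - \mathbf{V}\right) = 0 \quad \Longrightarrow \quad \Delta \chi = \nabla \cdot \mathbf{V}\]

which is the Poisson equation stated above.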
- Parameters
Depth: Octree depth controlling mesh resolution (default: 9). Higher depth → finer mesh but slower computation.
Density Filtering: Poisson fills gaps, so low-density regions are trimmed using a density percentile threshold (e.g., 1st percentile).
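The density trimming step can be illustrated as follows. The sketch assumes each mesh vertex comes with a scalar density estimate from the Poisson solver and uses a simple nearest-rank percentile; the function name `density_keep_mask` and the exact percentile definition are assumptions, not the AquaMVS implementation.

```python
import math

def density_keep_mask(densities, percentile=1.0):
    """Keep vertices whose density is at or above the given percentile.

    Uses the nearest-rank percentile definition (an assumption of this sketch).
    Returns the keep mask and the density threshold that was applied.
    """
    ordered = sorted(densities)
    rank = max(1, math.ceil(percentile * len(ordered) / 100.0))
    threshold = ordered[rank - 1]
    return [d >= threshold for d in densities], threshold

# 200 vertices with densities 1.0 .. 200.0 (synthetic values for illustration).
densities = [float(i) for i in range(1, 201)]
mask, threshold = density_keep_mask(densities, percentile=1.0)
```

Vertices flagged `False` would be removed from the mesh together with the faces that reference them, which trims the regions Poisson filled in without supporting points.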
- Pros
Smooth, watertight mesh
Handles noise well
Good for general-purpose reconstruction
- Cons
May fill holes with hallucinated geometry
Requires normal estimation (done automatically)
- Use Case
Best for smooth surfaces (e.g., sand beds, rock surfaces) where gaps should be interpolated.
Height-Field Reconstruction
- Overview
Height-field reconstruction projects the point cloud onto a regular XY grid and interpolates Z values. It assumes the surface is roughly planar and single-valued in Z (no overhangs).
- Algorithm
Grid Definition: Create a regular 2D grid in the XY plane:
\[x_i \in [x_{\min}, x_{\max}], \quad y_j \in [y_{\min}, y_{\max}]\]
with resolution \(\Delta\) (e.g., 5 mm).
Interpolation: For each grid point \((x_i, y_j)\), interpolate the Z value from nearby points using linear interpolation:
\[z_{ij} = \text{interp}\left(\{(x_k, y_k) \to z_k\}_{k=1}^N, (x_i, y_j)\right)\]
(implemented via scipy.interpolate.griddata).
Triangulation: Connect neighboring grid points with triangles to form a mesh. Each quad spanned by a \(2 \times 2\) block of grid points is split into 2 triangles.
Color Interpolation: Interpolate point colors onto grid vertices similarly.
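The grid definition and triangulation steps can be sketched without any dependencies; the interpolation step itself is left to scipy.interpolate.griddata as stated above. The function names, the row-major vertex indexing, and the diagonal used to split each quad are assumptions of this sketch.

```python
def grid_axis(lo, hi, step):
    """Coordinates lo, lo + step, ... that do not exceed hi."""
    n = int((hi - lo) / step + 1e-9) + 1  # small epsilon guards float round-off
    return [lo + i * step for i in range(n)]

def grid_triangles(nx, ny):
    """Split each quad of an nx-by-ny vertex grid into two triangles.

    Vertex (i, j) has index j * nx + i. Triangles are wound
    counter-clockwise when viewed from +Z.
    """
    tris = []
    for j in range(ny - 1):
        for i in range(nx - 1):
            v00 = j * nx + i
            v10 = v00 + 1
            v01 = v00 + nx
            v11 = v01 + 1
            tris.append((v00, v10, v11))
            tris.append((v00, v11, v01))
    return tris

# A 2 cm x 1 cm patch at the default 5 mm resolution.
xs = grid_axis(0.0, 0.02, 0.005)
ys = grid_axis(0.0, 0.01, 0.005)
tris = grid_triangles(len(xs), len(ys))
```

A grid of 5 by 3 vertices has 4 by 2 quads and therefore 16 triangles. Grid points where interpolation returns no value would have their triangles dropped, which is how coverage gaps become holes in the mesh.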
- Parameters
Grid Resolution: Grid spacing in meters (default: 0.005 m = 5 mm).
- Pros
Fast
Preserves fine detail
Simple and predictable
- Cons
Cannot represent overhangs or vertical surfaces
Assumes planar geometry
Gaps in coverage → holes in mesh
- Use Case
Best for approximately planar surfaces viewed from above (e.g., water surface, sand bed in overhead view).
Ball Pivoting Algorithm (BPA)
- Overview
BPA “rolls” a virtual ball over the point cloud, creating triangles where the ball touches three points. Preserves detail and requires no normal orientation but is sensitive to sampling density.
- Algorithm
Seed Triangle: Find an initial triangle where a ball of radius \(r\) rests on three points without containing other points.
Pivot: For each triangle edge, pivot the ball around the edge until it touches a third point, forming a new triangle.
Grow: Repeat pivoting until no new triangles can be formed.
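A useful piece of geometry behind these steps: a ball of radius \(r\) can touch three points only if the circumradius of their triangle is at most \(r\). The sketch below checks that necessary condition and picks the smallest radius from a list that fits. It does not perform the emptiness test (no other points inside the ball) or the pivoting itself, and the function names are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def circumradius(p0, p1, p2):
    """Circumradius R = abc / (4 * area) of the triangle (p0, p1, p2)."""
    a = norm(sub(p1, p2))
    b = norm(sub(p0, p2))
    c = norm(sub(p0, p1))
    area = 0.5 * norm(cross(sub(p1, p0), sub(p2, p0)))
    if area == 0.0:
        return math.inf  # collinear points: no ball can rest on them
    return a * b * c / (4.0 * area)

def smallest_fitting_radius(p0, p1, p2, radii):
    """Smallest ball radius that can touch all three points, or None."""
    rc = circumradius(p0, p1, p2)
    for r in sorted(radii):
        if r >= rc:
            return r
    return None

# Right triangle with 9 mm and 12 mm legs: circumradius is 7.5 mm.
tri = ((0.0, 0.0, 0.0), (0.009, 0.0, 0.0), (0.0, 0.012, 0.0))
rc = circumradius(*tri)
r_fit = smallest_fitting_radius(*tri, radii=[0.005, 0.01, 0.02])
```

The 5 mm ball is too small for this triangle, so the 10 mm ball is the first that can form it. This is why sparse regions need larger radii.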
- Parameters
Radii: List of ball radii to try (e.g., [0.005, 0.01, 0.02]). Multiple radii handle varying point density.
- Pros
Preserves fine geometric detail
No normal estimation required
Non-watertight (doesn’t fill gaps)
- Cons
Sensitive to point density variations
May produce disconnected patches
Requires parameter tuning (ball radii)
- Use Case
Best for detailed surfaces with uniform point sampling where you want to avoid gap-filling (e.g., high-resolution scans of textured objects).
Mesh Simplification
High-resolution meshes can have millions of faces, which is impractical for visualization or downstream processing. Mesh simplification reduces face count while preserving shape.
- Quadric Error Decimation
AquaMVS uses quadric error metrics to iteratively collapse edges:
For each vertex, compute a quadric (a 4×4 matrix) that encodes the sum of squared distances from a candidate position to the planes of the faces around that vertex.
Iteratively collapse the edge with minimum error.
Stop when target face count is reached.
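To make the quadric concrete, the sketch below builds the 4×4 matrix for a plane, sums quadrics, and evaluates the error \(\mathbf{v}^\top Q \mathbf{v}\) at a candidate position. It follows the standard quadric error metric formulation and is not taken from the AquaMVS code; the function names are hypothetical.

```python
def plane_quadric(a, b, c, d):
    """4x4 quadric q q^T for the plane ax + by + cz + d = 0 (unit normal)."""
    q = (a, b, c, d)
    return [[qi * qj for qj in q] for qi in q]

def add_quadrics(q1, q2):
    """Element-wise sum of two 4x4 quadrics."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(q1, q2)]

def quadric_error(Q, v):
    """Evaluate v^T Q v with v in homogeneous coordinates (x, y, z, 1)."""
    h = (v[0], v[1], v[2], 1.0)
    return sum(h[i] * Q[i][j] * h[j] for i in range(4) for j in range(4))

# A vertex on the edge between the planes z = 0 and x = 0.
Q = add_quadrics(plane_quadric(0.0, 0.0, 1.0, 0.0), plane_quadric(1.0, 0.0, 0.0, 0.0))
err_on_edge = quadric_error(Q, (0.0, 2.0, 0.0))   # stays on both planes
err_off_edge = quadric_error(Q, (0.3, 5.0, 0.4))  # 0.3 from x = 0, 0.4 from z = 0
```

Moving the vertex along the shared edge costs nothing, while moving it off both planes costs \(0.3^2 + 0.4^2 = 0.25\). For an edge collapse, the two endpoint quadrics are summed and the collapse with the lowest resulting error is performed first.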
- Parameters
Target Faces: Desired number of triangles (e.g., 100,000).
- Example
A 2M-face Poisson mesh can be simplified to 100k faces with minimal visual difference.
Mesh Export Formats
AquaMVS supports multiple mesh export formats:
PLY (Polygon File Format): Simple, widely supported. Stores vertices, faces, and colors. Binary or ASCII.
OBJ (Wavefront Object): Human-readable ASCII format. Stores geometry and can reference material files for textures.
STL (Stereolithography): Used for 3D printing. Stores only triangle geometry (no color). Requires normals (auto-computed).
glTF (GL Transmission Format): Modern web-friendly format. Supports animation, materials, and embedded textures.
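As an illustration of how simple the PLY layout is, the sketch below serializes a colored triangle mesh to ASCII PLY text. It is a hand-written writer for demonstration purposes with a hypothetical function name; it is not the export path AquaMVS uses.

```python
def ply_ascii(vertices, colors, faces):
    """Return ASCII PLY text for a mesh with per-vertex RGB colors."""
    lines = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(vertices)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        f"element face {len(faces)}",
        "property list uchar int vertex_indices",
        "end_header",
    ]
    for (x, y, z), (r, g, b) in zip(vertices, colors):
        lines.append(f"{x} {y} {z} {r} {g} {b}")
    for face in faces:
        # Each face line starts with its vertex count.
        lines.append(f"{len(face)} " + " ".join(str(i) for i in face))
    return "\n".join(lines) + "\n"

vertices = [(0.0, 0.0, 0.0), (0.005, 0.0, 0.0), (0.0, 0.005, 0.001)]
colors = [(200, 180, 150), (190, 170, 140), (210, 190, 160)]
faces = [(0, 1, 2)]
ply_text = ply_ascii(vertices, colors, faces)
```

Writing `ply_text` to a `.ply` file yields a mesh that common viewers can open. The binary variant stores the same elements more compactly.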
Connection to Code
The fusion and surface reconstruction algorithms are implemented in:
aquamvs.fusion.filter_depth_map(): Geometric consistency filtering.
aquamvs.fusion.fuse_point_clouds(): Multi-view point cloud merging.
aquamvs.surface.reconstruct_poisson(): Poisson surface reconstruction.
aquamvs.surface.reconstruct_heightfield(): Height-field reconstruction.
aquamvs.surface.reconstruct_bpa(): Ball pivoting algorithm.
For API details, see Reconstruction.
Summary
The fusion stage transforms per-camera depth maps into a unified 3D surface:
Geometric Consistency Filtering: Cross-view validation to remove outliers.
Point Cloud Generation: Back-projection with color assignment.
Outlier Removal: Statistical filtering to clean the point cloud.
Surface Reconstruction: Choose from Poisson (smooth), height-field (planar), or BPA (detailed).
Mesh Simplification: Reduce polygon count for practical use.
Export: Save as PLY, OBJ, STL, or glTF.
This completes the AquaMVS reconstruction pipeline: from camera pixels, through refractive ray tracing (Refractive Geometry) and dense stereo matching (Dense Stereo Matching), to final 3D surface meshes.