Table of contents
Abstract
- Embed a diffusion model into a stereo matching network
- Adopt a multi-level network for high-resolution input
- Fuse the generated depth maps to reconstruct a 3D human model
Introduction
- Sparse-view methods, which predict geometry from appearance, cannot produce detailed human models because they lack sufficient multi-view stereo matching.
- Continuous models are essentially derived from traditional stereo methods based on a continuous variational formulation, which can be solved by a diffusion model.
- Pipeline:
  - Reconstruct a coarse field first using DoubleField
  - Render depth maps from multiple viewpoints
  - Compute disparity flow masks
  - Refine disparity flow with diffusion model
    - Level 1: Use CNN to extract feature maps of disparity flow masks
    - Level 2: Condition diffusion model with feature maps
  - Fuse 3D points through interpolation
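
The two-level refinement step above can be illustrated with a minimal sketch. It is not the paper's implementation. The names `extract_features`, `toy_noise_predictor`, and `refine_flow`, the linear noise schedule, and the step count are all assumptions made for illustration. The feature extractor is a hand-written stand-in for the CNN, and the noise predictor is an untrained placeholder for the conditioned diffusion network. Only the reverse update itself follows the standard DDPM sampling rule.

```python
import math
import random


def extract_features(flow):
    """Level 1 stand-in (assumption): per-pixel (value, horizontal gradient).

    A real system would use a learned CNN here.
    """
    feats = []
    for row in flow:
        last = len(row) - 1
        feats.append([(v, row[min(j + 1, last)] - v) for j, v in enumerate(row)])
    return feats


def toy_noise_predictor(x_t, t, feats):
    """Hypothetical, untrained eps(x_t, t | feats).

    Treats the deviation from the coarse flow as noise; a trained,
    feature-conditioned network would replace this function.
    """
    return [[x - f[0] for x, f in zip(xr, fr)] for xr, fr in zip(x_t, feats)]


def refine_flow(coarse_flow, steps=50, seed=0):
    """Level 2: DDPM-style reverse sampling conditioned on the features.

    Requires steps >= 2 because of the linear schedule below.
    """
    rng = random.Random(seed)
    # Linear beta schedule (an assumed choice, not taken from the source).
    betas = [1e-4 + (0.02 - 1e-4) * i / (steps - 1) for i in range(steps)]
    alpha_bars, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bars.append(prod)

    feats = extract_features(coarse_flow)
    # Start from Gaussian noise with the same shape as the flow map.
    x = [[rng.gauss(0.0, 1.0) for _ in row] for row in coarse_flow]

    for t in reversed(range(steps)):
        eps = toy_noise_predictor(x, t, feats)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        scale = 1.0 / math.sqrt(1.0 - betas[t])
        # No noise is added on the final step (t == 0).
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0
        x = [
            [scale * (xv - coef * ev) + sigma * rng.gauss(0.0, 1.0)
             for xv, ev in zip(xr, er)]
            for xr, er in zip(x, eps)
        ]
    return x


if __name__ == "__main__":
    coarse = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
    refined = refine_flow(coarse, steps=20, seed=1)
    print(len(refined), len(refined[0]))
```

Because the noise predictor is untrained, the refined values carry no meaning; the sketch only shows how the feature maps enter each reverse step as conditioning.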