Sympo: Diffusion

Collections

References:

moatifbutt/awesome-diffusion-iclr-2025 - GitHub
- Surfaced when searching for the IC-Light paper on DuckDuckGo (DDG)

SyncDreamer: Generating Multiview-consistent Images from a Single-view Image

Code | ProjPage

Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model

ProjPage

(2023-12-26)
- DM for single-image depth estimation

HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion

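[Diffusion generates NeRF] TUM and Apple propose HyperDiffusion: using diffusion to compute neural field weights, generating 3D weights or 4D animations under a unified framework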

(2023-07-16)
1. Use DM to generate a NeRF.
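2. Comment: “The idea is not as broad as Shap-E's. Shap-E's encoder not only compresses 3D assets into an MLP, but also supports both NeRF and DMTet representations, and its diffusion on the MLP is conditional. By comparison, it is still unclear what this paper's selling point is.”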

MVDD: Multi-View Depth Diffusion Models

Arxiv | Emergent

(2023-12-31)
1. Use DM to generate multi-view depth maps for point cloud generation.
  - 20K+ points. The number of valid points can be no larger than the resolution of an image, because depth and geometric consistency need to be checked, as in the point cloud fusion performed in MVSNet.
  - Depth map fusion
2. Epipolar attention affects the denoising steps.
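
A minimal sketch of why the number of valid points is bounded by the image resolution: each pixel of a depth map back-projects to at most one 3D point, and a point survives only if a second view's depth agrees with it. This is not MVDD's or MVSNet's code. The pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the function names, the relative tolerance, and the assumption that the second camera differs only by a translation `t` are all hypothetical simplifications.

```python
def back_project(depth, fx, fy, cx, cy):
    """Lift every pixel with a positive depth to a 3D point in the camera frame."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:  # zero marks an invalid pixel
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def is_consistent(point, other_depth, fx, fy, cx, cy, t=(0.0, 0.0, 0.0), rel_tol=0.01):
    """Check a point against a second depth map whose camera center sits at offset t
    (same orientation as the first camera, which is a simplifying assumption)."""
    x, y, z = point[0] - t[0], point[1] - t[1], point[2] - t[2]
    if z <= 0:
        return False  # behind the second camera
    u, v = round(fx * x / z + cx), round(fy * y / z + cy)
    if not (0 <= v < len(other_depth) and 0 <= u < len(other_depth[0])):
        return False  # projects outside the second image
    d = other_depth[v][u]
    return d > 0 and abs(d - z) / z < rel_tol


# Toy 2x2 depth maps: one invalid pixel in view A, one disagreeing pixel in view B.
depth_a = [[2.0, 2.0], [0.0, 2.0]]
depth_b = [[2.0, 3.0], [0.0, 2.0]]
points = back_project(depth_a, 1.0, 1.0, 0.5, 0.5)
fused = [p for p in points if is_consistent(p, depth_b, 1.0, 1.0, 0.5, 0.5)]
```

Here `points` can never hold more entries than there are pixels, and `fused` only shrinks that set further.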

EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion

Code | Emergent

(2023-12-31) (Possibly by 美貌与智慧并重 and his group; is he at VAST?)
1. DM conditioned by a single image for generating multi-view images.
2. Constrain the frozen diffusion model with epipolar cross-view attention
  - Reminds me of MVDiffusion
3. Generate 16 multi-view images in 12 seconds
  - What is the resolution?
  - What is the device?
4. Adjusting feature maps to control image generation
  - No 3D geometry. I believe explicit structure is necessary for multi-view consistency, especially for views with large baselines.
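
The geometry behind an epipolar constraint can be sketched as follows: for a pixel in one view, its match in another view must lie on the line given by the essential matrix, so candidates far from that line can be ignored. This is an illustration of the constraint only, not EpiDiff's attention module. The function names, the convention `X2 = R * X1 + t`, and the use of normalized image coordinates (identity intrinsics) are assumptions.

```python
import math


def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) applied to v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def epipolar_line(R, t, x1):
    """Line (a, b, c) in view 2 on which the match of x1 = (x, y) in view 1 must lie.

    Assumes X2 = R * X1 + t and normalized image coordinates.
    """
    E = matmul(skew(t), R)  # essential matrix
    p = (x1[0], x1[1], 1.0)
    return tuple(sum(E[i][k] * p[k] for k in range(3)) for i in range(3))


def distance_to_line(line, x2):
    """Perpendicular distance from the point x2 = (x, y) to the line a*x + b*y + c = 0."""
    a, b, c = line
    return abs(a * x2[0] + b * x2[1] + c) / math.hypot(a, b)


# Pure sideways camera shift: epipolar lines are horizontal (same y in both views).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
line = epipolar_line(identity, (1.0, 0.0, 0.0), (0.2, 0.3))
on_line = distance_to_line(line, (0.7, 0.3))   # candidate on the epipolar line
off_line = distance_to_line(line, (0.7, 0.5))  # candidate an epipolar constraint would exclude
```

The constraint narrows where to look, but it does not say where along the line the match sits; that still depends on depth, which is the missing 3D geometry noted above.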

RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D

Emergent

(2024-01-05)
1. Generalizable Normal-Depth diffusion model
2. PBR (physically based rendering)

Cameras as Rays: Pose Estimation via Ray Diffusion ~ ICLR 2024 (Oral)

ProjPage | Code | CMU

(2024-03-01)
- Generate ray moments and ray directions with a diffusion model.
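
For reference, a ray's direction and moment together form its Plücker coordinates: for a ray through point `o` with unit direction `d`, the moment is `m = o × d`. The sketch below shows only this representation, not the paper's diffusion model. The helper names are hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def plucker_ray(origin, direction):
    """Return (d, m): unit direction and moment m = origin x d."""
    d = normalize(direction)
    return d, cross(origin, d)


# A ray leaving a camera centered at (1, 0, 0) and pointing along +z.
d, m = plucker_ray((1.0, 0.0, 0.0), (0.0, 0.0, 2.0))

# The moment does not depend on which point of the ray is used as the origin.
_, m_other = plucker_ray((1.0, 0.0, 5.0), (0.0, 0.0, 2.0))

# For unit d, d x m is the point on the ray closest to the world origin.
closest = cross(d, m)
```

Because the moment is the same for every point along the ray, a set of such rays pins down the camera center as the point where they intersect.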

References:

Training Model From Scratch:

(2024-12-01)

IC-Light ^{r1-OpenReview}
- Related:
  1. Pdf: Lvmin Zhang
  2. lllyasviel/IC-Light
- Reasons:
  1. This paper drew my attention because it involves light transport.
- Q&A:
  1. How does this method combine with Light Transport?
  2. Is the training process similar to NeRF's, which integrates differentiable rendering (i.e., volumetric rendering) into the “pipeline to fulfill the task”?
- Bonds:
  1. “in-the-wild data” reminds me of NeRF-in-the-wild, which separates transient and consistent content using two gates.
  2. “linear blending” of lighting effects under each single illumination condition.
    - Weighted sum, which the NN is good at.
    - I recall that text prompts to diffusion models exhibit arithmetic properties, as demonstrated in the DLAI (Andrew Ng) short course.
  3. Diffusion-based illumination editing method
    - Lvmin is committed to helping artists ^r2-Paints.
- Ideas:
  1. Improper training constraints result in a “Structure-guided random image generator”.
  2. Complex illumination > Mixture of illuminations > Approximated with $k$ diffusion models.
- Questions:
  1. Can the mixture of diffusion models be replaced with a Gaussian mixture model?

    What are the similarities between a mixture of diffusion models and a Gaussian mixture model?
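
The “linear blending” bond noted under Bonds rests on the linearity of light transport: the appearance under a weighted sum of light sources equals the same weighted sum of the appearances under each source alone. Below is a toy sketch of that property only, not IC-Light's training code. The `render` function, the 3-pixel by 2-light `transport` matrix, and its values are hypothetical.

```python
def render(transport, lights):
    """Pixel values from a linear light-transport matrix applied to light intensities."""
    return [sum(t * l for t, l in zip(row, lights)) for row in transport]


def blend(images, weights):
    """Weighted sum of images, each given as a flat list of pixel values."""
    n_pixels = len(images[0])
    return [sum(w * img[i] for w, img in zip(weights, images)) for i in range(n_pixels)]


# Hypothetical scene: 3 pixels lit by 2 light sources.
transport = [[0.5, 0.1],
             [0.2, 0.4],
             [0.0, 0.3]]

image_a = render(transport, [1.0, 0.0])  # only light A on
image_b = render(transport, [0.0, 1.0])  # only light B on

# Rendering under the mixed lighting 2*A + 3*B ...
mixed = render(transport, [2.0, 3.0])
# ... matches blending the two single-light images with the same weights.
blended = blend([image_a, image_b], [2.0, 3.0])
```

In this toy setting `mixed` and `blended` agree up to floating-point error, which is the consistency that a weighted-sum view of illumination relies on.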