watch: Jun Gao | ML for 3D content generation

The representation of 3D objects

Implicit fields suit neural networks because they can be optimized by gradient descent.
Meshes render in real time, are handy for downstream content creation, and can have good topology.
Marching Cubes is not fully differentiable.

DMTet: a differentiable iso-surfacing method; the shape is both an implicit field and a mesh.
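
A minimal sketch of why the extracted surface can stay differentiable, assuming the usual linear interpolation of a signed distance value (SDF) along a tetrahedron edge. The function name `edge_zero_crossing` is made up for these notes and is not the DMTet code. The surface vertex is a smooth function of the two SDF values, so gradients on the mesh can flow back to the implicit field.

```python
def edge_zero_crossing(va, vb, sa, sb):
    """Point on segment va-vb where a linearly interpolated SDF equals zero.

    va, vb: 3D endpoints (tuples of floats).
    sa, sb: SDF values at va and vb; they must have opposite signs.
    """
    if sa * sb >= 0:
        raise ValueError("edge does not cross the surface")
    denom = sb - sa
    # v' = (va * sb - vb * sa) / (sb - sa): differentiable in sa and sb.
    return tuple((a * sb - b * sa) / denom for a, b in zip(va, vb))


# Hypothetical edge along the x axis with SDF values -0.25 and 0.75.
p = edge_zero_crossing((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), -0.25, 0.75)
print(p)
```

Changing either SDF value slides the vertex along the edge, which is the part a network can learn through.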

2D GAN advantages:

GET3D

The latent codes for geometry and texture are sampled from a 3D Gaussian prior.
3D generator: tri-planes constitute the implicit field.
Extract a mesh with DMTet from the generated geometry and texture, then render it to a 2D image.
Use a GAN discriminator to judge whether the render is real, and backpropagate the loss gradient.
Limitation: conditioned on class label. One model can generate only one category of objects.
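
A rough sketch of how a tri-plane can act as an implicit field: a 3D point is projected onto three axis-aligned feature planes, each plane is sampled bilinearly, and the results are combined. Assumptions for illustration only: one scalar channel per plane, coordinates in [0, 1], and summation as the aggregation. The names `bilinear` and `triplane_feature` are hypothetical, not from the talk.

```python
def bilinear(plane, u, v):
    """Bilinearly sample a square 2D grid (list of rows) at (u, v) in [0, 1]^2."""
    n = len(plane) - 1                     # number of cells per side
    x, y = u * n, v * n                    # continuous grid coordinates
    x0, y0 = min(int(x), n - 1), min(int(y), n - 1)
    tx, ty = x - x0, y - y0                # position inside the cell
    top = plane[y0][x0] * (1 - tx) + plane[y0][x0 + 1] * tx
    bottom = plane[y0 + 1][x0] * (1 - tx) + plane[y0 + 1][x0 + 1] * tx
    return top * (1 - ty) + bottom * ty


def triplane_feature(planes, point):
    """Sum the features sampled from the xy, xz and yz planes."""
    x, y, z = point
    return (bilinear(planes["xy"], x, y)
            + bilinear(planes["xz"], x, z)
            + bilinear(planes["yz"], y, z))


# Toy 2x2 planes: xy grows with x, xz grows with z, yz is constant.
planes = {
    "xy": [[0.0, 1.0], [0.0, 1.0]],
    "xz": [[0.0, 0.0], [1.0, 1.0]],
    "yz": [[2.0, 2.0], [2.0, 2.0]],
}
feature = triplane_feature(planes, (0.5, 0.2, 0.25))
print(feature)
```

In a real generator the planes hold many channels and a small MLP decodes the aggregated feature into geometry (for example an SDF value) or texture.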

2D diffusion models use a score function to encourage high-fidelity images.
The score function needs a full image, but NeRFs are trained on batches of rays, not full images.
DreamFusion can only render 64x64 images, so its geometry is low-quality.
Coarse to fine: use Instant-NGP to generate a rough geometry guided by a low-resolution diffusion model, then use DMTet to convert the geometry to a mesh. A high-resolution image can then be rendered, which provides a strong gradient for fine geometry.
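
For reference, the score-based gradient mentioned above is usually written in the form of DreamFusion's score distillation sampling (SDS). The notation below follows that paper rather than the talk: g is the differentiable renderer with 3D parameters θ, y is the text prompt, ε̂_φ is the pretrained noise predictor, and w(t) is a timestep weighting.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad x = g(\theta), \quad x_t = \alpha_t x + \sigma_t \epsilon
```

The noise predictor takes the whole noised render x_t as input, which is why a full image has to be rendered at every step, and why the render resolution limits how much detail the gradient can carry.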