
test: NeRF | Make Datasets


(Feature image from Blender Artists Community - Donut 3.0)


NeRF Synthetic

The dataset nerf/data/nerf_synthetic/lego consists of three things:

  1. RGB images (train/),

  2. point cloud (points3d.ply),

  3. associated camera poses (transforms_train.json), where each frame has three attributes: file_path, rotation, and the c2w transform_matrix

    ```json
    {
      "camera_angle_x": 0.6911112070083618,
      "frames": [
          {
              "file_path": "./train/r_0",
              "rotation": 0.012566370614359171,
              "transform_matrix": [
                  [
                      -0.9999021887779236,
                      0.004192245192825794,
                      -0.013345719315111637,
                      -0.05379832163453102
                  ],
                  [
                      -0.013988681137561798,
                      -0.2996590733528137,
                      0.95394366979599,
                      3.845470428466797
                  ],
                  [
                      -4.656612873077393e-10,
                      0.9540371894836426,
                      0.29968830943107605,
                      1.2080823183059692
                  ],
                  [
                      0.0,
                      0.0,
                      0.0,
                      1.0
                  ]
              ]
          },
    ```
    
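The camera_angle_x field is the horizontal field of view in radians; combined with the image width it gives the focal length in pixels as focal = 0.5 * W / tan(0.5 * camera_angle_x). A minimal parsing sketch (the 800-pixel width of the standard Lego renders is an assumption here):

```python
import json
import numpy as np

def load_transforms(path, width=800):
    """Parse a NeRF-synthetic transforms_*.json file.

    Returns the focal length in pixels and a list of
    (file_path, 4x4 c2w matrix) pairs.
    """
    with open(path) as f:
        meta = json.load(f)
    # Horizontal FOV (radians) -> focal length in pixels.
    focal = 0.5 * width / np.tan(0.5 * meta["camera_angle_x"])
    frames = [(fr["file_path"], np.array(fr["transform_matrix"]))
              for fr in meta["frames"]]
    return focal, frames
```

For the Lego scene's camera_angle_x of 0.6911112070083618 this gives a focal length of roughly 1111.1 pixels.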

COLMAP Sparse

(2024-03-14)


Blender Add-On

(2024-03-20)


NeRFStudio


DTU

(2024-03-21)

Adapt the DTU dataset to the NeRF-synthetic dataset format. Concretely, the DTU extrinsics, which are w2c matrices following the OpenCV convention, must be converted to OpenGL-style c2w matrices.

The DTU dataset already provides a point cloud (.ply) and multi-view images.

(2024-04-02)

  • The extrinsics in DTU are w2c, so c2w = np.linalg.inv(extrinsics).

  • And because the camera coordinate system in OpenCV (used by DTU) is RDF, the four columns of c2w are [Right | Down | Front | CamCenter].

    In OpenGL, however, the c2w matrix should be [Right | Up | Back | CamCenter]:

    ```python
    import numpy as np

    # w2c extrinsic from DTU (OpenCV/RDF convention)
    Trsfm = np.array([[ 9.70263e-01, 7.47983e-03, 2.41939e-01, -1.91020e+02],
            [-1.47429e-02, 9.99493e-01, 2.82234e-02,  3.28832e+00],
            [-2.41605e-01,-3.09510e-02, 9.69881e-01,  2.25401e+01],
            [ 0.00000e+00, 0.00000e+00, 0.00000e+00,  1.00000e+00]])
    c2w_opencv = np.linalg.inv(Trsfm)
    print(c2w_opencv)
    c2w_opengl = c2w_opencv.copy()  # copy, or the negations below would also mutate c2w_opencv
    c2w_opengl[:, 1] = -c2w_opencv[:, 1]  # Down  -> Up
    c2w_opengl[:, 2] = -c2w_opencv[:, 2]  # Front -> Back
    ```
  • The other way is to adjust w2c directly:

    Each row of the w2c rotation matrix is a camera axis expressed in world coordinates, and the 4th column is $t_{w2c} = -R_{c2w}^{-1} t_{c2w} = -R_{w2c} C$, where $C$ is the camera center.

    To align with the OpenGL camera coordinate system (from RDF to RUB), the 2nd and 3rd rows of w2c must be negated:

    ```python
    w2c_opencv = Trsfm.copy()  # copy so Trsfm itself is not modified
    w2c_opencv[1] *= -1  # flip the Down axis row
    w2c_opencv[2] *= -1  # flip the Front axis row
    c2w_opengl_ = np.linalg.inv(w2c_opencv)
    print(np.allclose(c2w_opengl_, c2w_opengl))  # exact `==` may fail for floats
    ```

    The c2w_opengl_ equals the c2w_opengl computed above, up to floating-point error.
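The equivalence of the two routes follows from a diagonal flip matrix: with D = diag(1, -1, -1, 1) (its own inverse), negating the 2nd and 3rd rows of w2c is D @ w2c, and inv(D @ w2c) = inv(w2c) @ D, which is exactly negating the 2nd and 3rd columns of c2w. A quick numerical check, reusing the Trsfm matrix above:

```python
import numpy as np

# w2c extrinsic from the DTU example above (OpenCV/RDF convention)
Trsfm = np.array([[ 9.70263e-01, 7.47983e-03, 2.41939e-01, -1.91020e+02],
                  [-1.47429e-02, 9.99493e-01, 2.82234e-02,  3.28832e+00],
                  [-2.41605e-01,-3.09510e-02, 9.69881e-01,  2.25401e+01],
                  [ 0.00000e+00, 0.00000e+00, 0.00000e+00,  1.00000e+00]])

# D flips the Y and Z camera axes (RDF -> RUB); note D @ D = I.
D = np.diag([1.0, -1.0, -1.0, 1.0])

# Route 1: invert first, then flip columns 1 and 2 of c2w.
c2w_flip_cols = np.linalg.inv(Trsfm) @ D

# Route 2: flip rows 1 and 2 of w2c, then invert.
c2w_flip_rows = np.linalg.inv(D @ Trsfm)

# The two agree up to floating-point error.
assert np.allclose(c2w_flip_cols, c2w_flip_rows)
```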
