memo: Vis | Image Read/Write

White background

Code from 3DGS

import numpy as np
from pathlib import Path
from PIL import Image
image_path = Path(path) / cam_name  # '/home/yi/Downloads/nerf/data/nerf_synthetic/lego/./train/r_0.png'
image_name = Path(cam_name).stem    # r_0
image = Image.open(image_path)  # <class 'PIL.PngImagePlugin.PngImageFile'>

im_data = np.array(image.convert("RGBA"))   # (800,800,4)

bg = np.array([1,1,1]) if white_background else np.array([0, 0, 0])

# Normalize to [0,1]
norm_data = im_data / 255.0

# Re-composite
img_arr = norm_data[...,:3] * norm_data[...,-1:] + bg * (1-norm_data[...,-1:])

# Convert to image
img_rgb = Image.fromarray((img_arr * 255.0).astype(np.uint8), 'RGB')  # np.uint8, not np.byte (int8), which would wrap values > 127
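The compositing formula above (`rgb * alpha + bg * (1 - alpha)`) can be checked standalone with a tiny synthetic RGBA array (values made up for illustration):

```python
import numpy as np

# Synthetic 2x2 RGBA image: one opaque red pixel, one fully transparent pixel
im_data = np.zeros((2, 2, 4), dtype=np.uint8)
im_data[0, 0] = [255, 0, 0, 255]   # opaque red
im_data[0, 1] = [0, 255, 0, 0]     # fully transparent

bg = np.array([1, 1, 1])           # white background
norm = im_data / 255.0
out = norm[..., :3] * norm[..., -1:] + bg * (1 - norm[..., -1:])

print(out[0, 0])  # [1. 0. 0.] -> opaque pixel keeps its color
print(out[0, 1])  # [1. 1. 1.] -> transparent pixel becomes the background
```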

Scaling image

Resizing an image requires scaling the focal lengths along with it, while cropping does not.

  1. Downsizing h and w also downscales the focal length by the same factor, see NeRF:

    
    factor = 4     # shrink raw image to 1/4
    args = ' '.join(['mogrify', '-resize', f'{100./factor}%', '-format', 'png', '*.{}'.format(ext)])
    ...
    # the 5th column is hwf
    poses[:2, 4, :] = np.array(sh[:2]).reshape([2, 1]) # hw
    poses[2, 4, :] = poses[2, 4, :] * 1./factor  # focal
    
    (Figure: from afar the eye sees 4 blocks; moving closer, only 1 tile fills the view.)
    • Example: CasMVSNet has 3 levels of feature maps, so the first two rows of the camera intrinsics are scaled up as the image size increases:

      
      for l in reversed(range(self.levels)):
        intrinsics[:2] *= 2 # 1/4->1/2->1
      
  2. Cropping a patch doesn’t affect the focal lengths (only the principal point shifts), referring to GNT
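The two cases can be sketched with a toy intrinsics matrix (the values are made up; only the scaling pattern matters):

```python
import numpy as np

# Toy camera intrinsics: [fx, 0, cx; 0, fy, cy; 0, 0, 1]
K = np.array([[1111.1, 0.0, 400.0],
              [0.0, 1111.1, 400.0],
              [0.0, 0.0, 1.0]])

# Resize by 1/factor: focal lengths AND principal point scale together
factor = 4
K_resized = K.copy()
K_resized[:2] /= factor

# Crop a patch starting at (x0, y0): focals unchanged, principal point shifts
x0, y0 = 100, 50
K_cropped = K.copy()
K_cropped[0, 2] -= x0
K_cropped[1, 2] -= y0

print(K_resized[0, 0])  # 277.775  (focal scaled)
print(K_cropped[0, 0])  # 1111.1   (focal unchanged)
```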


ToTensor

from torchvision import transforms

fov

Code from 3DGS

(Figure: image plane at focal length f; near plane at z_near, with left/right boundaries.)
  • Field of view: fovX = $2* arctan(\frac{width}{2f})$
  • Near plane’s right boundary: $z_{near} \cdot \tan(fovX/2)$
  1. Convert fov to focal

    
    def fov2focal(fov, width):   # 1111.11103, 800
        return width / (2 * math.tan(fov / 2))
    
  2. Near plane computed from fov

    
    tanHalfFovY = math.tan((fovY / 2))
    tanHalfFovX = math.tan((fovX / 2))
    
    top = tanHalfFovY * znear
    bottom = -top
    right = tanHalfFovX * znear
    left = -right
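`fov2focal` and its inverse `focal2fov` (both in the 3DGS utilities) should round-trip; a quick sanity check with the numbers from the comment above:

```python
import math

def fov2focal(fov, width):
    return width / (2 * math.tan(fov / 2))

def focal2fov(focal, width):
    return 2 * math.atan(width / (2 * focal))

width = 800
focal = 1111.11103
fov = focal2fov(focal, width)  # ~0.691 rad (~39.6 deg)
print(abs(fov2focal(fov, width) - focal) < 1e-6)  # True: round-trips exactly
```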
    

Pixel Coords

(2024-03-14)

  1. np.mgrid. Example from casmvsnet_pl

    
    xy_ref = np.mgrid[:args.img_wh[1],:args.img_wh[0]][::-1] # (2, args.img_h, args.img_w)
    # restore depth for (x,y):
    xyz_ref = np.vstack((xy_ref, np.ones_like(xy_ref[:1]))) * depth_refined[ref_vid]    # (3:xyz, h,w)
    
  2. np.meshgrid. Example from MVSNet_pytorch

    
    xx, yy = np.meshgrid(np.arange(0, width), np.arange(0, height))
    print("yy", yy.max(), yy.min())
    yy = yy.reshape([-1])
    xx = xx.reshape([-1])
    X = np.vstack((xx, yy, np.ones_like(xx)))
    
  3. torch.meshgrid. Example from MVSNet_pytorch

    
    y, x = torch.meshgrid([torch.arange(0, height),
                           torch.arange(0, width)])  # pass indexing='ij' on PyTorch >= 1.10 to silence the deprecation warning
    y, x = y.contiguous(), x.contiguous()
    y, x = y.view(height * width), x.view(height * width)
    xyz = torch.stack((x, y, torch.ones_like(x)))  # [3, H*W]
    xyz = torch.unsqueeze(xyz, 0).repeat(batch, 1, 1)  # [B, 3, H*W]
    
  4. torch.cartesian_prod, see Docs

    
    h, w = ref.shape[:2]
    vu = torch.cartesian_prod(torch.arange(h), torch.arange(w))
    uv = torch.flip(vu, [1])  # (hw,2), As x varies, y is fixed
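A quick check that the four constructions agree, i.e. they all produce the same (x, y) pixel grids, just through different APIs (toy h, w values):

```python
import numpy as np
import torch

h, w = 3, 4

# 1. np.mgrid: rows come out as (y, x); [::-1] flips to (x, y)
xy_ref = np.mgrid[:h, :w][::-1]                       # (2, h, w)

# 2. np.meshgrid: returns the x and y grids directly
xx, yy = np.meshgrid(np.arange(w), np.arange(h))      # each (h, w)

# 3. torch.meshgrid with explicit 'ij' indexing
y, x = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")

# 4. torch.cartesian_prod: (v, u) rows, flipped to (u, v) = (x, y)
vu = torch.cartesian_prod(torch.arange(h), torch.arange(w))
uv = torch.flip(vu, [1])                              # (h*w, 2)

print(np.array_equal(xy_ref[0], xx))                  # True
print(np.array_equal(xy_ref[1], yy))                  # True
print(np.array_equal(x.numpy(), xx))                  # True
print(np.array_equal(uv[:, 0].reshape(h, w).numpy(), xx))  # True
```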
    

Write Image

(2024-04-02)

  • Example of cv2 in casmvsnet_pl

    
    import matplotlib.pyplot as plt
    from PIL import Image # pillow
    import cv2
    import numpy as np
    
    root_dir = "/mnt/data2_z/MVSNet_testing/dtu"
    img_path = f'{root_dir}/scan1/images/00000000.jpg'
    
    img = np.array(Image.open(img_path)) # RGB
    fig, ax = plt.subplots(1,2)
    ax[0].imshow(img)
    cv2.imwrite(f'1.png', img[:,:,::-1])  # save
    
    ax[1].imshow(cv2.imread(img_path)) # nd array, BGR
    
    img_read = cv2.imread(img_path)[:,:, ::-1] # RGB
    print((img_read == img).all())
    