(Discussed in QQ group 706949479)
Notes
Abs
Previous NeRF methods do not exploit structural information at the image level; they train and predict point-wise.
Method
Steps:
- Apply SSIM to a randomly selected patch of training pixels, with kernel size $K$ (=2) and stride $S$ (=$K$).
- Repeatedly reorder the predicted and target pixel patches, computing SSIM on each reordering, $M$ (=10) times in total.
- The final loss term is one minus the average of these SSIM values, multiplied by a weight factor (hyperparameter) $\lambda$.
$$ L_{\mathrm{S3IM}} = \lambda \cdot \left( 1 - \frac{1}{M} \sum_{m=1}^{M} \mathrm{SSIM}\left( \mathrm{Patch}^{(m)}_{\mathrm{rendered}}, \mathrm{Patch}^{(m)}_{\mathrm{target}} \right) \right) $$
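The steps and the formula above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: `ssim_uniform` and `s3im_loss` are hypothetical names, the SSIM here uses a uniform window (common SSIM implementations use a Gaussian window), and the patch shape `patch_h` x `patch_w` and the constants `c1`, `c2` are assumptions for inputs in $[0, 1]$.

```python
import numpy as np

def ssim_uniform(x, y, kernel=2, stride=2, c1=0.01 ** 2, c2=0.03 ** 2):
    """Mean SSIM over kernel x kernel windows of two (H, W, C) arrays in [0, 1].

    Assumption: uniform window weights, chosen only to keep the sketch short.
    """
    H, W, C = x.shape
    vals = []
    for i in range(0, H - kernel + 1, stride):
        for j in range(0, W - kernel + 1, stride):
            a = x[i:i + kernel, j:j + kernel].reshape(-1, C)
            b = y[i:i + kernel, j:j + kernel].reshape(-1, C)
            mu_a, mu_b = a.mean(0), b.mean(0)
            var_a, var_b = a.var(0), b.var(0)
            cov = ((a - mu_a) * (b - mu_b)).mean(0)
            s = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
                (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
            )
            vals.append(s.mean())
    return float(np.mean(vals))

def s3im_loss(pred, target, patch_h, patch_w, M=10, kernel=2, stride=2,
              lam=1.0, rng=None):
    """lam * (1 - mean SSIM over M random reorderings of the pixel batch).

    pred, target: (N, C) rendered and ground-truth pixel colors,
    with N == patch_h * patch_w (hypothetical batch layout).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = pred.shape[0]
    assert n == patch_h * patch_w, "batch must fill the virtual patch"
    scores = []
    for _ in range(M):
        # The same permutation is applied to both sets so pixels stay paired.
        perm = rng.permutation(n)
        p = pred[perm].reshape(patch_h, patch_w, -1)
        t = target[perm].reshape(patch_h, patch_w, -1)
        scores.append(ssim_uniform(p, t, kernel, stride))
    return lam * (1.0 - float(np.mean(scores)))
```

With identical `pred` and `target` every window has SSIM 1, so the loss is zero; for non-negative inputs each window's SSIM lies in $[-1, 1]$, so the loss stays within $[0, 2\lambda]$.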
Comparison with SSIM:
- S3IM applied to random pixel patches significantly outperforms SSIM applied to local contiguous patches.
- The authors' explanation is that SSIM captures only local similarity, whereas S3IM compares nonlocal structural similarity across all training images.
- Training NeRF on local contiguous patches hurts performance (as stated at the end of Section 3.1).
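To make the distinction between the two kinds of patches concrete, here is a small sketch of how their pixel indices might be drawn. It assumes all training pixels are stored in one flat pool of size `num_images * H * W`; the function names and this layout are hypothetical and only serve the illustration.

```python
import numpy as np

def sample_local_patch(num_images, H, W, ph, pw, rng):
    """Flat indices of one contiguous ph x pw window inside a single image."""
    img = rng.integers(num_images)
    top = rng.integers(H - ph + 1)
    left = rng.integers(W - pw + 1)
    rows = np.arange(top, top + ph)[:, None]
    cols = np.arange(left, left + pw)[None, :]
    return (img * H * W + rows * W + cols).reshape(-1)

def sample_stochastic_patch(num_images, H, W, ph, pw, rng):
    """Flat indices of ph * pw distinct pixels drawn from all images at once."""
    return rng.choice(num_images * H * W, size=ph * pw, replace=False)

rng = np.random.default_rng(0)
local_idx = sample_local_patch(10, 32, 32, 8, 8, rng)
random_idx = sample_stochastic_patch(10, 32, 32, 8, 8, rng)
```

The local sampler always stays inside one image, while the stochastic sampler mixes pixels from many images, which is the setting in which S3IM compares nonlocal structure.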
Play