Coordinate system of normal maps #40

Closed
waps101 opened this issue Jul 9, 2020 · 1 comment

waps101 commented Jul 9, 2020

In your synthetic dataset, could you confirm that the normal maps are in camera coordinates as opposed to world coordinates? If I compute a normal map from the depth maps using finite differences and the intrinsic camera parameters, I can't get something that looks close to the ground truth.

Also, I wonder if there is quite heavy quantisation in the normal/depth maps? I think they are stored with 16-bit depth. Is that right?
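
One quick way to check for quantisation is to count the distinct values in a loaded map and look at the smallest gap between them. The sketch below assumes the map is already available as a NumPy array; `quantisation_summary` is a hypothetical helper written for illustration, not part of the dataset tooling.

```python
import numpy


def quantisation_summary(values):
    # values: array of depth (or normal component) samples loaded from disk
    flat = numpy.asarray(values).ravel()
    unique = numpy.unique(flat)
    # Heavy quantisation shows up as few distinct levels separated by a constant step.
    step = float(numpy.diff(unique).min()) if unique.size > 1 else 0.0
    return {'dtype': str(flat.dtype), 'levels': int(unique.size), 'min_step': step}


# Synthetic example: a ramp rounded to 16 levels.
ramp = (numpy.round(numpy.linspace(0.0, 1.0, 1000) * 15.0) / 15.0).astype(numpy.float32)
summary = quantisation_summary(ramp)
```

A map stored as floating point with many thousands of distinct levels would suggest that quantisation is not the main source of the mismatch.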

sniklaus (Owner) commented Jul 9, 2020

Thank you for bringing this up! Yes, the normal maps are in camera space and not world space. I am under the impression that world space normal maps would be meaningless without camera extrinsics, so I opted to extract/provide them in camera space.

I did the same as you and wrote a script to approximate the normal from the provided ground truth depth. The following GIF shows the input image next to the ground truth normal and the normal approximated from the ground truth depth. Looks good to me, but please correct me if there are any issues with it that I am not aware of.

[GIF: input image, ground truth normal, and normal approximated from the ground truth depth]

You can find the script below. It inverts the Y axis of the ground truth normal map, since the coordinate system of Unreal differs from the one I am using for the approximation. It also sets the normal for the sky (depth of 50,000 or more) to a well-defined value, (0, 0, -1), since one would otherwise get the normal of the sphere/box in which the virtual environment resides. To approximate the normal from the depth, the script uses the cross product of the vectors between neighboring points in 3D space. Note that the depth is first smoothed with a Gaussian filter to reduce noise; this is a very simple approximation after all.

#!/usr/bin/env python

import cv2
import json
import math
import moviepy
import moviepy.editor # present in MoviePy 1.x; MoviePy 2.x removed this module
import numpy

##########################################################

npyImages = []

for intSample in range(1, 20):
	npyImage = numpy.ascontiguousarray(cv2.imread(filename=str(intSample).zfill(5) + '-bl-image.png', flags=-1).astype(numpy.float32) * (1.0 / 255.0))
	npyDepth = numpy.ascontiguousarray(cv2.imread(filename=str(intSample).zfill(5) + '-bl-depth.exr', flags=-1)[:, :, None].astype(numpy.float32))
	npyNormal = numpy.ascontiguousarray(cv2.imread(filename=str(intSample).zfill(5) + '-bl-normal.exr', flags=-1).astype(numpy.float32))
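	# flip Y since the coordinate system of Unreal differs from the one used for the approximation
	npyNormal[:, :, 1:2] *= -1.0
	# sky pixels (depth of 50000 or more) would otherwise carry the normal of the enclosing sphere/box
	npyNormal[:, :, 0:1][npyDepth >= 50000.0] = 0.0
	npyNormal[:, :, 1:2][npyDepth >= 50000.0] = 0.0
	npyNormal[:, :, 2:3][npyDepth >= 50000.0] = -1.0
	# renormalize to unit length
	npyNormal /= numpy.linalg.norm(npyNormal, 2, 2, True).repeat(3, 2)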

	intWidth = npyImage.shape[1]
	intHeight = npyImage.shape[0]
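	with open(str(intSample).zfill(5) + '-meta.json', 'r') as objFile:
		fltFov = json.loads(objFile.read())['fltFov']
	# end
	# focal length in pixels, equal to 0.5 * max(width, height) / tan(0.5 * fov)
	fltFocal = 0.5 * max(intWidth, intHeight) * math.tan(math.radians(90.0) - (0.5 * math.radians(fltFov)))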

	npyPinholeX = numpy.linspace((-0.5 * intWidth) + 0.5, (0.5 * intWidth) - 0.5, intWidth).reshape(1, intWidth).repeat(intHeight, 0).astype(numpy.float32)[:, :, None] * (1.0 / fltFocal)
	npyPinholeY = numpy.linspace((-0.5 * intHeight) + 0.5, (0.5 * intHeight) - 0.5, intHeight).reshape(intHeight, 1).repeat(intWidth, 1).astype(numpy.float32)[:, :, None] * (1.0 / fltFocal)
	npyPinholeZ = numpy.ones([intHeight, intWidth, 1], numpy.float32)
	# smooth the depth to reduce noise, then back-project every pixel to a 3D point
	npyPoints = cv2.GaussianBlur(src=npyDepth, ksize=(3, 3), sigmaX=0.0, sigmaY=0.0)[:, :, None]
	npyPoints = numpy.concatenate([npyPinholeX * npyPoints, npyPinholeY * npyPoints, npyPinholeZ * npyPoints], 2)
	# backward differences between horizontally and vertically neighboring points
	npyDiffX = numpy.pad(npyPoints, [(0, 0), (1, 0), (0, 0)], 'constant')
	npyDiffX = npyDiffX[:, 1:, :] - npyDiffX[:, :-1, :]
	npyDiffY = numpy.pad(npyPoints, [(1, 0), (0, 0), (0, 0)], 'constant')
	npyDiffY = npyDiffY[1:, :, :] - npyDiffY[:-1, :, :]
	# the approximated normal is the normalized cross product of the two difference vectors
	npyApprox = numpy.cross(npyDiffY, npyDiffX, 2)
	npyApprox /= numpy.linalg.norm(npyApprox, 2, 2, True).repeat(3, 2)

	npyImages.append(cv2.resize(src=(numpy.concatenate([npyImage, (npyNormal + 1.0) * 0.5, (npyApprox + 1.0) * 0.5], 1) * 255.0).clip(0.0, 255.0).astype(numpy.uint8), dsize=None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA))
# end

moviepy.editor.ImageSequenceClip(sequence=[npyImage[:, :, ::-1] for npyImage in npyImages], fps=5).write_gif('normal.gif')
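
The approximation step of the script above can also be isolated into a small function and checked against planes with a known normal. This is only a minimal sketch under a few assumptions: the principal point sits at the image center, the field of view spans the longer image side (as in the script), and the names `focal_from_fov` and `normals_from_depth` are hypothetical. Unlike the script, it skips the Gaussian smoothing and replicates the neighboring difference at the image border instead of zero-padding.

```python
import math

import numpy


def focal_from_fov(width, height, fov_degrees):
    # Focal length in pixels; same value as the tan(90 deg - fov / 2) form in the script.
    return 0.5 * max(width, height) / math.tan(0.5 * math.radians(fov_degrees))


def normals_from_depth(depth, focal):
    # depth: (H, W) array of depth along the optical axis; focal: focal length in pixels.
    height, width = depth.shape
    # Pixel-center coordinates relative to the image center, divided by the focal length.
    xs = (numpy.arange(width, dtype=numpy.float64) - 0.5 * (width - 1)) / focal
    ys = (numpy.arange(height, dtype=numpy.float64) - 0.5 * (height - 1)) / focal
    grid_x, grid_y = numpy.meshgrid(xs, ys)
    # Back-project every pixel to a 3D point in camera space.
    points = numpy.stack([grid_x * depth, grid_y * depth, depth], axis=2)

    # Backward differences between neighboring 3D points; the first row and
    # column reuse the difference of their neighbor (assumption, not in the script).
    diff_x = numpy.zeros_like(points)
    diff_y = numpy.zeros_like(points)
    diff_x[:, 1:] = points[:, 1:] - points[:, :-1]
    diff_x[:, 0] = diff_x[:, 1]
    diff_y[1:, :] = points[1:, :] - points[:-1, :]
    diff_y[0, :] = diff_y[1, :]

    # Same operand order as the script, so normals point toward the camera (negative Z).
    normals = numpy.cross(diff_y, diff_x, axis=2)
    norm = numpy.linalg.norm(normals, axis=2, keepdims=True)
    return normals / numpy.maximum(norm, 1e-12)


# Sanity check on a constant-depth plane facing the camera.
focal = focal_from_fov(10, 8, 90.0)
flat_normals = normals_from_depth(numpy.full((8, 10), 5.0), focal)
```

For a plane of constant depth this returns (0, 0, -1) at every pixel, which matches the value the script assigns to sky pixels.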

Thanks again for bringing this up! Closing this issue for now; please let me know if something is still unclear or if there are any issues with my script. Thanks!

sniklaus closed this as completed Jul 9, 2020