Hello, thank you for your excellent work. Please allow me to seek a bit of assistance. #14

Closed
ckh0715 opened this issue Oct 10, 2024 · 4 comments

Comments

@ckh0715

ckh0715 commented Oct 10, 2024

Can the provided code output the actual normal vectors (not only the visualization) and transform them correctly using the camera parameters? I believe your work could benefit some downstream tasks.

@haodong2000
Collaborator

Hi there, thanks so much for your interest!

However, I am a little confused by your question. The normal maps you see are saved directly from the network output:

Lotus/infer.py

Lines 179 to 180 in 1650b63

output_npy = pred
output_color = Image.fromarray((output_npy * 255).astype(np.uint8))

, without any camera parameters.
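
For downstream use, the saved array can be turned back into unit vectors. The following is a minimal sketch, assuming `pred` is an H×W×3 float array in [0, 1] (as the `* 255` scaling above suggests) and that it follows the common `n = 2 * pred - 1` encoding. That encoding is an assumption, not something confirmed in this thread, and `decode_normals` is a hypothetical helper, not part of `infer.py`:

```python
import numpy as np

def decode_normals(pred: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Map an (H, W, 3) array in [0, 1] to unit normal vectors.

    ASSUMPTION: the common n = 2 * pred - 1 encoding (not confirmed for Lotus).
    """
    # Shift and scale from [0, 1] to [-1, 1].
    normals = 2.0 * pred.astype(np.float64) - 1.0
    # Renormalize so every pixel holds a unit-length vector.
    norm = np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals / np.maximum(norm, eps)
```

If the encoding differs in practice, only the first line of the function body would need to change.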

What do you mean by "correctly transform them based on the camera parameters"? Do you mean lifting the normal map into a 3D point cloud and then applying a transformation?

Best,

@ckh0715
Author

ckh0715 commented Oct 12, 2024

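Thank you for your reply. My question is this: COLMAP, for example, can provide the camera calibration for the scene images. Can the normals produced by your method be transformed into the corresponding camera coordinate system?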

@haodong2000
Collaborator

haodong2000 commented Oct 13, 2024

Hi there, sorry for my late response.

For depth estimation

I've tried projecting the predicted depth map into 3D using the camera parameters (mainly the intrinsics), and it works quite well. Here is a 3D view of the teaser:

screen_rec.mp4
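
The projection described above can be sketched with a standard pinhole model. This is not the code used for the video; `unproject_depth` and its parameters (`fx`, `fy`, `cx`, `cy`) are hypothetical names, and the sketch assumes the depth is measured along the optical axis and is already correctly scaled. A prediction that is only defined up to scale and shift would need to be aligned first.

```python
import numpy as np

def unproject_depth(depth, fx, fy, cx, cy):
    """Lift an (H, W) depth map to an (H, W, 3) point map in camera coordinates.

    ASSUMPTION: pinhole camera, depth measured along the optical (z) axis.
    """
    h, w = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)
```

The resulting array can be reshaped to `(-1, 3)` and written out as a point cloud for viewing.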

For normal estimation

For normal maps, the model should predict relative normal directions, so I believe they can be transformed into world coordinates given the camera parameters (mainly the extrinsics?). Here is a toy example with a flipped input, demonstrating that the model predicts relative normal maps.

image

image
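
As a rough illustration of the transformation to world coordinates, the sketch below rotates camera-space normals using a COLMAP-style pose. COLMAP stores each image's pose as a world-to-camera quaternion in `(qw, qx, qy, qz)` order; normals are directions, so only the rotation applies and the translation is ignored. The function names are hypothetical, and the sketch assumes the predicted normals use the same camera axes as COLMAP (x right, y down, z forward). That convention is not stated in this thread, so some axis sign flips may be needed first.

```python
import numpy as np

def quat_to_rotmat(qw, qx, qy, qz):
    """Rotation matrix from a quaternion given in (w, x, y, z) order."""
    q = np.array([qw, qx, qy, qz], dtype=np.float64)
    w, x, y, z = q / np.linalg.norm(q)  # guard against slightly non-unit input
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def normals_cam_to_world(normals_cam, R_w2c):
    """Rotate (H, W, 3) camera-space normals into world space.

    R_w2c maps world to camera (x_cam = R_w2c @ x_world + t), so a direction
    goes back with the transpose: n_world = R_w2c.T @ n_cam. For row vectors
    stored along the last axis this is simply normals_cam @ R_w2c.
    """
    return normals_cam @ R_w2c
```

Going the other way, from world to camera, would use `normals_world @ R_w2c.T`.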

Thanks again for your attention!

@haodong2000 haodong2000 pinned this issue Oct 14, 2024
@haodong2000 haodong2000 unpinned this issue Oct 14, 2024
@ckh0715
Author

ckh0715 commented Oct 14, 2024

Thank you for your response, and once again, thank you for doing such great work.
