Some questions about the paper #58

Closed
zhoutianyang2002 opened this issue Aug 11, 2024 · 2 comments

Comments

@zhoutianyang2002

Hi!

Thank you for your excellent work!

I am new to 3D vision. May I ask some questions about the paper?

  1. I notice that, unlike other 3DGS avatar papers, the loss function has no perceptual term (e.g. VGG perceptual loss); only an L1 term and an SSIM term are used. Is that because this is empirically sufficient, or for other reasons?
  2. The paper unprojects all pixels of the two views into 3D space to form 3D Gaussians. Won't many Gaussians end up very close together in 3D space (because they come from corresponding points in the 2D images), leading to duplication and reduced efficiency?
  3. Formula (6) does not look like a matrix multiplication. Could the indices be wrong? In other words, should the correct form be $$C_{ijk}=\sum_{h}\left(\mathbf{f}_{l}^{S}\right)_{ijh} \cdot\left(\mathbf{f}_{r}^{S}\right)_{ihk}$$ or $$C_{ijk}=\sum_{h}\left(\mathbf{f}_{l}^{S}\right)_{ihk} \cdot\left(\mathbf{f}_{r}^{S}\right)_{hjk}$$ rather than $$C_{ijk}=\sum_{h}\left(\mathbf{f}_{l}^{S}\right)_{ijh} \cdot\left(\mathbf{f}_{r}^{S}\right)_{ikh}$$ as in the paper?

Sorry to bother you. Thank you very much!

@ShunyuanZheng
Collaborator

Hi, thanks for your interest!

  1. We tried using an LPIPS loss when training GPS-Gaussian but saw no significant improvement. Considering the additional memory usage, we do not use it in our pipeline. The L1 + SSIM loss, including its weights, follows the setup in 3DGS.
  2. Yes, the Gaussians are very close together and small compared to those in the original 3DGS. However, the number of Gaussian points does not significantly degrade efficiency. As reported in our supplementary material, rendering around 300 thousand Gaussians takes around 0.8 ms. Compressing GPS-Gaussian, as discussed in "about per-pixel gaussian allocation" #54 (comment), is worth in-depth research.
  3. Eq. 6 borrows from RAFT-Stereo.
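
For readers checking the indices in item 3: assuming each feature map is laid out as (row, column, channel), the form printed in the paper, $C_{ijk}=\sum_{h}(\mathbf{f}_{l}^{S})_{ijh}\cdot(\mathbf{f}_{r}^{S})_{ikh}$, can be read as a per-row matrix product of the left features with the transpose of the right features. The following is a minimal pure-Python sketch of that reading, not the repository's code; the function names, the (row, column, channel) layout, and the tiny random integer inputs are all assumptions made for illustration.

```python
import random

def correlation_volume(f_left, f_right):
    # Assumed layout: f[i][j][h] is channel h of the feature at row i, column j.
    # C[i][j][k] compares left pixel (i, j) with right pixel (i, k) on the same row.
    H, W, D = len(f_left), len(f_left[0]), len(f_left[0][0])
    return [[[sum(f_left[i][j][h] * f_right[i][k][h] for h in range(D))
              for k in range(W)]
             for j in range(W)]
            for i in range(H)]

def transpose(m):
    # Swap the two axes of a 2D nested list.
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    # Plain 2D matrix product on nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def random_features(H, W, D, rng):
    # Hypothetical toy data; integers keep the comparison below exact.
    return [[[rng.randint(-3, 3) for _ in range(D)] for _ in range(W)] for _ in range(H)]

rng = random.Random(0)
H, W, D = 2, 3, 4
f_l = random_features(H, W, D, rng)
f_r = random_features(H, W, D, rng)
C = correlation_volume(f_l, f_r)

# For each image row i, the (W x W) slice C[i] equals F_l[i] @ F_r[i]^T,
# where F[i] is the (W x D) matrix of features along that row.
rows_match = all(C[i] == matmul(f_l[i], transpose(f_r[i])) for i in range(H))
print(rows_match)
```

Under this layout the sum runs over the channel index `h`, which is last in both operands. The transpose is why the expression does not look like a textbook matrix product at first glance.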

@zhoutianyang2002
Author


Thank you for your reply! Best wishes!
