EndoNeRF Dataset Cropping and Surgical Tool Artifacts in Rendering #17

Open
sandokim opened this issue Jan 22, 2025 · 2 comments

sandokim commented Jan 22, 2025

Hi, I'm Hyeseong!

I noticed that the EndoNeRF dataset is described as having dimensions of 512×640 in the ForPlane paper. However, in the actual code (e.g., video_datasets.py line 152), it looks like the height is being cropped to 500:

else:
    imgs = [i[:500, :, :] for i in imgs]
    masks = [i[:500, :, :] for i in masks]
    depths = [i[:500, :, :] for i in depths]
    intrinsics.height = 500  # this is a fix for the endo dataset
    intrinsics.center_y = intrinsics.center_y - 6

Can you explain the reason for setting the height to 500 instead of using the full 512 pixels? (I assume this is because there's a toolbar at the bottom of the frame, so it looks like it was cropped to exclude this.)
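For reference, the numbers in the snippet above are consistent with dropping the bottom 12 rows (512 to 500) and shifting center_y by half of that. Below is a minimal sketch of that arithmetic, assuming frames are H×W×C NumPy arrays. `crop_bottom` is a hypothetical helper for illustration and is not part of ForPlane.

```python
import numpy as np

def crop_bottom(frames, new_height):
    """Keep only the top `new_height` rows of each H x W x C frame."""
    return [f[:new_height, :, :] for f in frames]

orig_h, orig_w, new_h = 512, 640, 500

# Dummy frames standing in for the EndoNeRF images (assumption: H x W x C layout).
frames = [np.zeros((orig_h, orig_w, 3), dtype=np.uint8) for _ in range(2)]
cropped = crop_bottom(frames, new_h)

removed_rows = orig_h - new_h        # rows cut from the bottom of the frame
center_y_shift = removed_rows // 2   # equals the hard-coded offset in the quoted code
```

Whether shifting center_y by half of the removed rows is the right convention depends on how the intrinsics are interpreted downstream. The sketch only shows where the values 500 and 6 could come from.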

What's frustrating is that when I keep the original size of 512×640 without cropping, a surgical tool unexpectedly appears in the rendering output. Is this a known issue, and do you have any insight into why this happens or how to avoid it?

Image

Additionally, Gaussian Splatting methods trained on the EndoNeRF dataset render at a standardized 512×640 resolution, which is why I'm trying to keep the dataset at its original size.

Looking forward to your insight. Thanks!

Loping151 (Owner) commented

Yes, the crop was to exclude the bar, and all the experiments in our paper (including those on the original EndoNeRF) applied this cropping. However, I wouldn't expect to see the tools in the rendered image. With the masks applied, no part of the tool can be learned by the planes. You can check your masked images (image_gt[i] * mask[i]) to see whether the masks were applied incorrectly.
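The check suggested above could look roughly like the following. This is a sketch on synthetic data, assuming images are H×W×3 float arrays and masks are H×W arrays where 0 marks tool pixels and 1 marks tissue. The actual mask convention in the dataset loader may be inverted, and `apply_mask` is a hypothetical helper, not a ForPlane function.

```python
import numpy as np

def apply_mask(image, mask):
    """Zero out pixels where mask == 0.

    image: H x W x 3 array, mask: H x W or H x W x 1 array.
    """
    if mask.ndim == 2:
        mask = mask[..., None]  # add a channel axis so it broadcasts over RGB
    if image.shape[:2] != mask.shape[:2]:
        raise ValueError(
            f"shape mismatch: image {image.shape[:2]} vs mask {mask.shape[:2]}"
        )
    return image * mask

# Synthetic 512 x 640 frame with a bright rectangle standing in for the tool.
image = np.full((512, 640, 3), 0.5, dtype=np.float32)
image[100:200, 300:400] = 1.0

# Assumed convention: 0 on tool pixels, 1 elsewhere.
mask = np.ones((512, 640), dtype=np.float32)
mask[100:200, 300:400] = 0.0

masked = apply_mask(image, mask)
tool_region_max = float(masked[100:200, 300:400].max())  # 0.0 if the tool is fully masked
kept_fraction = float(mask.mean())                       # share of pixels left for training
```

If the tool is still visible after the multiplication, or `kept_fraction` is very small, the mask is probably inverted or misaligned with the image. Saving a few masked frames to disk and inspecting them by eye is a quick way to confirm.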

sandokim (Author) commented Feb 5, 2025

I modified class VideoEndoDataset in datasets/video_datasets.py as follows:

        if 'hamlyn' in datadir:
            crop_size = 40 # a setting for hamlyn dataset   
            imgs = [i[:, crop_size:, :] for i in imgs]
            masks = [i[:, crop_size:, :] for i in masks]
            depths = [i[:, crop_size:, :] for i in depths]
            intrinsics.width = intrinsics.width - crop_size
            # note the crop will change the intrinsics, we need to change the intrinsics accordingly
            intrinsics.center_x = intrinsics.center_x - crop_size / 2
            # timestamps = torch.linspace(0, 1000, len(paths_img))
            # self.timestamps = (timestamps.float() / max(timestamps)) * 2 - 1
            ### NOTE: we split the dataset into train and test, in train set, we use half of the images, in test set, we use the other half
            if split == 'train':
                self.timestamps = self.timestamps[::2]
                imgs = imgs[::2]
                depths = depths[::2]
                masks = masks[::2]
            else:
                self.timestamps = self.timestamps[1::2]
                imgs = imgs[1::2]
                depths = depths[1::2]
                masks = masks[1::2]
        elif 'Stereo' in datadir:
            pass
        elif 'endonerf' in datadir:
            pass
        else:
            # raise the error; a bare ValueError(...) expression is silently discarded
            raise ValueError("Invalid dataset type!")
        # else: 
        #     imgs = [i[:500, :, :] for i in imgs]
        #     masks = [i[:500, :, :] for i in masks]
        #     depths = [i[:500, :, :] for i in depths]
        #     intrinsics.height = 500 # this is a fix for the endo dataset
        #     intrinsics.center_y = intrinsics.center_y - 6

This change should not affect the multiplication of the image and the mask.

Thank you for your suggestion. I'll check the image * mask multiplication at the following lines and see whether that solves the problem:

https://github.com/Loping151/ForPlane/blob/main/forplanes/datasets/video_datasets.py#L611
https://github.com/Loping151/ForPlane/blob/main/forplanes/datasets/video_datasets.py#L646
