
How to understand _rle2voxel function #101

Closed
JJY0710 opened this issue May 28, 2024 · 6 comments

JJY0710 commented May 28, 2024

Hello, I have some questions about the ground-truth processing of the NYU dataset. In this _rle2voxel function:

def _rle2voxel(rle, voxel_size=(240, 144, 240), rle_filename=""):
    # rle stores flat (value, run-length) pairs: rle[2*i] is a class
    # value and rle[2*i + 1] is how many consecutive voxels take it.
    # seg_class_map is a module-level lookup that maps the 37 NYU
    # classes down to 12; 255 is kept as the "unknown" label.
    seg_label = np.zeros(
        int(voxel_size[0] * voxel_size[1] * voxel_size[2]), dtype=np.uint8
    )  # flat segmentation label buffer
    vox_idx = 0
    for idx in range(rle.shape[0] // 2):
        check_val = rle[idx * 2]       # class value of this run
        check_iter = rle[idx * 2 + 1]  # length of this run
        if check_val >= 37 and check_val != 255:
            print("RLE {} check_val: {}".format(rle_filename, check_val))
        seg_label_val = seg_class_map[check_val] if check_val != 255 else 255
        seg_label[vox_idx : vox_idx + check_iter] = seg_label_val
        vox_idx += check_iter
    seg_label = seg_label.reshape(voxel_size)  # 3D array, 240 x 144 x 240
    return seg_label

First, why is rle indexed at even and odd positions? check_val looks like a class label, but how should I understand check_iter? Does it represent a dimension? And how does it correspond to the class labels on the RGB images?
Hope to receive your reply, thank you.

@anhquancao
Collaborator

Hi, I used the code from this source. I guess the implementation of the RLE format is based on the method described in the initial SSC paper.
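For intuition, the RLE format here is just flat (value, run-length) pairs: even indices hold a class value and odd indices hold how many consecutive voxels take that value. A minimal decoding sketch (the name rle_decode is my own, not from the repo):

```python
import numpy as np

def rle_decode(rle):
    """Decode a flat run-length encoding into a 1D array.

    rle[0::2] are the values and rle[1::2] are the run lengths;
    the decoded array repeats each value by its run length.
    """
    values = rle[0::2]
    counts = rle[1::2]
    return np.repeat(values, counts)

# Example: pairs (0, 3), (5, 2), (255, 4)
rle = np.array([0, 3, 5, 2, 255, 4])
print(rle_decode(rle))  # [  0   0   0   5   5 255 255 255 255]
```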


JJY0710 commented May 29, 2024

> Hi, I used the code from this source. I guess the implementation of the RLE format is based on the method described in the initial SSC paper.

Now that I have RGB and depth images captured by a depth camera, as well as semantic annotations on the images, how should I train your project?

@anhquancao
Collaborator

To incorporate depth information into a 3D scene, similar to the approach used in BundleFusion, start by integrating the depth data. After this integration, convert the scene into a voxelized format. Assign semantic labels to each voxel using the semantic labels derived from the corresponding images.
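As a rough illustration of the last step, each voxel center can be projected into a frame's semantic map, with a depth check to skip occluded voxels. This single-view sketch is my own simplification (all names are hypothetical, not code from this repo); a real pipeline would fuse labels over many frames, e.g. by per-voxel majority vote:

```python
import numpy as np

def label_voxels(voxel_centers, seg_image, depth_image, cam_intr, cam_pose,
                 depth_tol=0.05):
    """Assign a semantic label to each voxel center from ONE camera view.

    voxel_centers: (N, 3) world coordinates
    seg_image:     (H, W) per-pixel semantic labels for this frame
    depth_image:   (H, W) depth map in meters
    cam_intr:      (3, 3) pinhole intrinsics
    cam_pose:      (4, 4) camera-to-world transform
    Returns (N,) labels; 255 marks voxels with no valid observation.
    """
    H, W = depth_image.shape
    labels = np.full(len(voxel_centers), 255, dtype=np.uint8)

    # World -> camera coordinates.
    world2cam = np.linalg.inv(cam_pose)
    pts = voxel_centers @ world2cam[:3, :3].T + world2cam[:3, 3]
    z = pts[:, 2]
    z_safe = np.where(z > 0, z, 1.0)  # avoid dividing by zero behind camera

    # Perspective projection to pixel coordinates.
    uv = pts @ cam_intr.T
    u = np.round(uv[:, 0] / z_safe).astype(int)
    v = np.round(uv[:, 1] / z_safe).astype(int)

    valid = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    # Occlusion check: only label voxels near the observed surface.
    d = depth_image[np.clip(v, 0, H - 1), np.clip(u, 0, W - 1)]
    valid &= np.abs(d - z) < depth_tol

    labels[valid] = seg_image[v[valid], u[valid]]
    return labels
```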


JJY0710 commented Jul 2, 2024

> To incorporate depth information into a 3D scene, similar to the approach used in BundleFusion, start by integrating the depth data. After this integration, convert the scene into a voxelized format. Assign semantic labels to each voxel using the semantic labels derived from the corresponding images.

OK, thank you very much for your answer. I will try the method you mentioned.

@xyIsHere

> To incorporate depth information into a 3D scene, similar to the approach used in BundleFusion, start by integrating the depth data. After this integration, convert the scene into a voxelized format. Assign semantic labels to each voxel using the semantic labels derived from the corresponding images.

I learned a lot from your response. By the way, could you share more detail about converting the scene into a voxelized format?

@anhquancao
Collaborator

Hi @xyIsHere, I suggest using this repository: https://github.com/andyzeng/tsdf-fusion-python. It converts a set of depth and RGB images into a voxelized TSDF scene.
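For intuition, the core of TSDF fusion is a truncated signed-distance update with a running weighted average per voxel. This toy 1D sketch (my own simplification, not the linked repository's API) shows that update along a single camera ray:

```python
import numpy as np

def integrate_frame(tsdf, weight, voxel_z, depth, trunc=0.5):
    """One TSDF update along a single camera ray (toy 1D sketch).

    tsdf, weight: running per-voxel TSDF values and integration weights
    voxel_z:      depth of each voxel along the ray
    depth:        observed surface depth for this frame
    """
    # Signed distance to the observed surface, truncated and normalized.
    sdf = np.clip(depth - voxel_z, -trunc, trunc) / trunc
    # Update only voxels in front of, or just behind, the surface.
    upd = (depth - voxel_z) > -trunc
    new_weight = weight + upd
    # Weighted running average of old TSDF and the new observation.
    tsdf = np.where(upd, (tsdf * weight + sdf) / np.maximum(new_weight, 1), tsdf)
    return tsdf, new_weight
```

Real pipelines (the linked repo included) apply this per voxel over whole depth frames and typically extract a mesh from the fused volume with marching cubes.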
