IDEA-Research / Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Home Page: https://arxiv.org/abs/2401.14159

on Linux: NameError: name '_C' is not defined

artificialzjy opened this issue

I followed the instructions in the README, but when I try the first demo I get the error: Failed to load custom C++ ops. Running on CPU mode Only! followed by NameError: name '_C' is not defined.

I installed torch, torchaudio, and torchvision with pip.

I'm getting a similar error. I believe it has to do with the CUDA version, but I'm not sure how to solve it. I'm running CUDA 12.2.

I upgraded torch to version 2.2.1, which resolved the issue. I hope this helps you all.

I am getting the same error. I installed using Docker within an Ubuntu 20.04 WSL2 distribution.

nvcc --version returns release 11.6 (CUDA version)

This error is caused by a failure to load the custom C++ ops required by the GroundingDINO model. The warning "Failed to load custom C++ ops. Running on CPU mode Only!" indicates that the compiled C++ extension was not found or could not be loaded.
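
If it is unclear why the extension fails to load, the following sketch (my own check, not part of the repo) reproduces the import that GroundingDINO wraps in a try/except, so you can see the underlying ImportError instead of only the warning:

# Minimal diagnostic sketch: run the import that GroundingDINO catches,
# and print the build-time CUDA information PyTorch exposes.
import torch
from torch.utils.cpp_extension import CUDA_HOME

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("CUDA_HOME:", CUDA_HOME)  # None usually means no CUDA toolkit was found when building

try:
    from groundingdino import _C  # noqa: F401  # the compiled C++/CUDA ops
    print("custom C++ ops loaded")
except Exception as exc:
    print("custom C++ ops missing:", exc)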

There are some additional requirements and modules needed by GroundingDINO. These can be installed using the requirements.txt and setup.py files inside the GroundingDINO directory. Navigate to it and run:

pip install -r requirements.txt
python setup.py install

All necessary prerequisites should now be installed. Navigate back to the parent directory and try running the demo again.
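
Since the comments above point at CUDA version mismatches, it may also help to confirm, before rebuilding, that the toolkit setup.py will use matches the CUDA build of PyTorch. A small sketch of my own (paths assumed from a standard install):

# Sketch: compare the CUDA version PyTorch was built for with the toolkit
# that will be used to compile the custom ops.
import subprocess
import torch
from torch.utils.cpp_extension import CUDA_HOME

print("PyTorch built for CUDA:", torch.version.cuda)
print("CUDA_HOME used for compilation:", CUDA_HOME)
if CUDA_HOME:
    out = subprocess.run([f"{CUDA_HOME}/bin/nvcc", "--version"],
                         capture_output=True, text=True).stdout
    release = next((line for line in out.splitlines() if "release" in line), out)
    print("nvcc:", release.strip())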

As a quick fix (or an alternative solution that goes in a different direction), which I came up with because I had to make this work in an environment where I could not install the CUDA toolkit: you can change the GroundingDINO ms_deform_attn.py code to use multi_scale_deformable_attn_pytorch.

The Grounding DINO code should work fine after that, but I have to admit that I did not test it thoroughly, and I am not sure whether the pure PyTorch implementation has any runtime disadvantages on a GPU compared to their CUDA implementation. To be more precise:

  1. In Grounded-Segment-Anything/GroundingDINO/groundingdino/models/GroundingDINO/ms_deform_attn.py, delete the code on lines 28 to 30, i.e.,
try:
    from groundingdino import _C
except:
    warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
  2. Delete the MultiScaleDeformableAttnFunction implementation on lines 41 to 90, i.e.,
class MultiScaleDeformableAttnFunction(Function):
    @staticmethod
    def forward(
        ctx,
        value,
        value_spatial_shapes,
        value_level_start_index,
        sampling_locations,
        attention_weights,
        im2col_step,
    ):
        ctx.im2col_step = im2col_step
        output = _C.ms_deform_attn_forward(
            value,
            value_spatial_shapes,
            value_level_start_index,
            sampling_locations,
            attention_weights,
            ctx.im2col_step,
        )
        ctx.save_for_backward(
            value,
            value_spatial_shapes,
            value_level_start_index,
            sampling_locations,
            attention_weights,
        )
        return output

    @staticmethod
    @once_differentiable
    def backward(ctx, grad_output):
        (
            value,
            value_spatial_shapes,
            value_level_start_index,
            sampling_locations,
            attention_weights,
        ) = ctx.saved_tensors
        grad_value, grad_sampling_loc, grad_attn_weight = _C.ms_deform_attn_backward(
            value,
            value_spatial_shapes,
            value_level_start_index,
            sampling_locations,
            attention_weights,
            grad_output,
            ctx.im2col_step,
        )

        return grad_value, None, None, grad_sampling_loc, grad_attn_weight, None
  3. Change the forward function of the MultiScaleDeformableAttention module to use the PyTorch implementation, i.e., change lines 329 to 352 from
if torch.cuda.is_available() and value.is_cuda:
    halffloat = False
    if value.dtype == torch.float16:
        halffloat = True
        value = value.float()
        sampling_locations = sampling_locations.float()
        attention_weights = attention_weights.float()

    output = MultiScaleDeformableAttnFunction.apply(
        value,
        spatial_shapes,
        level_start_index,
        sampling_locations,
        attention_weights,
        self.im2col_step,
    )

    if halffloat:
        output = output.half()
else:
    output = multi_scale_deformable_attn_pytorch(
        value, spatial_shapes, sampling_locations, attention_weights
    )

to

output = multi_scale_deformable_attn_pytorch(
    value, spatial_shapes, sampling_locations, attention_weights
)

Note that you will probably need to run pip uninstall and then reinstall the library after making these changes.
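
If editing the installed files is not an option either, roughly the same effect can be approximated at runtime by shadowing the autograd function with a thin wrapper around the pure-PyTorch kernel. This is only an inference-oriented sketch based on the excerpt above (it assumes multi_scale_deformable_attn_pytorch lives in the same module, as the quoted forward code suggests), not something I have benchmarked:

# Sketch: force the pure-PyTorch deformable-attention path without editing
# ms_deform_attn.py. Inference only; no custom backward is provided.
import groundingdino.models.GroundingDINO.ms_deform_attn as msda


class _TorchOnlyMSDeformAttnFunction:
    @staticmethod
    def apply(value, value_spatial_shapes, value_level_start_index,
              sampling_locations, attention_weights, im2col_step):
        # value_level_start_index and im2col_step are only needed by the CUDA kernel.
        return msda.multi_scale_deformable_attn_pytorch(
            value, value_spatial_shapes, sampling_locations, attention_weights
        )


# forward() resolves the name in the module namespace, so swapping it here is enough.
msda.MultiScaleDeformableAttnFunction = _TorchOnlyMSDeformAttnFunction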