hellock / cvbase

Utils for computer vision research.

Home Page: http://cvbase.readthedocs.io/en/latest/

image warp function

lxtGH opened this issue

Hi! Thanks for your code.
I want to warp an image (or feature map) to the next frame using optical flow. How can I do that with your code?

cvbase is a library that is independent of deep learning frameworks and provides some commonly used utilities. To implement the warp function, you may need to rely on a particular framework. Here is a PyTorch example.

# updated on 22/08/2018 for pytorch>=0.4.0
import torch
import torch.nn.functional as F

def flow_warp(x, flow, padding_mode='zeros'):
    """Warp an image or feature map with optical flow
    Args:
        x (Tensor): size (n, c, h, w)
        flow (Tensor): size (n, 2, h, w), values range from -1 to 1 (relative to image width or height)
        padding_mode (str): 'zeros' or 'border'

    Returns:
        Tensor: warped image or feature map
    """
    assert x.size()[-2:] == flow.size()[-2:]
    n, _, h, w = x.size()
    x_ = torch.arange(w).view(1, -1).expand(h, -1)
    y_ = torch.arange(h).view(-1, 1).expand(-1, w)
    grid = torch.stack([x_, y_], dim=0).float().cuda()
    # repeat() copies the grid for each batch item; expand() would share one buffer across the batch
    grid = grid.unsqueeze(0).repeat(n, 1, 1, 1)
    grid[:, 0, :, :] = 2 * grid[:, 0, :, :] / (w - 1) - 1
    grid[:, 1, :, :] = 2 * grid[:, 1, :, :] / (h - 1) - 1
    grid += 2 * flow
    grid = grid.permute(0, 2, 3, 1)
    return F.grid_sample(x, grid, padding_mode=padding_mode)

# pytorch 0.3
import torch
import torch.nn.functional as F
from torch.autograd import Variable

def flow_warp(x, flow, padding_mode='zeros'):
    """Warp an image or feature map with optical flow
    Args:
        x (Variable): size (n, c, h, w)
        flow (Variable): size (n, 2, h, w), values range from -1 to 1 (relative to image width or height)
        padding_mode (str): 'zeros' or 'border'

    Returns:
        Variable: warped image or feature map
    """
    assert x.size()[-2:] == flow.size()[-2:]
    n, _, h, w = x.size()
    x_ = torch.arange(w).view(1, -1).expand(h, -1)
    y_ = torch.arange(h).view(-1, 1).expand(-1, w)
    grid = torch.stack([x_, y_], dim=0).float().cuda()
    # repeat() copies the grid for each batch item; expand() would share one buffer across the batch
    grid = grid.unsqueeze(0).repeat(n, 1, 1, 1)
    grid[:, 0, :, :] = 2 * grid[:, 0, :, :] / (w - 1) - 1
    grid[:, 1, :, :] = 2 * grid[:, 1, :, :] / (h - 1) - 1
    grid = Variable(grid)
    grid += 2 * flow
    grid = grid.permute(0, 2, 3, 1)
    return F.grid_sample(x, grid, padding_mode=padding_mode)

Thanks for your reply, but I found that the dimensions of x and grid don't match what F.grid_sample expects as input. Is x the input feature or the image?

x can be either the image or input feature, as long as the shape of x is (n, c, h, w).

Hi @lxtGH, I found a typo in my examples: x and y should be x_ and y_ to avoid a name conflict.

Thanks a lot for your code, @hellock. I think changing "values range from 0 to 1" to "values range from -1 to 1" would be better. I added the width and then divided by 2*width to force the flow strictly into the range 0 to 1, but that is wrong. All you actually have to do is divide by the width.

Thanks for the code, @hellock. I think "values range from 0 to 1" should be changed to "values range from -1 to 1", because that is what the function requires: https://pytorch.org/docs/0.3.1/nn.html#grid-sample. There is also a small issue: should you bound the value of flow before "grid += 2 * flow"?

@PK15946 @CJEQ Thanks for your comments. The range of flow values should be [-1, 1], and I will update the code snippet.
@CJEQ Do you mean clipping the flow values in case of NaN?

@hellock Yes. grid += 2 * flow may push the values out of bounds. So I think it would be better to apply grid += 2 * flow before grid[:, 0, :, :] = 2 * grid[:, 0, :, :] / (w - 1) - 1 and grid[:, 1, :, :] = 2 * grid[:, 1, :, :] / (h - 1) - 1. Does this make sense?

@CJEQ The warped coordinates do not need to be restricted to [-1, 1], and it is common for them to exceed the image boundary. The padding_mode argument of grid_sample handles such cases. The flow values here are relative to the height/width of the image or feature map, so grid += 2 * flow cannot be moved before the uniform grid is generated.
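
A small illustration of this point, assuming a recent PyTorch version in which grid_sample accepts the align_corners argument. It samples one location outside [-1, 1] and shows that it is not an error: padding_mode decides what value is read there. The tensor shapes and values are chosen only for the example.

```python
import torch
import torch.nn.functional as F

x = torch.ones(1, 1, 2, 2)  # (n, c, h, w), all ones

# One sample location, far to the right of the image (x = 3.0 is outside [-1, 1]).
grid = torch.tensor([[[[3.0, 0.0]]]])  # (n, h_out, w_out, 2), last dim is (x, y)

out_zeros = F.grid_sample(x, grid, padding_mode="zeros", align_corners=True)
out_border = F.grid_sample(x, grid, padding_mode="border", align_corners=True)

print(out_zeros.item())   # 0.0: out-of-range locations read as zero
print(out_border.item())  # 1.0: out-of-range locations read the nearest border pixel
```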

@hellock There may be a bug in the code:
grid = grid.unsqueeze(0).expand(n, -1, -1, -1)
grid[:, 0, :, :] = 2 * grid[:, 0, :, :] / (w - 1) - 1
grid[:, 1, :, :] = 2 * grid[:, 1, :, :] / (h - 1) - 1
grid += 2 * flow

Since expand does not allocate new memory, when the batch size (i.e. n) is greater than 1, grid accumulates 2 * flow n times, once for each item in the batch, which is incorrect. I think it would be better to use another variable, e.g. grid_x = grid + 2 * flow, or to compute flow = flow * 2 + grid and then call grid_sample with flow, or to use repeat instead of expand.
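
The out-of-place variant suggested here could look like the following sketch. It is not the cvbase implementation. The function name flow_warp_out_of_place is hypothetical, the grid is built on the device of x instead of calling .cuda(), and align_corners=True is passed on the assumption that a recent PyTorch version is used, since that setting matches a base grid that maps pixel indices 0..w-1 onto [-1, 1].

```python
import torch
import torch.nn.functional as F


def flow_warp_out_of_place(x, flow, padding_mode="zeros"):
    """Warp x of size (n, c, h, w) with a normalized flow of size (n, 2, h, w)."""
    assert x.size()[-2:] == flow.size()[-2:]
    _, _, h, w = x.size()
    # linspace(-1, 1, w)[i] equals 2 * i / (w - 1) - 1, i.e. the normalized base grid.
    xs = torch.linspace(-1.0, 1.0, w, device=x.device, dtype=x.dtype)
    ys = torch.linspace(-1.0, 1.0, h, device=x.device, dtype=x.dtype)
    x_ = xs.view(1, -1).expand(h, -1)
    y_ = ys.view(-1, 1).expand(-1, w)
    base = torch.stack([x_, y_], dim=0).unsqueeze(0)  # (1, 2, h, w), never written to
    # Broadcasting allocates a fresh (n, 2, h, w) tensor, so batch items stay independent.
    grid = base + 2 * flow
    grid = grid.permute(0, 2, 3, 1)  # (n, h, w, 2) as expected by grid_sample
    return F.grid_sample(x, grid, padding_mode=padding_mode, align_corners=True)


# Usage: a zero flow should leave every batch item unchanged.
x = torch.arange(24, dtype=torch.float32).view(2, 1, 3, 4)
flow = torch.zeros(2, 2, 3, 4)
warped = flow_warp_out_of_place(x, flow)
```

Because the base grid is only read and never modified, a flow applied to one batch item cannot leak into another.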

Hi @hellock, thanks for sharing your code. You point out that the input flow should range from -1 to 1 (relative to image width or height). Does that mean I should divide the original flow by the width and height? (In my case the flow is bounded within ±20.) And can you tell me why you multiply the flow by 2 (grid += 2 * flow)? Thanks.
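
A minimal sketch of the unit conversion asked about here, assuming the flow is given in pixels with channel 0 as the horizontal and channel 1 as the vertical displacement. The helper name normalize_flow is hypothetical and not part of cvbase. It divides by (w - 1) and (h - 1) rather than by w and h, so that it matches the base grid in the snippet above, which maps pixel indices 0..w-1 onto [-1, 1]. For large images the difference is small.

```python
import torch


def normalize_flow(flow_px):
    """Convert a pixel-unit flow (n, 2, h, w) into fractions of the image size.

    Assumes channel 0 holds x displacements and channel 1 holds y displacements.
    """
    _, _, h, w = flow_px.size()
    scale = torch.tensor([w - 1, h - 1], dtype=flow_px.dtype, device=flow_px.device)
    # clamp(min=1) avoids dividing by zero for images that are one pixel wide or high.
    return flow_px / scale.view(1, 2, 1, 1).clamp(min=1)


# Usage: a 20-pixel displacement on an image of height 5 and width 11.
flow_px = torch.full((1, 2, 5, 11), 20.0)
flow_norm = normalize_flow(flow_px)
```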

Hi! I was using this function for warping batches of images and noticed that with a batch size greater than 1, there seems to be bleeding of information across the warped images (the flows seem to "contaminate" other images in the batch, so that each warp does not only affect its corresponding one of the n images). I was wondering if someone could look into this issue or point to what could be causing it? Thanks!

@juliagong You may refer to my last comment.

Thanks, @mikirui! I ended up making my way to this conclusion before seeing your earlier comment, which was indeed exactly the problem. Thanks for pointing it out. Hopefully, others with the same issue will find it faster!
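
The memory sharing behind this cross-batch contamination can be seen on a tiny tensor. The following is only an illustrative sketch and not code from cvbase. The variable names are made up for the example.

```python
import torch

base = torch.zeros(1, 3)
view = base.expand(2, -1)  # no copy: both rows alias the same storage
copy = base.repeat(2, 1)   # real copy: each row has its own storage

print(view.stride())  # batch stride is 0, so the rows overlap in memory
print(copy.stride())

# Writing to the single underlying row shows up in every row of the view,
# but not in the independent copy.
base[0, 0] = 5.0
print(view)
print(copy)
```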

@CJEQ "Yes. grid += 2 * flow may cause the value out of bound. So I think it would be better to set grid += 2 * flow before grid[:, 0, :, :] = 2 * grid[:, 0, :, :] / (w - 1) - 1, grid[:, 1, :, :] = 2 * grid[:, 1, :, :] / (h - 1) - 1. Is this make sense?"
Yes, you are right. grid += 2 * flow should come before the grid normalization; otherwise it does not give the correct warped output.

@MSLAwan Could you tell me why we need to use grid += 2 * flow instead of grid += flow? I am very confused.
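
One way to see where the factor of 2 comes from, assuming the flow is expressed as a fraction of the image size as in the docstring above. The snippet maps pixel column i to a normalized coordinate g(i), so the w - 1 pixel steps span an interval of length 2. The symbols i, d, g and f are introduced here only for illustration: d is a displacement in pixels and f is the same displacement as a fraction of w - 1.

```latex
g(i) = \frac{2i}{w-1} - 1,
\qquad
g(i + d) = \frac{2(i + d)}{w-1} - 1
         = g(i) + 2\,\frac{d}{w-1}
         = g(i) + 2f,
\qquad
f = \frac{d}{w-1}.
```

The same identity suggests that adding a pixel-unit displacement d before the normalization is equivalent to adding 2f after it. The two orderings discussed in this thread then differ only in the units in which the flow is given.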