karpathy / neuraltalk

NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences.

Question about usage of RCNN

jazzsaxmafia opened this issue · comments

Hello, I recently read your paper, and I very much appreciate you sharing your code here.

By the way, your paper indicates that you first extract the top regions detected by RCNN and then compute CNN features for them; however, I do not see that object detection part in your implementation. It does not seem to use object detection in either the training or the test phase. Is that because it still works fine using the whole image?

Thank you.

Yeah, I didn't fold this code into the NeuralTalk code release. I only took a small chunk of my paper, the part that predicts a sequence of words for a block of pixels (the whole image here).

commented

@karpathy Hi Andrej, Thank you very much for sharing the code. Do you plan to share the code for the object detection part sometime in the future?

Thanks

commented

@jazzsaxmafia I have the same question as you. Have you figured out how to detect objects in images in order to generate a set of h-dimensional representations for every image?

commented

@mlguy I have the same question as you. Have you figured out how to detect objects in images in order to generate a set of h-dimensional representations for every image?

commented

@karpathy Hi Andrej, thank you very much for sharing the code. Could you share the object detection code so that I can generate a set of h-dimensional representations for every image?

Thanks
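
For readers with the same question, the region pipeline described in the thread (take the top detections, compute a feature vector for each region, map each one to h dimensions) might be sketched roughly as below. This is not the author's code, which was not released with NeuralTalk. The boxes and scores are assumed to come from some external detector such as RCNN, `placeholder_cnn_features` is a hypothetical stand-in for a real CNN, the projection `W`, `b` is random rather than learned, and including the whole image as an extra region is an assumption.

```python
import numpy as np

def top_k_boxes(boxes, scores, k):
    """Return the k highest-scoring boxes, best first."""
    order = np.argsort(-scores)[:k]
    return boxes[order]

def placeholder_cnn_features(crop):
    """Hypothetical stand-in for a CNN: samples a 4x4 grid of the
    grayscale crop so that every crop maps to a 16-dim vector."""
    gray = crop.mean(axis=2)
    rows = np.linspace(0, gray.shape[0] - 1, 4).astype(int)
    cols = np.linspace(0, gray.shape[1] - 1, 4).astype(int)
    return gray[np.ix_(rows, cols)].ravel()

def region_representations(image, boxes, scores, W, b, k):
    """Return a (k + 1, h) array: one h-dim vector for the whole image
    (an assumption) followed by one per top-k detected region."""
    height, width = image.shape[:2]
    regions = [(0, 0, width, height)]
    regions += [tuple(box) for box in top_k_boxes(boxes, scores, k)]
    feats = np.stack([
        placeholder_cnn_features(image[y0:y1, x0:x1])
        for (x0, y0, x1, y1) in regions
    ])
    # Linear projection from feature space to the h-dim embedding space.
    return feats @ W.T + b

# Toy inputs; in practice the boxes and scores come from a detector.
rng = np.random.default_rng(0)
image = rng.random((60, 80, 3))              # height x width x channels
boxes = np.array([[0, 0, 40, 30],            # (x0, y0, x1, y1)
                  [10, 10, 70, 50],
                  [20, 5, 60, 55]])
scores = np.array([0.2, 0.9, 0.5])
h = 8
W = rng.standard_normal((h, 16))             # random, not learned
b = np.zeros(h)

reps = region_representations(image, boxes, scores, W, b, k=2)
print(reps.shape)  # (3, 8)
```

To use real features, replace `placeholder_cnn_features` with a forward pass through a pretrained CNN and learn `W` and `b` jointly with the language model.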