- Project: Scene Graph Generator
- Table of Contents
- About The Project
- Implementation details
- Research papers
- Domains Explored
- Project Workflow
- Future Work
- Courses Referred
- Contributors
- Acknowledgements and Resources
This project detects the objects in an image and the relations between them, then generates a graph (a scene graph) in which the nodes are objects and the edges are the relations between them.
- Using YOLO to output the class and bounding box coordinates of each object, which gives its location in the image.
- Creating a 2D array with one row per detected object, where each row holds the word2vec embedding of the object's class followed by its four bounding box coordinates.
- Passing the input array through two neural networks with identical architecture to obtain the key and query matrices.
- Multiplying the key and query matrices to generate an attention matrix.
- For every element of the attention matrix with value 1, mapping its row and column indices to the corresponding object names, which gives the pairs of objects that have a relation between them.
- We then apply non-maximal suppression (NMS) to filter out object pairs that overlap significantly with others. Each relationship has a pair of bounding boxes, and the order of the pair matters. We compute the overlap between two object pairs {u,v} and {p,q} as IoU({u,v},{p,q}) = (I(u,p) + I(v,q)) / (U(u,p) + U(v,q)), where the operator I computes the intersection area between two boxes and U the union area. The remaining m object pairs are considered candidates with meaningful relationships E. With E, we obtain a graph G = (V,E), which is much sparser than the original fully connected graph. Along with the edges proposed for the graph, we get the visual representations X^r = {x^r_1, ..., x^r_m} for all m relationships by extracting features from the union box of each object pair.
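
The pair-level NMS step above can be sketched in plain Python as follows. This is a minimal illustration, not the project's implementation: the helper names (`pair_iou`, `pair_nms`), the `(x1, y1, x2, y2)` box format, and the 0.7 threshold are assumptions made for the example.

```python
def area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes have area 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection(a, b):
    """Intersection area I(a, b) of two boxes."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return area(overlap)


def union(a, b):
    """Union area U(a, b) of two boxes."""
    return area(a) + area(b) - intersection(a, b)


def pair_iou(pair_a, pair_b):
    """Overlap between ordered object pairs {u, v} and {p, q}.

    Subjects are compared with subjects and objects with objects,
    so (u, v) and (v, u) are treated as different relationships.
    """
    (u, v), (p, q) = pair_a, pair_b
    inter = intersection(u, p) + intersection(v, q)
    uni = union(u, p) + union(v, q)
    return inter / uni if uni > 0 else 0.0


def pair_nms(pairs, scores, iou_threshold=0.7):
    """Greedy NMS over object pairs; returns indices of the pairs to keep.

    The threshold value is an assumption for this sketch.
    """
    order = sorted(range(len(pairs)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a pair only if it does not overlap too much with a better-scoring kept pair.
        if all(pair_iou(pairs[i], pairs[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Here `pairs` would hold the (subject box, object box) tuples for the candidate relations and `scores` their relatedness scores from the attention matrix; the surviving indices form the edge set E.
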
Artificial Intelligence, Deep Learning, Neural Networks, Python, and libraries such as PyTorch, TensorFlow, NumPy, and Pandas.
- Learning the basics of deep learning.
- Learning about Convolutional Neural Networks and object detection algorithms.
- Learning about YOLO (You Only Look Once) and implementing it.
- Learning about RNNs and attention models.
- Researching the RePN (Relationship Proposal Network).
- Implementing the RePN from scratch using PyTorch.
- Training the RePN on Visual Genome.
- Connecting the RePN to the YOLO object detection framework to obtain information about the detected objects.
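
The step that connects the detector output to the RePN can be sketched roughly as below. The project itself uses PyTorch; NumPy is used here only to keep the example small. Everything in it is an assumption for illustration: the random vectors stand in for real word2vec embeddings, the weights are untrained, and the dimensions, the normalized box coordinates, the 0.5 threshold, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 8    # assumption: real word2vec vectors are much larger
HIDDEN_DIM = 16  # assumption: hidden width of each projection network
OUT_DIM = 16     # assumption: key/query dimension

# Placeholder for a word2vec lookup: one random vector per class name.
CLASS_EMBEDDINGS = {
    name: rng.normal(size=EMBED_DIM) for name in ("person", "horse", "hat")
}


def build_features(detections):
    """Stack [class embedding | x1, y1, x2, y2] into an (n, EMBED_DIM + 4) array."""
    rows = [
        np.concatenate([CLASS_EMBEDDINGS[name], np.asarray(box, dtype=float)])
        for name, box in detections
    ]
    return np.stack(rows)


def make_mlp(in_dim, hidden_dim, out_dim):
    """Two-layer perceptron with random (untrained) weights."""
    w1 = rng.normal(scale=0.1, size=(in_dim, hidden_dim))
    w2 = rng.normal(scale=0.1, size=(hidden_dim, out_dim))
    return lambda x: np.maximum(x @ w1, 0.0) @ w2


def relatedness(features, key_net, query_net):
    """Attention matrix: sigmoid of key . query for every ordered object pair."""
    keys = key_net(features)
    queries = query_net(features)
    return 1.0 / (1.0 + np.exp(-(keys @ queries.T)))


def related_pairs(detections, attention, threshold=0.5):
    """Map attention entries at or above the threshold back to object names."""
    names = [name for name, _ in detections]
    n = len(names)
    return [
        (names[i], names[j])
        for i in range(n)
        for j in range(n)
        if i != j and attention[i, j] >= threshold
    ]


# Detections in the shape a YOLO wrapper might return: (class name, box).
detections = [
    ("person", (0.10, 0.10, 0.50, 0.90)),
    ("horse", (0.30, 0.40, 0.90, 0.95)),
    ("hat", (0.20, 0.05, 0.35, 0.15)),
]
features = build_features(detections)
# Same architecture for both networks, but separate weights.
key_net = make_mlp(features.shape[1], HIDDEN_DIM, OUT_DIM)
query_net = make_mlp(features.shape[1], HIDDEN_DIM, OUT_DIM)
attention = relatedness(features, key_net, query_net)
print(related_pairs(detections, attention))
```

Because the weights are random, the printed pairs are not meaningful; after training on Visual Genome, the thresholded attention matrix is what would identify the related object pairs.
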
- Implementing an Attention Graph Convolutional Neural Network to display image descriptions as proper sentences.
- Upgrade to a newer YOLO version (v7 or v8) to improve object detection.
- Achieve higher model accuracy.
- [SRA VJTI](https://sravjti.in/) Eklavya 2023
- Special thanks to our mentors Prit Kanadiya and Raghav Agarwal
- Darknet for YOLO object detection