Venkata Satya Sai Ajay Daliparthi (dvssajay)

Company: Blekinge Institute of Technology

Location: Sweden

Venkata Satya Sai Ajay Daliparthi's repositories

PDFNet-Pointwise-Dense-Flow-Network-for-Urban-Scene-Segmentation

Using a deep convolutional neural network (CNN) as a feature encoder (or backbone) is the most common architectural pattern across computer vision methods, and semantic segmentation is no exception. This pattern has two major drawbacks: (i) the networks often fail to capture small classes such as wall, fence, pole, traffic light, traffic sign, and bicycle, which are crucial for autonomous vehicles to make accurate decisions; and (ii) because of their arbitrarily increasing depth, the networks require massive amounts of labeled data to converge and additional regularization techniques to reduce the risk of over-fitting. While regularization techniques come at minimal cost, collecting labeled data is an expensive and laborious process. In this work, we address these two drawbacks by proposing a novel lightweight architecture named point-wise dense flow network (PDFNet). In PDFNet, we employ dense, residual, and multiple shortcut connections to allow a smooth gradient flow to all parts of the network. Extensive experiments on the Cityscapes and CamVid benchmarks demonstrate that our method significantly outperforms baselines in capturing small classes and in few-data regimes. Moreover, our method performs well at classifying samples from outside the training distribution, evaluated by training on Cityscapes and testing on the KITTI dataset.

Language: Python | License: GPL-3.0 | Stargazers: 4 | Issues: 1 | Issues: 0

The-Ikshana-Hypothesis-of-Human-Scene-Understanding

In recent years, deep neural networks (DNNs) have achieved state-of-the-art performance on many computer vision tasks. However, one typical drawback of these DNNs is their need for massive amounts of labeled data. While few-shot learning methods have addressed this problem through metric-learning and meta-learning techniques, in this work we address it from a neuroscience perspective. We propose a theory named Ikshana to explain how the human brain functions when a person understands an image. Following the Ikshana theory, we propose a novel neural-inspired CNN architecture named IkshanaNet for semantic segmentation. The empirical results demonstrate the effectiveness of our method on few data samples, outperforming several baselines on the Cityscapes and CamVid benchmarks.

Language: Python | License: GPL-3.0 | Stargazers: 2 | Issues: 2 | Issues: 0

Comparing-the-Performance-effect-of-Various-CNN-Architectures-on-Image-Captioning

In our study, we compared the performance effect of four different CNN architectures, VGG16, DenseNet121, MobileNet, and ResNet50 (pre-trained on the ImageNet dataset), as the encoder, while using an LSTM as the decoder for image captioning. The Bilingual Evaluation Understudy (BLEU) metric is used to evaluate the captions generated by the four models. We evaluated the models on the Flickr8K dataset. The mean BLEU scores for the models are VGG16 (0.015), DenseNet121 (0.010), MobileNet (0.013), and ResNet50 (0.024). Experimental results show that using ResNet50 as the CNN encoder gives the highest mean BLEU score, well above those of more recent state-of-the-art image classification networks such as DenseNet121 and MobileNet.

Language: Jupyter Notebook | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 0
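
For readers unfamiliar with the BLEU metric mentioned in the description above, the following is a minimal sketch of unsmoothed sentence-level BLEU and its mean over a set of captions. It is an illustration only, not the repository's evaluation code: the function names `ngrams`, `sentence_bleu`, and `mean_bleu`, the uniform n-gram weights, the default `max_n=4`, and the absence of smoothing are all assumptions, and the study may have used a library implementation with different settings.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights (assumed setup)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref_counts = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        clipped = sum(min(count, max_ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or clipped == 0:
            # Without smoothing, a single zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty, using the reference length closest to the candidate length
    # (ties resolved in favor of the shorter reference).
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


def mean_bleu(pairs, max_n=4):
    """Average sentence-level BLEU over (candidate, references) pairs."""
    return sum(sentence_bleu(c, refs, max_n) for c, refs in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical caption and reference, made up for illustration.
    generated = "the cat sat on the mat".split()
    reference = ["the cat sat on a mat".split()]
    print(sentence_bleu(generated, reference, max_n=2))
```

Because this sketch applies no smoothing, any caption that shares no 4-gram with its references scores zero, which is one reason mean sentence-level BLEU scores on short captions can be small.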

Semantic-Segmentation-of-Urban-Scene-Images-Using-Recurrent-Neural-Networks

This study investigates the performance effect of using recurrent neural networks (RNNs) for semantic segmentation of urban scene images, with the aim of generating a semantic output map with refined edges. We proposed three deep neural network architectures that use RNNs and evaluated them on the Cityscapes dataset. All three proposed architectures outperformed the baseline and showed improvement in classifying edges. Additionally, we presented a new method for adding an RNN to any existing semantic segmentation network that makes use of skip connections. PyTorch was the framework selected for conducting this study.

Language: Python | License: MIT | Stargazers: 1 | Issues: 1 | Issues: 1
Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

ViSDM-Vision-Sovereignty-Data-Marketplace

A Decentralized Platform for Crowdsourcing Data Collection and Trading

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: HTML | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Solidity | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0
License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0