chwlsunny's repositories

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Python, License: BSD-3-Clause, Stargazers: 0, Issues: 0, Issues: 0

bottom-up-attention

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

Language: Jupyter Notebook, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

Computer_Vision_primer

An introduction to computer vision

Stargazers: 0, Issues: 0, Issues: 0

CPlusPlusThings

All about C++

Stargazers: 0, Issues: 0, Issues: 0

d2-net

D2-Net: A Trainable CNN for Joint Description and Detection of Local Features

License: NOASSERTION, Stargazers: 0, Issues: 0, Issues: 0

DeepLearning-500-questions

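500 Questions on Deep Learning: a question-and-answer treatment of frequently asked topics in probability, linear algebra, machine learning, deep learning, computer vision, and related areas, intended to help the author and any readers who need it. The book has 18 chapters and more than 500,000 characters. Given the author's limited expertise, readers are kindly asked to point out any shortcomings. To be continued... For collaboration, contact scutjy2015@163.com. All rights reserved; infringement will be prosecuted. Tan 2018.06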

License: GPL-3.0, Stargazers: 0, Issues: 0, Issues: 0

deeplearningbook-chinese

Deep Learning Book Chinese Translation

Stargazers: 0, Issues: 0, Issues: 0

DF-GAN

Deep Fusion Generative Adversarial Networks for Text-to-Image Synthesis

Stargazers: 0, Issues: 0, Issues: 0

MatchZoo-py

Facilitating the design, comparison and sharing of deep text matching models.

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

mcan-vqa

Deep Modular Co-Attention Networks for Visual Question Answering

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

Multi-Source-Sound-Localization

This repo aims to perform sound localization in complex audiovisual scenes, where multiple objects are making sounds.

Stargazers: 0, Issues: 0, Issues: 0

nvim-config

My custom Neovim configuration with full battery for Python, Markdown, LaTeX and more...

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

openvqa

A lightweight, scalable, and general framework for visual question answering (VQA) research

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

pytorch-cnn-visualizations

PyTorch implementation of convolutional neural network visualization techniques

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

PyTorch-GAN

PyTorch implementations of Generative Adversarial Networks.

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

pytorch-grad-cam

PyTorch implementation of Grad-CAM

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

PyTorchTricks

Some PyTorch tricks... ⭐

Stargazers: 0, Issues: 0, Issues: 0

ResDAVEnet-VQ

Official code for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"

License: BSD-3-Clause, Stargazers: 0, Issues: 0, Issues: 0

ros_exploring

Source code for the book "ROS Robot Development in Practice" (《ROS机器人开发实践》)

Stargazers: 0, Issues: 0, Issues: 0

rubi.bootstrap.pytorch

RUBi: Reducing Unimodal Biases for Visual Question Answering

License: BSD-3-Clause, Stargazers: 0, Issues: 0, Issues: 0

Semantics-AssistedVideoCaptioning

Source code for Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling Strategy

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

show-control-and-tell

Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019

License: BSD-3-Clause, Stargazers: 0, Issues: 0, Issues: 0

speaksee

PyTorch library for Visual-Semantic tasks

License: BSD-3-Clause, Stargazers: 0, Issues: 0, Issues: 0

speech2image

Neural network implementation of a speech to image system. Networks are trained to embed images and corresponding captions to the same vector space.

Stargazers: 0, Issues: 0, Issues: 0

Up-Down-Captioner

Automatic image captioning model based on Caffe, using features from bottom-up attention.

License: MIT, Stargazers: 0, Issues: 0, Issues: 0
Language: Python, Stargazers: 0, Issues: 0, Issues: 0

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (40+ datasets).

Stargazers: 0, Issues: 0, Issues: 0

vqa_lol

Visual Reasoning :

Stargazers: 0, Issues: 0, Issues: 0

vse_infty

Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

VSRN

PyTorch code for ICCV'19 paper "Visual Semantic Reasoning for Image-Text Matching"

Stargazers: 0, Issues: 0, Issues: 0