
chwlsunny's repositories

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Python · License: BSD-3-Clause · 000

bottom-up-attention

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

Language: Jupyter Notebook · License: MIT · 000

Computer_Vision_primer

Introduction to Computer Vision

000

CPlusPlusThings

All Things C++

000

d2-net

D2-Net: A Trainable CNN for Joint Description and Detection of Local Features

License: NOASSERTION · 000

DeepLearning-500-questions

License: GPL-3.0 · 000

deeplearningbook-chinese

Deep Learning Book Chinese Translation

000

DF-GAN

Deep Fusion Generative Adversarial Networks for Text-to-Image Synthesis

000

MatchZoo-py

Facilitating the design, comparison and sharing of deep text matching models.

License: Apache-2.0 · 000

mcan-vqa

Deep Modular Co-Attention Networks for Visual Question Answering

License: Apache-2.0 · 000

Multi-Source-Sound-Localization

This repo aims to perform sound localization in complex audiovisual scenes, where multiple objects are making sounds.

000

nvim-config

My custom Neovim configuration with full support for Python, Markdown, LaTeX, and more...

License: MIT · 000

openvqa

A lightweight, scalable, and general framework for visual question answering (VQA) research

License: Apache-2.0 · 000

pytorch-cnn-visualizations

PyTorch implementation of convolutional neural network visualization techniques

License: MIT · 000

PyTorch-GAN

PyTorch implementations of Generative Adversarial Networks.

Language: Python · License: MIT · 010

pytorch-grad-cam

PyTorch implementation of Grad-CAM

Language: Python · License: MIT · 010

PyTorchTricks

Some PyTorch tricks...

000

ResDAVEnet-VQ

Official code for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"

License: BSD-3-Clause · 000

ros_exploring

Source code for the book "ROS Robot Development Practice" (《ROS机器人开发实践》)

000

rubi.bootstrap.pytorch

RUBi: Reducing Unimodal Biases for Visual Question Answering

License: BSD-3-Clause · 000

Semantics-AssistedVideoCaptioning

Source code for Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling Strategy

License: MIT · 000

show-control-and-tell

Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019

License: BSD-3-Clause · 000

speaksee

PyTorch library for Visual-Semantic tasks

License: BSD-3-Clause · 000

speech2image

Neural network implementation of a speech to image system. Networks are trained to embed images and corresponding captions to the same vector space.

000

Up-Down-Captioner

Automatic image captioning model based on Caffe, using features from bottom-up attention.

License: MIT · 000

VASE

Language: Python · 000

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (40+ datasets).

000

vqa_lol

Visual Reasoning :

000

vse_infty

Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021

License: MIT · 000

VSRN

PyTorch code for ICCV'19 paper "Visual Semantic Reasoning for Image-Text Matching"

000