Kumar Ashutosh (thechargedneutron)

Company: UT Austin

Location: Austin, TX

Home Page: thechargedneutron.github.io

Twitter: @chargedneutron_

Kumar Ashutosh's starred repositories

Language: Python | License: BSD-3-Clause | Stargazers: 1683 | Issues: 0

fiftyone

The open-source tool for building high-quality datasets and computer vision models

Language: Python | License: Apache-2.0 | Stargazers: 7915 | Issues: 0

Language: Python | License: MIT | Stargazers: 597 | Issues: 0

IDM-VTON

IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild

Language: Python | Stargazers: 3138 | Issues: 0

TaskGraph

Official code repository for "Video-Mined Task Graphs for Keystep Recognition in Instructional Videos" (arXiv, 2023)

Language: Python | License: NOASSERTION | Stargazers: 9 | Issues: 0

AStar

A 2D A Star (A*) pathfinding implementation in C# focused on ease of use

Language: C# | License: MIT | Stargazers: 126 | Issues: 0

basic-pitch

A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

Language: Python | License: Apache-2.0 | Stargazers: 3156 | Issues: 0

Multimodal-Graph-Script-Learning

Non-Sequential Graph Script Induction via Multimedia Grounding (ACL 2023)

Language: Python | License: MIT | Stargazers: 11 | Issues: 0

arxiv-latex-cleaner

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Language: Python | License: Apache-2.0 | Stargazers: 5037 | Issues: 0

VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

Language: Jupyter Notebook | License: MIT | Stargazers: 221 | Issues: 0

VALOR

Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset

Language: Python | License: MIT | Stargazers: 248 | Issues: 0

HierVL

[CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings

Language: Python | License: NOASSERTION | Stargazers: 42 | Issues: 0

CA-SUM

A PyTorch Implementation of CA-SUM from "Summarizing Videos using Concentrated Attention and Considering the Uniqueness and Diversity of the Video Frames", Proc. ACM ICMR 2022

Language: Python | License: NOASSERTION | Stargazers: 26 | Issues: 0

videojs-annotation-comments

A plugin for video.js to add support for timeline moment/range comments and annotations

Language: JavaScript | License: NOASSERTION | Stargazers: 167 | Issues: 0

untrunc

Restore a truncated mp4/mov. Improved version of ponchio/untrunc

Language: C++ | License: GPL-2.0 | Stargazers: 1907 | Issues: 0

Language: Python | License: MIT | Stargazers: 18 | Issues: 0

CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

Language: Jupyter Notebook | License: MIT | Stargazers: 23733 | Issues: 0

PreSumm

Code for the EMNLP 2019 paper "Text Summarization with Pretrained Encoders"

Language: Python | License: MIT | Stargazers: 1277 | Issues: 0

BRIO

ACL 2022: BRIO: Bringing Order to Abstractive Summarization

Language: Python | Stargazers: 325 | Issues: 0

TransformerSum

Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive summarization datasets to the extractive task.

Language: Python | License: GPL-3.0 | Stargazers: 424 | Issues: 0

MatchSum

Code for ACL 2020 paper: "Extractive Summarization as Text Matching"

Language: Python | Stargazers: 519 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 129544 | Issues: 0

SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Language: Python | License: MIT | Stargazers: 3320 | Issues: 0

clip-as-service

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

Language: Python | License: NOASSERTION | Stargazers: 12301 | Issues: 0

EgoVLP

[NeurIPS 2022] Egocentric Video-Language Pretraining

Language: Python | Stargazers: 220 | Issues: 0

pyskl

A toolbox for skeleton-based action recognition.

Language: Python | License: Apache-2.0 | Stargazers: 910 | Issues: 0

openpose

OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

Language: C++ | License: NOASSERTION | Stargazers: 30483 | Issues: 0

Language: Python | License: MIT | Stargazers: 1286 | Issues: 0

UniVL

An official implementation of "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"

Language: Python | License: MIT | Stargazers: 332 | Issues: 0

PhraseCutDataset

Dataset API for "PhraseCut: Language-based Image Segmentation in the Wild"

Language: Jupyter Notebook | Stargazers: 99 | Issues: 0