Anish Acharya's repositories
NLP-CS388-UT
Common libraries developed in PyTorch for different NLP tasks: sentiment analysis, NER, LSTM-CRF, CRF, and semantic parsing.
DBMS-From-Scratch
Implementation of a working DBMS from the ground up.
Bandits-Online-Learning
Simple implementations of bandit algorithms in Python.
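The repository's contents aren't shown here; as an illustration of the kind of algorithm it covers, here is a minimal epsilon-greedy sketch on a Bernoulli bandit (the arm means, horizon, and epsilon are assumptions for the example, not taken from the repo):

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, horizon=2000, seed=0):
    """Epsilon-greedy on a Bernoulli bandit; returns empirical means and pull counts."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    values = [0.0] * k                                      # running empirical mean per arm
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                          # explore uniformly
        else:
            arm = max(range(k), key=lambda a: values[a])    # exploit current best estimate
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean update
    return values, counts

values, counts = epsilon_greedy([0.2, 0.5, 0.8])  # arm 2 is the best arm
```

With enough exploration, the best arm ends up with the most pulls and an empirical mean close to its true reward probability.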
BGMD-AISTATS-2022
Geometric median (GM) is a classical method in statistics for achieving a robust estimate of the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 0.5. However, its computational complexity makes it infeasible for robustifying stochastic gradient descent (SGD) in high-dimensional optimization problems. In this paper, we show that by applying GM to only a judiciously chosen block of coordinates at a time and using a memory mechanism, one can retain the breakdown point of 0.5 for smooth non-convex problems, with non-asymptotic convergence rates comparable to SGD with GM.
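The paper's block-coordinate scheme isn't reproduced here; as background, a minimal sketch of the geometric median itself, computed with the classical Weiszfeld fixed-point iteration (the point set and tolerances are assumptions for illustration):

```python
import numpy as np

def geometric_median(points, iters=100, tol=1e-8):
    """Weiszfeld fixed-point iteration for the geometric median."""
    y = points.mean(axis=0)
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(points - y, axis=1), tol)  # avoid division by zero
        w = 1.0 / d
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y

# Three inliers near (1, 1) and one gross outlier: the mean is dragged far
# away, while the geometric median stays with the inliers.
pts = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.2], [100.0, -100.0]])
gm = geometric_median(pts)
```

This robustness to a minority of arbitrarily corrupted points is exactly what makes GM attractive for aggregating stochastic gradients.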
Online-Embedding-Compression-AAAI-2019
Deep learning models have become state-of-the-art for natural language processing (NLP) tasks; however, deploying these models in production systems poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low-rank matrix factorization during training to compress the word embedding layer, which represents the size bottleneck for most NLP models. Our models are trained, compressed, and then further re-trained on the downstream task to recover accuracy while maintaining the reduced size. Empirically, we show that the proposed method can achieve 90% compression with minimal impact on accuracy for sentence classification tasks, and outperforms alternative methods like fixed-point quantization or offline word embedding compression. We also analyze the inference time and storage space for our method through FLOP calculations, showing that we can compress DNN models by a configurable ratio and regain the accuracy loss without introducing additional latency compared to fixed-point quantization. Finally, we introduce a novel learning rate schedule, the Cyclically Annealed Learning Rate (CALR), which we empirically demonstrate to outperform other popular adaptive learning rate algorithms on a sentence classification benchmark.
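A minimal sketch of the core idea, replacing the embedding matrix with a low-rank factorization. The sizes and rank below are assumptions, and a truncated SVD is used purely for illustration; the paper learns the factors during training rather than computing them offline:

```python
import numpy as np

V, D, r = 10_000, 300, 30          # vocab size, embedding dim, rank (assumed)
rng = np.random.default_rng(0)
E = rng.standard_normal((V, D))    # stand-in for a trained embedding matrix

# Truncated SVD gives the best rank-r approximation E ≈ A @ B.
U, s, Vt = np.linalg.svd(E, full_matrices=False)
A = U[:, :r] * s[:r]               # (V, r) factor
B = Vt[:r]                         # (r, D) factor

params_full = V * D
params_low_rank = V * r + r * D
ratio = params_low_rank / params_full   # fraction of parameters kept
```

Storing the two factors instead of the full matrix keeps roughly `(V*r + r*D) / (V*D)` of the parameters, which for these sizes is about 10%, i.e. ~90% compression.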
DeLiCoCo-IEEE-Transactions
In compressed decentralized optimization settings, there are benefits to having multiple gossip steps between subsequent gradient iterations, even when the cost of doing so is appropriately accounted for, e.g., by reducing the precision of the compressed information.
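As a sketch of why extra gossip steps help, the following example (uncompressed for simplicity; the paper's setting also compresses the exchanged information) shows node disagreement shrinking geometrically with each additional mixing step. The 4-node ring topology and mixing matrix are assumptions:

```python
import numpy as np

# Doubly stochastic mixing matrix for 4 nodes on a ring (assumed topology).
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))          # one parameter vector per node

def disagreement(Y):
    """Distance of the node iterates from their common average."""
    return np.linalg.norm(Y - Y.mean(axis=0))

Y1 = W @ X                               # one gossip step between gradient updates
Y3 = np.linalg.matrix_power(W, 3) @ X    # three gossip steps
```

Because `W` is doubly stochastic, gossip preserves the network average while contracting disagreement at a rate set by the second-largest eigenvalue of `W` (0.5 here), so three steps bring the nodes much closer to consensus than one.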
Image-Segmentation-fractional-filters
Software related to the CVPR 2015 paper on image segmentation using TRW belief-propagation-based learning.
Cracking-Coding-Interviews
Contains basic utility functions and common data-structure and algorithm questions.
double-descent
We investigate double descent more deeply and try to precisely characterize the phenomenon under different settings. Specifically, we focus on the impact of label noise and regularization on double descent. None of the existing works consider these aspects in detail, and we hypothesize that they play an integral role in double descent.
Expectation-Maximization
MATLAB package for generating synthetic data from a GMM and running EM clustering on it.
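The repo itself is MATLAB; as a language-neutral sketch of the same pipeline, the following generates synthetic 1-D data from a two-component mixture and fits it with EM (all mixture parameters below are assumed for the example):

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50):
    """EM for a k-component 1-D Gaussian mixture, with spread-out deterministic init."""
    mu = np.linspace(x.min(), x.max(), k)
    sigma = np.ones(k)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities r[n, j] = P(component j | x[n])
        dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
               / (sigma * np.sqrt(2 * np.pi))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and standard deviations
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        sigma = np.maximum(sigma, 1e-3)     # guard against component collapse
    return pi, mu, sigma

# Synthetic data: two Gaussians at -2 and 3, 500 samples each (assumed).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])
pi, mu, sigma = em_gmm_1d(x)
```

With well-separated components, the fitted means recover the generating means and the mixture weights come out near 0.5 each.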
Optimization-Mavericks
This repository provides a unified framework for running optimization experiments across stochastic, mini-batch, decentralized, and federated settings.
Search-Engine-From-Scratch
Implementation of a complete search engine for the ICS domain. It also includes a web interface that provides the user with a text box to enter queries and returns relevant results.
anishacharya
Try out git README
CS273A-UCI-Fall-2013
My First Graduate Machine Learning Course- 2013 Fall :)
deepcluster-contrastive
Deep Clustering for Unsupervised Learning of Visual Features
Federated-Learning-in-PyTorch
Handy PyTorch implementation of Federated Learning (for your painless research)
Learning-with-Audio-Data
Functions to easily create music datasets from raw audio signals, run learning algorithms, and classify genres.
lightly
A python library for self-supervised learning on images.
robust-diffusion
Elucidating the Design Space of Diffusion-Based Generative Models (EDM)
SNCLR
[ICLR 2023] Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning
vpu
A PyTorch implementation of the Variational approach for PU learning