Harshit Gupta's repositories
Tiny-ImageNet-200
Code that reaches above 50% validation and test accuracy on Tiny-ImageNet-200. Note that the preprocessing and related tasks are RAM-intensive.
HiPAMA_Replicate
This repository is an implementation of the HiPAMA architecture, introduced in the paper 'Hierarchical Pronunciation Assessment with Multi-Aspect Attention' (ICASSP 2023).
Multi-Document-Summarization
Our project addresses the challenge of multi-document summarization with Large Language Models (LLMs), which are constrained by token length limitations. We propose a novel approach that combines the strengths of LLMs and Maximal Marginal Relevance (MMR).
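As a rough illustration of the MMR idea mentioned above (a minimal sketch, not code taken from this repository), the function below greedily picks items that are relevant to a query while penalizing similarity to items already picked. The names `cosine`, `mmr_select`, and the trade-off parameter `lam` are hypothetical, and plain Python lists stand in for whatever sentence or document embeddings the project actually uses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mmr_select(query_vec, doc_vecs, k, lam=0.7):
    """Greedily select up to k indices from doc_vecs using Maximal Marginal Relevance.

    lam (hypothetical name) trades off relevance to the query (lam = 1.0)
    against diversity with respect to already selected items (lam = 0.0).
    """
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Redundancy is the highest similarity to anything already selected.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy vectors: items 0 and 1 are duplicates, item 2 is less relevant but different.
docs = [[1.0, 0.9], [1.0, 0.9], [0.2, 1.0]]
diverse = mmr_select([1.0, 1.0], docs, k=2, lam=0.5)
relevance_only = mmr_select([1.0, 1.0], docs, k=2, lam=1.0)
```

With `lam=0.5` the duplicate is skipped in favour of the distinct item, whereas `lam=1.0` reduces to plain relevance ranking. In a summarization setting, the selected items could then be passed to an LLM within its token limit.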
Scratching_ELMo
Using PyTorch, we build an ELMo architecture from scratch and put it into practice on a multi-class sentence classification task.
Academic-Dump
Repository of my submissions for the numerous side assignments and projects I completed while attending IIITH.
BombShell
This repository aims to create a user-defined interactive shell program that can create and manage new processes. This was done as a part of the Monsoon'22 course 'Operating Systems and Networks (OSN)'.
COVID_Tracing_Project
We use the C programming language and data structures to build a real-time COVID tracing and tracking model.
Cryptocurrency-Backend
We design and implement an application that provides information about cryptocurrency. The goal is to create a functional backend system that retrieves and updates cryptocurrency data.
Deep-Image-Tasks
In this repository, we perform image processing and downstream tasks using a range of algorithms and approaches.
Exemplar-Guided-Paraphrase-Generation
This repository aims to perform Exemplar-Guided Paraphrase Generation (EGPG). The goal is to produce a target sentence that matches the style of the provided exemplar while preserving the content of the source sentence. Done as a part of the Monsoon'22 course 'Advanced NLP' (ANLP).
HH-Type-Models-For-Cortical-And-Thalamic-Neurons
In this repository, we use computer simulations to develop Hodgkin-Huxley models of cells taken from different parts of the brain (including the visual and somatosensory cortices) and from different animals (ferrets, rats, etc.). Done as a part of the Monsoon'22 course 'Introduction to Neural and Cognitive Modelling (INCM)'.
Pointer-Generated-Summarization
In this repo, we use Pointer-Generator Networks for summarization.
RL-on-OpenAI-Gym
We implement algorithms such as Monte Carlo (on- and off-policy), Q-Learning, SARSA, Policy Iteration, and Value Iteration on OpenAI Gym environments.
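To give a flavour of one of the listed algorithms, here is a minimal sketch of tabular Q-Learning. It is not taken from this repository: to stay self-contained it uses a small hand-written chain environment instead of an OpenAI Gym environment, and the function name `q_learning` and all hyperparameter values are assumptions chosen for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning on a toy chain: start at state 0, reward 1 on reaching the last state.

    Actions: 0 = move left (stays in place at state 0), 1 = move right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection (ties are broken in favour of "right").
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Off-policy target: bootstrap from the best action in the next state.
            # The goal state's values are never updated, so they stay at 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
```

SARSA differs only in the target: it bootstraps from the action actually taken in the next state rather than from the maximum.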
Basic_AI_from_scratch
In this repository, we implement algorithms such as KNN, Decision Trees, Naive Bayes, Gaussian Naive Bayes, and regression models and their variations from scratch, without using any built-in libraries.
Clickbaited
We compare the properties of clickbait and non-clickbait titles and classify the data using simple classifier models such as SVMs, Logistic Regression, and XGBoost.
CorefRes_Hindi
Coreference resolution is an important and common task in NLP. Most previous work on it has relied on hand-crafted features. In addition, most coreference resolution work has been done for English, with little to no work for Hindi. In this work, we explore neural-network-based methods on English and Hindi datasets. We also leverage BERT to produce contextual word embeddings, which boost the performance of the models, and we create a baseline model for neural coreference resolution in Hindi.
CoWintelli_Chatbot
The primary focus of the bot is to guide people through the ongoing pandemic by answering questions about vaccinations, symptoms, live COVID statistics, and the probability of being infected, to name a few.
Cross-Lingual-Fact-Checking
This is the code repository for the paper 'Cross-Lingual Fact Checking: Automated Extraction and Verification of Information from Wikipedia using References'
Gates_From_Scratch
The repo contains the complete code for building Language Models based on RNNs, LSTMs, and GRUs from scratch. Rather than simply using PyTorch's built-in modules (nn.RNN, nn.LSTM, nn.GRU), we construct these models ourselves.
keras-nlp
Modular Natural Language Processing workflows with Keras
KN-WB-Smoothing
We work with two corpora: EuroParl and Medical Abstracts. Our aim is to use Kneser-Ney and Witten-Bell smoothing to create Language Models for both of these corpora. We also compute perplexity scores for each sentence in the EuroParl and Medical Abstracts corpora under each of these models, as well as an average perplexity score per corpus and per LM on the training corpus.
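For readers unfamiliar with the technique, the following is a minimal sketch of interpolated Witten-Bell smoothing for a bigram model, together with per-sentence perplexity. It is an illustration under simplifying assumptions rather than this repository's code: the function names are hypothetical, the order is fixed at bigrams, the lower-order model is an unsmoothed unigram estimate, and every test word is assumed to appear in the training data (no out-of-vocabulary handling).

```python
import math
from collections import Counter, defaultdict


def train_witten_bell(sentences):
    """Return prob(word, prev) for a bigram model with interpolated Witten-Bell smoothing.

    P(w | h) = (c(h, w) + T(h) * P_uni(w)) / (c(h) + T(h)),
    where T(h) is the number of distinct word types seen after h.
    """
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded[1:])  # "<s>" is never predicted, so it is not counted
        for prev, word in zip(padded, padded[1:]):
            bigrams[prev][word] += 1
    total = sum(unigrams.values())

    def prob(word, prev):
        p_uni = unigrams[word] / total
        followers = bigrams.get(prev)
        if not followers:
            return p_uni  # unseen history: back off entirely to the unigram estimate
        c_h = sum(followers.values())
        t_h = len(followers)
        return (followers[word] + t_h * p_uni) / (c_h + t_h)

    return prob


def perplexity(prob, tokens):
    """Per-sentence perplexity; assumes every token was seen in training."""
    padded = ["<s>"] + tokens + ["</s>"]
    log_sum = sum(math.log(prob(w, h)) for h, w in zip(padded, padded[1:]))
    return math.exp(-log_sum / (len(padded) - 1))


toy_corpus = [["a", "b"], ["a", "c"], ["b", "a"]]
wb_prob = train_witten_bell(toy_corpus)
toy_ppl = perplexity(wb_prob, ["a", "b"])
```

Kneser-Ney smoothing follows the same interpolation pattern but subtracts a fixed discount from each count and replaces the unigram estimate with a continuation probability.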
Krash_Of_Klans
We construct a 2D game in Python3 (terminal-based) that is largely influenced by Clash of Clans, in which the user controls the king or queen and moves it up, down, forward, and backward while destroying structures and fighting defences. We use object-oriented programming concepts, and the game is a simplified version of Clash of Clans. The goal of the game is to destroy as many buildings as possible while collecting as much loot as possible. An army of troops will assist the king or queen in cleaning up.
MusiCNN-2022
The main idea behind this repository is 'Comparing Human Perception Of Song Similarity With ML Models'. Done as a part of the Monsoon'22 BRED course.
Ventral-Cochlear-Nucleus-Neurons
This repository aims to implement and replicate the results from the paper 'The Roles Potassium Currents Play in Regulating the Electrical Activity of Ventral Cochlear Nucleus Neurons'. Done as a part of the Monsoon'22 course 'Introduction to Neural and Cognitive Modelling (INCM)'.