Shahrukh Khan's repositories
multitask-learning-transformers
A simple recipe for training and running inference with Transformer architectures for multi-task learning on custom datasets. The repo contains two approaches for achieving this.
multilingual-pdf2text
A Python library for extracting text from PDFs without losing the formatting of the PDF content.
siamese-nn-semantic-text-similarity
A repository containing comprehensive neural-network-based PyTorch implementations for the semantic text similarity task, including architectures such as Siamese LSTM, Siamese BiLSTM with Attention, Siamese Transformer, and Siamese BERT.
adversarial-ml-101
A beginner-friendly repository for getting started with adversarial machine learning in PyTorch.
bert-probe
BERT Probe: A Python package for probing attention-based robustness to character- and word-based adversarial evaluation. It also includes recipes for implicit and explicit defenses against character-level attacks.
schema-aware-denoising-text2sql
Using Seq2Seq transformers for the Text2SQL task on the WikiSQL dataset.
applied-transformers
A playground-like experimental project to explore various transformer architectures from scratch.
joint-learn
A comprehensive PyTorch-based toolkit for weight sharing in text classification settings.
antichat_bot
A karma mining bot for the antichat.me app!
hydra-nlp-neural-nets
A Hydra-based Natural Language Processing (NLP) pipeline boilerplate with embedded loggers, which streamlines deep learning models and accommodates experimentation while keeping the code modular and scalable. The best feature is that the code is completely parametrized via a config file, which minimizes code changes when the data changes.
nnti_hindi_bengali_sentiment_analysis
Our code for training Word2Vec word embeddings on the Hindi HASOC dataset. We then use a BiLSTM with self-attention in a joint dual-input learning setting, where we train a single neural network on the Hindi and Bengali datasets simultaneously using their respective embeddings, and an LSTM in a transfer learning setting.
reproducible-dl-course
The aim of this practical course is to start from a simple deep learning model implemented in a notebook, and port it to a ‘reproducible’ world by including code versioning (Git), data versioning (DVC), experiment logging (Weights & Biases), hyper-parameter tuning, configuration (Hydra), and ‘Dockerization’.
benchmarking_platform
Porting rdkit from Python 2 to Python 3 for virtual screening in drug discovery.
bert-loves-chemistry
bert-loves-chemistry: a repository of HuggingFace models applied on chemical SMILES data for drug design, chemical modelling, etc.
dl-courses
Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!
document-filing-app-angular-4
A document filing application built in Angular 4 and Bootstrap 4.
dropwizard-java-elasticsearch
RESTful Java service for querying Elasticsearch.
dsa-leetcode
Code snippets for interview practice from LeetCode and miscellaneous sources.
git-scm.com
The git-scm.com website. Note that this repository is only for the website; issues with git itself should go to https://git-scm.com/community.
is-arduino-morse-code
Project repository for Interactive systems.
mixup-text
Exploring mixup strategies for text classification
multitask-transformer-qa
Code repo for training and running inference with a multitask QA transformer for extractive QA and Boolean QA.
shahrukhx01.github.io
My personal data science projects portfolio website.
transformers-bisected
A repo containing all the building blocks of a transformer model for text classification in PyTorch.