Sonu Gupta's repositories
tosdr-terms-of-service-corpus
This repository contains Python code to create a corpus of 12,215 terms of service documents scraped from TOSDR, intended for legal, privacy, and natural language processing research.
Doxing-on-Twitter
This repository contains my work on the prevention and anonymization of dox content on Twitter. It contains Python code and a demo of the proposed solution.
British-Airway-Virtual-Internship
This repository contains solutions to the two tasks required in the data science virtual internship offered by British Airways via Forage.
cracking-the-data-science-interview
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
Synthetic-financial-data
This repository contains Python code for generating synthetic samples of the minority class in a financial dataset. It also contains a sample of the generated synthetic data.
Time-Series-Analysis-and-Anomaly-Detection
This repository contains code to perform EDA, outlier detection, and forecasting on a multivariate time series.
GPI--corpus
A corpus of privacy laws, regulations, and guidelines used in our paper "Creation and Analysis of an International Corpus of Privacy Laws".
langchain-chat-with-txt-files
Learning and building LLM applications using LangChain 🦜🔗 and OpenAI
neosemantics
Graph+Semantics: Import/Export RDF from Neo4j. Model mapping, inferencing and more.... If you like it, please ★ ⇧
PrivacyQA
Unofficial model implementations for the PrivacyQA benchmark (https://github.com/AbhilashaRavichander/PrivacyQA_EMNLP)
sonu-gupta.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
ThinkStats2
Text and supporting code for Think Stats, 2nd Edition
word_cloud
A little word cloud generator in Python