Tung Thanh Le (ttungl)

Company: NBCUniversal

Location: MN

Home Page: http://ttungl.github.io/

Tung Thanh Le's starred repositories

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 18886 | Issues: 296 | Issues: 1311

data-science-interviews

Data science interview questions and answers

Language: HTML | License: CC-BY-4.0 | Stargazers: 8297 | Issues: 217 | Issues: 16

dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

Language: Python | License: MIT | Stargazers: 6833 | Issues: 137 | Issues: 450

chatgpt-advanced

WebChatGPT: A browser extension that augments your ChatGPT prompts with web results.

Language: TypeScript | License: MIT | Stargazers: 5289 | Issues: 89 | Issues: 108

EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 2812 | Issues: 70 | Issues: 446

stat_rethinking_2023

Statistical Rethinking Course for Jan-Mar 2023

Language: R | License: CC0-1.0 | Stargazers: 2155 | Issues: 149 | Issues: 15

pymc-resources

PyMC educational resources

Language: Jupyter Notebook | License: MIT | Stargazers: 1895 | Issues: 65 | Issues: 75

pytorchTutorial

PyTorch Tutorials from my YouTube channel

Language: Python | License: MIT | Stargazers: 1692 | Issues: 25 | Issues: 17

tfcausalimpact

Python Causal Impact Implementation Based on Google's R Package. Built using TensorFlow Probability.

Language: Python | License: Apache-2.0 | Stargazers: 584 | Issues: 13 | Issues: 76

causal-inference-tutorial

Repository with code and slides for a tutorial on causal inference.

Language: Jupyter Notebook | Stargazers: 555 | Issues: 21 | Issues: 9

upliftml

UpliftML: A Python Package for Scalable Uplift Modeling

Language: Python | License: Apache-2.0 | Stargazers: 307 | Issues: 13 | Issues: 7
Language: Jupyter Notebook | License: MIT | Stargazers: 265 | Issues: 11 | Issues: 45

awesome-causal-inference

A (concise) curated list of awesome Causal Inference resources.

tableone

R package to create "Table 1", description of baseline characteristics with or without propensity score weighting

window_funcs

A Rust web app to teach SQL window functions

Language: Rust | License: Apache-2.0 | Stargazers: 128 | Issues: 8 | Issues: 9

amazonqa

Evidence-based QA system for community question answering.

Language: Jupyter Notebook | Stargazers: 101 | Issues: 7 | Issues: 6

Cracking_The_Machine_Learning_Interview

(Under Construction) I am currently writing solutions to the questions in the Medium article "Cracking the Machine Learning Interview," written by Subhrajit Roy. In the year since the article went public, Subhrajit has only written down the questions, with no update on the solutions. I plan on finishing the war. I may add more questions outside of the article's domain. No one else on the internet has written down solutions for the machine learning interview, an opportunity I want to take advantage of.

awesome-Marketing-Analytics

Resources to learn/practice Marketing analytics

Cracking-The-Machine-Learning-Interview

Code snippets for our Book solutions

Language: Python | Stargazers: 40 | Issues: 2 | Issues: 0

Scanned-document-classification-deep-learning

BFSI sectors deal with large volumes of unstructured scanned documents, which are archived in document management systems for further use. For example, in the insurance sector, when a policy goes for underwriting, underwriters attach several raw notes to the policy, and insureds also attach various kinds of scanned documents such as identity cards, bank statements, and letters. In later parts of the policy life cycle, if claims are made on a policy, related scanned documents are also archived. It then becomes a tedious job to identify a particular document in this vast repository. The goal of this case study is to develop a deep learning based solution which can automatically classify scanned documents.

Language: Jupyter Notebook | License: MIT | Stargazers: 35 | Issues: 3 | Issues: 5

Berkeley-Spark

edX:Berkeley:Spark

Language: HTML | Stargazers: 21 | Issues: 0 | Issues: 0

Document-Image-Classification-with-Intra-Domain-Transfer-Learning-and-Stacked-Generalization-of-Deep

RVL-CDIP could be looked at as the equivalent of ImageNet for the document image community. It’s certainly the largest we’ve seen in the literature. There are 400,000 total document images in the dataset. The dataset contains much noise and variance in the composition of each document class. Uncompressed, the dataset size is ~100GB, and it comprises 16 classes of document types, with 25,000 samples per class. Example classes include email, resume, and invoice. Achieved an accuracy of over 93%, which beat the benchmark score of 92% based on https://paperswithcode.com/sota/document-image-classification-on-rvl-cdip

Language: Jupyter Notebook | Stargazers: 16 | Issues: 0 | Issues: 0

sample-size

This Python project is a helper package that uses power analysis to calculate the required sample size for any experiment

Language: Python | License: MIT | Stargazers: 12 | Issues: 11 | Issues: 1

Deep-Learning

Implemented deep learning techniques using Google TensorFlow, covering: deep neural networks with a fully connected network using SGD and ReLUs; regularization of a multi-layer neural network using ReLUs, L2-regularization, and dropout to prevent overfitting; Convolutional Neural Networks (CNNs) with learning rate decay and dropout; and Recurrent Neural Networks (RNNs) for text and sequences with Long Short-Term Memory (LSTM) networks.

Language: Jupyter Notebook | Stargazers: 11 | Issues: 4 | Issues: 0

coursera-causality-crash-course

A Crash Course in Causality: Inferring Causal Effects from Observational Data

Language: Jupyter Notebook | Stargazers: 8 | Issues: 0 | Issues: 0

HeteroArchGen4M2S

HeteroArchGen4M2S: An automated tool for configuring and running heterogeneous CPU-GPU architectures on the Multi2Sim simulator. The tool is built on top of the M2S simulator and allows us to configure various heterogeneous CPU-GPU architectures (e.g., number of CPU cores, GPU cores, L1$, L2$, memory size and latency (via CACTI 6.5), and network topologies (currently 2D-Mesh, customized 2D-Mesh, and Torus networks)). The output files include the results of network throughput and latency, cache/memory access time, and dynamic power of the cores (which can be collected after running McPAT).

Language: GLSL | License: MIT | Stargazers: 6 | Issues: 3 | Issues: 0

causal_inference

Coursera course: "A Crash Course in Causality: Inferring Causal Effects from Observational Data"

Language: R | Stargazers: 1 | Issues: 0 | Issues: 0

Causal-Inference-UPenn

Assignment code for the Coursera course "A Crash Course in Causality: Inferring Causal Effects from Observational Data" by UPenn

Language: Jupyter Notebook | Stargazers: 1 | Issues: 0 | Issues: 0

causality-course-coursera

A Crash Course in Causality: Inferring Causal Effects from Observational Data - Coursera

Stargazers: 1 | Issues: 0 | Issues: 0

Coursera-A-Crash-Course-in-Causality

My notes and solutions to 'A Crash Course in Causality: Inferring Causal Effects from Observational Data' by Jason A. Roy from University of Pennsylvania.

Language: R | Stargazers: 1 | Issues: 2 | Issues: 0