Shan Dou's repositories
search_with_machine_learning_course
Public repository for the Search with Machine Learning course taught by Daniel Tunkelang and Grant Ingersoll. Available at https://corise.com/course/search-with-machine-learning?utm_source=daniel.
haystack
Haystack is an open-source NLP framework that leverages pre-trained Transformer models. It enables developers to quickly implement production-ready semantic search, question answering, summarization, and document ranking for a wide range of NLP applications.
notebooks
Jupyter notebooks for the Natural Language Processing with Transformers book
search_fundamentals_course
Public repository for the Search Fundamentals course taught by Daniel Tunkelang and Grant Ingersoll. Available at https://corise.com/search-fundamentals?utm_source=daniel.
Clean-Code-in-Python
Clean Code in Python, published by Packt
conda
OS-agnostic, system-level binary package manager and ecosystem
DeepCTR-Torch
【PyTorch】Easy-to-use, modular, and extensible package of deep-learning-based CTR models.
deploying-machine-learning-models
Example Repo for the Udemy Course "Deployment of Machine Learning Models"
dssm
An industrial-grade implementation of DSSM
feature-engineering-for-machine-learning
Code Repository for the online course Feature Engineering for Machine Learning
fuzzywuzzy
Fuzzy String Matching in Python
generative-ai-for-beginners
12 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
gensim
Topic Modelling for Humans
gitpod_conda
Stores gitpod configs for using conda
haystack-website
Haystack landing page with documentation, use cases, etc.
Machine-Learning-Engineering-with-Python
Machine Learning Engineering with Python
ml-design-patterns
Software Architecture for ML engineers
openai-cookbook
Examples and guides for using the OpenAI API
pdpipe
Easy pipelines for pandas DataFrames.
pecos
PECOS - Prediction for Enormous and Correlated Spaces
python-patterns
A collection of design patterns/idioms in Python
pytorch-widedeep
A flexible package for multimodal deep learning that combines tabular data with text and images using Wide and Deep models in PyTorch
qdrant
Qdrant - Vector Search Engine and Database for the next generation of AI applications. Also available in the cloud https://qdrant.to/cloud
quantulum3
Library for unit extraction under active development - fork of quantulum
sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
SpanBERT
Code for using and evaluating SpanBERT.