yangchenghuang's repositories
awesome-document-understanding
A curated list of resources on Document Understanding (DU)
AzureSearch_JFK_Files
This repo contains the sample code of the Azure Search and Cognitive Services used to provide insights and analysis around the JFK Files.
cerbos
Cerbos is the open core, language-agnostic, scalable authorization solution that makes user permissions and authorization simple to implement and manage by writing context-aware access control policies for your application resources.
DAFT
Dynamic Affine Feature Map Transform
databricks-ci-cd
Databricks CI/CD using Azure DevOps
db-queue
Worker-queue implementation on top of Java and a database
decaton
High throughput asynchronous task processing on Apache Kafka
gazelle_plugin
Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.
git-secrets
Prevents you from committing secrets and credentials into git repositories
holmes-extractor
Information extraction from English and German texts based on predicate logic
image_tabular
Integrate image and tabular data for deep learning
lexpredict-lexnlp
LexNLP by LexPredict
LM_Memorization
Training data extraction on GPT-2
matplotlib_for_papers
Handout for the tutorial "Creating publication-quality figures with matplotlib"
mosaic
An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.
negmas
Negotiation Multi-Agent System (A negotiation library designed for situated negotiations within business-like simulations)
NER-RE
A Named Entity Recognition + Entity Linker + Relation Extraction pipeline built using spaCy v3.0. Given a text, the pipeline extracts the entities it was trained to recognize, disambiguates each entity to its normalized form through an Entity Linker connected to a Knowledge Base, and assigns a relation between the entities, if any.
overwatch
Capture deep metrics on one or all assets within a Databricks workspace
piicatcher
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and DataHub.
python-flask-with-javascript
This repository contains an example app to communicate between JavaScript and Python.
rivet
The open-source visual AI programming environment and TypeScript library
spark-nlp-workshop
Public runnable examples of using John Snow Labs' NLP for Apache Spark.
spark-ocr-workshop
Public runnable examples of using John Snow Labs' OCR for Apache Spark.
spark-search
Spark Search - high performance advanced search features based on Apache Lucene
sparkler
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
spline-spark-agent
Spline agent for Apache Spark