Hang Dong's starred repositories

faiss

A library for efficient similarity search and clustering of dense vectors.

tokenizers

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Language: Rust | License: Apache-2.0 | Stargazers: 8737 | Issues: 122 | Issues: 959

bertviz

BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

Language: Python | License: Apache-2.0 | Stargazers: 6622 | Issues: 71 | Issues: 123

graph-based-deep-learning-literature

Links to conference publications in graph-based deep learning

Language: Jupyter Notebook | License: MIT | Stargazers: 4690 | Issues: 247 | Issues: 14

Awesome-Learning-with-Label-Noise

A curated list of resources for Learning with Noisy Labels

brat

brat rapid annotation tool (brat) - for all your textual annotation needs

Language: Python | License: NOASSERTION | Stargazers: 1796 | Issues: 79 | Issues: 1349

anserini

Anserini is a Lucene toolkit for reproducible information retrieval research

Language: Java | License: Apache-2.0 | Stargazers: 1004 | Issues: 41 | Issues: 603

MedCAT

Medical Concept Annotation Tool

Language: Python | License: NOASSERTION | Stargazers: 424 | Issues: 23 | Issues: 73

structural-probes

Codebase for testing whether hidden states of neural networks encode discrete structures.

Language: Python | License: NOASSERTION | Stargazers: 377 | Issues: 8 | Issues: 9

clinicalBERT

ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission (CHIL 2020 Workshop)

Language: Jupyter Notebook | Stargazers: 365 | Issues: 14 | Issues: 22

mimic-iv

Deprecated. For the latest MIMIC-IV code, please refer to: https://github.com/MIT-LCP/mimic-code

Language: Python | License: MIT | Stargazers: 261 | Issues: 41 | Issues: 0

neat-vision

Neat (Neural Attention) Vision is a framework-agnostic visualization tool for the attention mechanisms of deep-learning models for Natural Language Processing (NLP) tasks.

Language: Vue | License: MIT | Stargazers: 249 | Issues: 7 | Issues: 5

AttentionXML

Implementation for "AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification"

hnatt

Train and visualize Hierarchical Attention Networks

Language: Python | License: MIT | Stargazers: 203 | Issues: 12 | Issues: 8

TAKG

The official implementation of ACL 2019 paper "Topic-Aware Neural Keyphrase Generation for Social Media Language"

Language: Python | License: MIT | Stargazers: 153 | Issues: 4 | Issues: 14

covidex

A multi-stage neural search engine for the COVID-19 Open Research Dataset

Language: TypeScript | License: MIT | Stargazers: 135 | Issues: 10 | Issues: 35

OWL2Vec-Star

Embedding OWL ontologies

Language: Python | License: Apache-2.0 | Stargazers: 81 | Issues: 8 | Issues: 15

upheno

The Unified Phenotype Ontology (uPheno) integrates multiple phenotype ontologies into a unified cross-species phenotype ontology.

Language: Makefile | License: CC0-1.0 | Stargazers: 75 | Issues: 30 | Issues: 584

MedCATtrainer

A simple interface to inspect, improve and add concepts to biomedical NER+L -> MedCAT.

Language: Python | License: NOASSERTION | Stargazers: 68 | Issues: 10 | Issues: 62

asreview-covid19

Extension that adds COVID-19-related datasets to ASReview

Language: Python | License: Apache-2.0 | Stargazers: 27 | Issues: 4 | Issues: 11

gzsl_text

Code for Generalized Zero-Shot Text Classification for ICD Coding (IJCAI 2020)

Language: Python | License: Apache-2.0 | Stargazers: 18 | Issues: 8 | Issues: 1

Geonames-embeddings

Embeddings for all geonames populated locations with population greater than 0

License: MIT | Stargazers: 12 | Issues: 5 | Issues: 0

collections-as-data

Jupyter Notebooks for reuse in analysis of National Library of Scotland's collections as data

Language: Jupyter Notebook | Stargazers: 8 | Issues: 2 | Issues: 0

Automated-Health-Responses

A prototype project for automated, physician-like responses to medical questions

Language: Jupyter Notebook | License: MIT | Stargazers: 6 | Issues: 3 | Issues: 0

ACTdb

Annotated Clinical Texts from MIMIC

gate-cloud-python-example

Example of using the GATE Cloud online API

Language: Python | Stargazers: 3 | Issues: 13 | Issues: 0

domain-specific-bert-scripts

Pretraining/fine-tuning scripts used for domain-specific BERT training.

Language: Python | Stargazers: 3 | Issues: 4 | Issues: 0

bio-yodie-resource-prep

Scripts to prepare the informational resources required by GATE Bio-YODIE.

Language: Scala | License: NOASSERTION | Stargazers: 2 | Issues: 13 | Issues: 2

cantemist2020-ner

CANTEMIST (CANcer TExt Mining Shared Task – tumor named entity recognition): NER track

Language: Python | Stargazers: 2 | Issues: 1 | Issues: 0

Awesome-COVID-NLP-tools

A curated list of COVID-19-related NLP tools that may be of interest to researchers working on COVID-19 and to those in the clinical NLP domain.