Ajit Rajasekharan's repositories

unsupervised_NER

Self-supervised NER prototype - updated version (69 entity types - 17 broad entity groups). Uses pretrained BERT models with no fine-tuning. State-of-the-art performance on 3 biomedical datasets

Language:PythonLicense:MITStargazers:80Issues:3Issues:3

bert_mask

This is an example program illustrating BERT's masked language model.

Language:PythonLicense:MITStargazers:28Issues:0Issues:0

bert_vector_clustering

Clustering learned BERT vectors for downstream tasks like unsupervised NER, unsupervised sentence embeddings, etc.

Language:PythonLicense:MITStargazers:10Issues:0Issues:0

codebook_comparisons

Comparison of codebook vectors of autoencoders (DALL-E's dVAE vs. VQGAN) that map any image to a fixed vocabulary of vectors

Language:PythonLicense:MITStargazers:4Issues:0Issues:0

root

Fine-tuned BERT model for POS tagging

Language:PythonLicense:MITStargazers:3Issues:0Issues:0

JPTDP_wrapper

An HTTP interface wrapper around Dat Quoc Nguyen's joint POS tagger and dependency parser.

Language:PythonLicense:NOASSERTIONStargazers:2Issues:1Issues:0

multi_gpu_test

Scripts to set up an NVIDIA GPU machine (Ubuntu)

Language:ShellLicense:MITStargazers:2Issues:0Issues:0

ner_bio_phi_for_phrases

This is a tweaked version of self-supervised NER for tagging phrases

Language:PythonLicense:MITStargazers:2Issues:1Issues:0

simple_sbd

Breaks a paragraph into sentences at period characters, without breaking on periods inside numeric sequences or abbreviations

Language:PythonLicense:MITStargazers:2Issues:0Issues:0
Language:PythonLicense:MITStargazers:1Issues:0Issues:0

huggingface_finetune_wrapper

Simple wrapper to fine-tune and test a BERT model for sentence classification

Language:PythonLicense:MITStargazers:1Issues:0Issues:0

image_text_redaction

Prototype for image text detection, recognition, and redaction. The models used can detect horizontal printed and handwritten text. They cannot detect slanted or curved text.

Language:PythonLicense:MITStargazers:1Issues:0Issues:0

ner_test

This is a test set to evaluate self-supervised NER. The repository evaluates 11 preprocessed datasets spanning the biomedical domain as well as patient-privacy-related entities (person, location, organization)

Language:PythonLicense:MITStargazers:1Issues:0Issues:0
Language:PythonLicense:MITStargazers:1Issues:0Issues:0

utils

A mixed grab bag of utilities

Language:PythonLicense:MITStargazers:1Issues:0Issues:0

ajitrajasekharan.github.io

This is a log of what I learn and of work I have done that yielded usable results

Language:HTMLStargazers:0Issues:0Issues:0

bert_descriptors

BERT's MLM head model exposed as a service

Language:PythonLicense:MITStargazers:0Issues:0Issues:0
Language:PythonLicense:MITStargazers:0Issues:0Issues:0

cls_for_ood_detection

For supervised text classification tasks, uses the CLS vector as the sentence representation to detect out-of-distribution (OOD) inputs relative to the training set. Sentence representations are harvested from a self-supervised model (e.g. BERT)

License:MITStargazers:0Issues:0Issues:0
License:MITStargazers:0Issues:0Issues:0

lapos_server

An existing C++ CRF-based POS tagger exposed as a service (suitable for fast POS tagging at scale)

Language:C++License:MITStargazers:0Issues:0Issues:0
Language:PythonLicense:MITStargazers:0Issues:0Issues:0

simple_tense_detector

This is a simple present/past tense detector for a sentence, using a DEP-POS tagger

Language:PythonLicense:MITStargazers:0Issues:0Issues:0