Longxu Dou's repositories
HIT-SCIR-CoNLL2019
"HIT-SCIR at MRP 2019: A Unified Pipeline for Meaning Representation Parsing via Efficient Training and Effective Encoding", 1st system in the CoNLL 2019 shared task
multispider
MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing
Paper_Reading
This repo stores the papers I have read. Most of them cover topics in natural language processing or deep learning.
HIT-SCIR-CoNLL2020
"HIT-SCIR at MRP 2020: Transition-based Parser and Iterative Inference Parser", 3rd system in the CoNLL 2020 shared task
BLINK
Entity Linker solution
ContextualSP
Open-source code for multiple papers from the Microsoft Research Asia DKI group
DPR
Dense Passage Retriever (DPR) is a set of tools and models for the open-domain Q&A task.
edit-distance
Python library for computing edit distance between arbitrary Python sequences.
example-app-editable-dataframe
This is a demo of a dataframe with editable cells, powered by `streamlit-aggrid` from Pablo Fonseca. You can edit the cells by clicking on them and then export your selection to a csv file! 🎈
fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
gazp
Source code for Grounded Adaptation for Zero-shot Executable Semantic Parsing
GENRE
Autoregressive Entity Retrieval
IRNet-1
An algorithm for cross-domain NL2SQL
Megatron-LLM
distributed trainer for LLMs
nepali-translator
Neural Machine Translation on the Nepali-English language pair
picard
PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
Research
Novel deep learning research work with PaddlePaddle
scaling-with-vocab
📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
semstr
Scheme Evaluation and Mapping for Structural Text Representation
spider-schema-gnn
Author implementation of the paper "Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing"
st-chat
Streamlit Component, for a Chatbot UI
TaBERT
This repository contains source code for the TaBERT model, a pre-trained language model for learning joint representations of natural language utterances and (semi-)structured tables for semantic parsing. TaBERT is pre-trained on a massive corpus of 26M Web tables and their associated natural language context, and can be used as a drop-in replacement for a semantic parser's original encoder to compute representations for utterances and table schemas (columns).
tensor2struct-public
Semantic parsers based on encoder-decoder framework