陈明磊's repositories
mimir
Multi-paradigm Information Management Index and Repository
dlib
A toolkit for making real world machine learning and data analysis applications in C++
NLP-Cube
Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing
gate-lf-pytorch-json
PyTorch wrapper for the LearningFramework GATE plugin
tools
Various utilities for processing the data.
udpipe
UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files
gateplugin-Format_Brat
Support for loading/saving brat standoff annotations
pytext
A natural language modeling framework based on PyTorch
gateplugin-Linguistic_Simplifier
Linguistic based techniques for text simplification
vue-quill-editor
🍡@quilljs editor component for @vuejs
texar
Toolkit for Text Generation and Beyond
beam
A distributed knowledge graph store
QAonMilitaryKG
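QAonMilitaryKG: a question-answering system built on a military knowledge graph that, unlike the previous one, is stored in MongoDB. The knowledge base covers military weapons in 8 major categories (such as aircraft and space equipment) and more than 100 subcategories, about 5,800 items in total. The project does not use a graph database for storage; it parses questions and recognizes entity mentions with jieba, then answers several types of questions using query templates. It is mainly intended as an industry-style question-answering demo.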
cocoapi
COCO API - Dataset @ http://cocodataset.org/
carbondata
Mirror of Apache CarbonData
skywalking
APM, Application Performance Monitoring System
MachineLearning-1
Source code download for the book Machine Learning in Action (《机器学习实战》)
brat
brat rapid annotation tool (brat) - for all your textual annotation needs
microsoft-todo-mac
💻Microsoft-ToDo macOS client based on Electron & Microsoft REST API.
vue2-multi-uploader
Drag and drop multiple file uploader with Vue.js v2 and Axios
show-control-and-tell
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions. CVPR 2019
canal
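Alibaba's incremental subscription and consumption component for MySQL binlog. Alibaba Cloud DRDS ( https://www.aliyun.com/product/drds ) and Alibaba TDDL secondary indexes and small-table replication are powered by canal. Aliyun Data Lake Analytics ( https://www.aliyun.com/product/datalakeanalytics ) is also powered by canal.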
fairseq-1
Facebook AI Research Sequence-to-Sequence Toolkit
NCRFpp
NCRF++, an Open-source Neural Sequence Labeling Toolkit. It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components. (code for COLING/ACL 2018 paper)
yugong
Alibaba's data migration and synchronization tool for moving off Oracle (full + incremental; supported targets: MySQL/DRDS)
webpage-capture
Webpage snapshot API using puppeteer.
ChineseNLPCorpus
A collection of Chinese NLP corpora, including basic Chinese syntactic word sets, semantic word sets, domain-specific synchronic corpora, diachronic (historical) corpora, and evaluation corpora.
ChineseTextualInference
Chinese textual inference project, covering the translation and construction of an 880,000-item Chinese textual entailment dataset and a deep-learning-based textual entailment classification model.
TopicCluster
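A simple document topic analysis implementation based on traditional K-means and LDA that achieves reasonable results. It performs multi-document topic clustering: given multiple documents, it outputs the keywords and corresponding texts for each topic. It can be used for topic discovery and hot-topic analysis, for example diachronic topic modeling and review profiling.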
ComplexEventExtraction
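A collection of concepts and explicit expression patterns for Chinese compound event extraction, later developed into ComplexEventGraph. The project proposes the concept of Chinese compound events and their explicit patterns, covers the extraction of conditional, causal, sequential, and reversal events, and builds an event logic graph (事理图谱) from them.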