liuhuanyong

Company: China

Location: Beijing, China

Home Page: https://liuhuanyong.github.io

liuhuanyong's repositories

QASystemOnMedicalKG

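A tutorial and implementation of a disease-centered medical knowledge graph and a QA system built on it. It covers knowledge graph construction and KG-based automatic question answering: the project builds a reasonably large, disease-centered knowledge graph for the medical domain and uses it to provide automatic question answering and analysis services.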

CrimeKgAssitant

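Crime assistant offering crime type prediction and a crime consultation service, based on NLP methods and a crime knowledge graph. This intelligent legal-affairs project comprises a knowledge graph of 856 crimes, crime prediction trained on a corpus of 2.8 million crime records, and 13-category question classification plus legal information question answering built on 200,000 legal question-answer pairs.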

TextGrapher

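Text content grapher based on key information extraction with NLP methods. Given an input document, it extracts the key information, structures it, and organizes the result as a graph, giving a graph-based view of the article's semantic content.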

EventTriplesExtraction

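An experimental, demo-level tool for text information extraction (event-triple extraction), which can serve as a route to event chains and topic graphs. Event triples are extracted using dependency parsing and semantic role labeling, and can support text understanding applications such as document topic chains and event timelines.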

ChineseSemanticKB

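ChineseSemanticKB, a Chinese semantic knowledge base: 12 categories of commonly used semantic dictionaries at the million-entry scale for Chinese language processing, including an abstract-semantics lexicon of 340,000 entries, an antonym lexicon of 340,000 entries, and a synonym lexicon of 430,000 entries. It can support applications such as sentence expansion, rewriting, and event abstraction and generalization.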

DomainWordsDict

DomainWordsDict, a Chinese domain dictionary knowledge base covering 68 domains and 9.16 million words in total. It can be used for text classification, knowledge enhancement, domain vocabulary expansion, and other natural language processing applications.

ChainKnowledgeGraph

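ChainKnowledgeGraph, an industry-chain knowledge graph with three entity types (A-share listed companies, industries, and products) and six relation types: company-to-industry membership, parent industry, upstream raw material of a product, downstream product of a product, main product of a company, and product subcategory. It contains 4,654 listed companies, 511 industries, and 95,559 products, with 56,824 upstream-material relations, 480 parent-industry relations, 390 downstream-product relations, 52,937 product-subcategory relations, and 3,946 company-to-industry relations.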

MedicalNamedEntityRecognition

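Medical named entity recognition implemented with a bidirectional LSTM and CRF model on character embeddings. This is a project for the CCKS2017 Chinese electronic medical record named entity recognition task; the main implementation is a network of four bidirectional LSTM layers over character vectors, followed by a CRF. The project provides raw training data samples (general items, discharge status, medical history, history characteristics, and diagnosis and treatment course) along with converted versions, training scripts, and a pretrained model. It can be used for sequence labeling research, experimentation, and head-to-head comparison.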

liuhuanyong.github.io

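An index of more than sixty categories of practical projects and learning materials for Chinese natural language processing, covering language resource construction, social computing, NLP components, knowledge graphs, event logic graphs, knowledge extraction, sentiment analysis, and deep learning. It includes the author's profile, learning notes, language resources, and industrial systems, and is intended as a fairly comprehensive learning resource for newcomers to NLP. Use and critical feedback are welcome.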

Language: CSS | Stargazers: 365 | Issues: 6 | Issues: 0

PersonGraphDataSet

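PersonGraphDataSet, a person-to-person relationship dataset with nearly 100,000 relationship facts. The facts were obtained by a person-relation extraction algorithm followed by manual curation, and provide base data for person-relation search and lookup, multi-hop question answering over person relations, and person-relation reasoning.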

RAGOnMedicalKG

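RAGOnMedicalKG combines large language model RAG with a knowledge graph to deliver demo-level question answering, and is intended to illustrate the basic approach.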

Language: Python | Stargazers: 95 | Issues: 0 | Issues: 0

QueryCorrection

A self-implemented spelling corrector for queries, based on pinyin similarity and edit distance.
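
The combination of pinyin similarity and edit distance can be illustrated with a minimal sketch. It is not the repository's code: the `TOY_PINYIN` table, the `to_pinyin` and `correct` helpers, and the `max_distance` threshold are all hypothetical names chosen for illustration, and a real system would use a full pinyin dictionary rather than a handful of characters.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical toy table; a real corrector needs a complete pinyin dictionary.
TOY_PINYIN = {"北": "bei", "背": "bei", "京": "jing", "景": "jing",
              "南": "nan", "上": "shang", "海": "hai"}


def to_pinyin(text: str) -> str:
    """Map each character to its pinyin; unknown characters pass through."""
    return " ".join(TOY_PINYIN.get(ch, ch) for ch in text)


def correct(query: str, vocabulary: list[str], max_distance: int = 1) -> str:
    """Return the vocabulary entry closest to the query in pinyin space.

    Ties are broken by character-level edit distance. The query is returned
    unchanged when no candidate is within max_distance pinyin edits.
    """
    if query in vocabulary:
        return query
    q_py = to_pinyin(query)
    best = min(vocabulary,
               key=lambda w: (edit_distance(q_py, to_pinyin(w)),
                              edit_distance(query, w)))
    if edit_distance(q_py, to_pinyin(best)) <= max_distance:
        return best
    return query


cities = ["北京", "南京", "上海"]
print(correct("背景", cities))  # homophone of an entry in the vocabulary
```

Comparing in pinyin space first lets homophone errors, which are common with pinyin input methods, match at distance zero even when every character differs.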

SinglepassTextCluster

SinglepassTextCluster, a text clustering tool based on the single-pass clustering algorithm, using TF-IDF vectors and doc2vec; it can be used for real-time clustering of your own corpus. This small component has two built-in text vectorization methods (TF-IDF and doc2vec) and automatically outputs the number of clusters, the documents in each cluster, and the cluster sizes.

Language: Python | Stargazers: 55 | Issues: 3 | Issues: 0
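
As a rough illustration of single-pass clustering over TF-IDF vectors, the sketch below assigns each document to the most similar existing cluster, or opens a new cluster when no similarity reaches a threshold. It is a simplified stand-in under stated assumptions, not the repository's implementation: the function names, the smoothed IDF formula, the running-mean centroid update, and the `threshold` value are all choices made here, and documents are assumed to be already tokenized (Chinese text would need a word segmenter first).

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse {term: weight} TF-IDF vectors."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed IDF so that no weight is zero or undefined.
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def single_pass(vectors, threshold=0.3):
    """Cluster vectors in one pass; the number of clusters is not fixed."""
    clusters = []  # each cluster: {"centroid": dict, "members": [doc indices]}
    for idx, vec in enumerate(vectors):
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(idx)
            k = len(best["members"])
            terms = set(best["centroid"]) | set(vec)
            # Update the centroid as the running mean of its member vectors.
            best["centroid"] = {
                t: (best["centroid"].get(t, 0.0) * (k - 1) + vec.get(t, 0.0)) / k
                for t in terms
            }
        else:
            clusters.append({"centroid": dict(vec), "members": [idx]})
    return clusters


docs = [
    ["stock", "market", "rises"],
    ["stock", "market", "falls"],
    ["football", "match", "tonight"],
    ["football", "match", "result"],
]
for cluster in single_pass(tfidf_vectors(docs)):
    print(len(cluster["members"]), cluster["members"])
```

Because each document is seen once and compared only against cluster centroids, the approach suits streaming data, at the cost of results that depend on document order and on the chosen threshold.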

ZhuguanDetection

Chinese subjectivity detection based on a subjectivity knowledge base: a method for scoring the subjectivity of sentences using a Chinese subjectivity knowledge base.

Language: Python | Stargazers: 49 | Issues: 3 | Issues: 0

CommonSchemaKG

schemakg, a knowledge graph of schemas that aims for the broadest possible coverage, including both entity schemas and event schemas.

CausalEventPairsDataset

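CausalDataset, a set of causal event pairs extracted from unstructured news web page text. 100,688 samples are currently released and can be used to build a causal event graph.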

DescriptionKBExtraction

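DescriptionPairsExtraction, a program for extracting pairs of entities and their descriptions, based on ALBERT and data back-annotation. It extracts entity-concept description pairs using ALBERT together with back-labeling from structured data, and can serve to further test the applicability of ALBERT and fast training on back-labeled data.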

Language: Python | Stargazers: 20 | Issues: 4 | Issues: 0

FinanceEventGraph

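FinanceEventGraph, an open event graph dataset for the financial domain, usable for building and experimenting with event graphs. It includes 3,865 acquire (merger and acquisition) events and 9,093 invest (investment) events, with a stated total of 12,960 events.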

Stargazers: 18 | Issues: 0 | Issues: 0

Seq2seqAttGeneration

Seq2seqAttGeneration, a basic implementation of text generation that uses a seq2seq attention model to generate poem series. The project is based on Keras and can be used as a tutorial.

Language: Python | Stargazers: 17 | Issues: 3 | Issues: 0

Seq2seqGeneration

KerasSeq2seqGeneration, a basic implementation of text generation that uses a seq2seq model to generate poem series. The project is based on Keras and can be used as a tutorial.

Language: Python | Stargazers: 8 | Issues: 3 | Issues: 0

Awesome-Chart-Understanding

A curated list of recent and past chart understanding work based on our survey paper: From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models.

Stargazers: 4 | Issues: 0 | Issues: 0

OneChart

Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"

License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0

TableStructureRec

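A collection of currently available open-source table recognition models, with improved pre- and post-processing and models converted to ONNX.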

License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0

ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model

License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

RapidOCRPDF

Extracts PDF content using RapidOCR.

License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

RapidStructure

Layout analysis | Table recognition | Document orientation classification

License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

ChartVLM

Official Repository of ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

Stargazers: 0 | Issues: 0 | Issues: 0
License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

HiQA

Code implementation repository for the HiQA paper.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

RapidLaTeXOCR

Formula recognition based on LaTeX-OCR and ONNXRuntime.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0