Danmo121's repositories

entity_extractor_by_ner

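NER models developed with TensorFlow 2.3, all following the CRF paradigm: BiLSTM(IDCNN)-CRF, BERT-BiLSTM(IDCNN)-CRF, and BERT-CRF. Supports fine-tuning pretrained models and adversarial training for named entity recognition, and runs directly once configured.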

Stargazers: 0 | Issues: 0

technology_novelty_search_data_analysis_system

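The technology novelty search data analysis system processes and analyzes data such as science and technology reports and literature. It studies key fields in the data, including scientific and technical key points, project source (region), affiliated institution, project year, and subject classification. The TF-IDF algorithm extracts each project's keywords from the scientific and technical key points, and a TextCNN text classification [3] model then assigns each document to a subject area. Four entity types (institution, subject area, project, and region) are extracted, and a Neo4j database is used to build the knowledge graph [4] of the scientific literature. Users can enter novelty search points according to their needs; the system automatically extracts and classifies the keywords and recommends the literature closest to the query, which greatly reduces the difficulty of manual novelty search and improves its efficiency.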

License: MIT | Stargazers: 0 | Issues: 0

nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

License: MIT | Stargazers: 0 | Issues: 0

nlp-notebook

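Implementations of common NLP tasks, including new word discovery, plus PyTorch-based word embeddings, Chinese text classification, entity recognition, summary text generation, sentence similarity, triple extraction, and pretrained models.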

License: MIT | Stargazers: 1 | Issues: 0

nlp-journey

Documents, papers, and code related to Natural Language Processing, including Topic Model, Word Embedding, Named Entity Recognition, Text Classification, Text Generation, Text Similarity, Machine Translation, etc. All code is implemented in TensorFlow 2.0.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

python-tutorial

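A practical Python tutorial covering Python basics, advanced Python features, object-oriented programming, multithreading, databases, data science, Flask, and web crawler development.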

License: Apache-2.0 | Stargazers: 0 | Issues: 0

TextRank4ZH

🌳 Automatically extract keywords and summaries from Chinese text

License: MIT | Stargazers: 0 | Issues: 0

nlp-tutorial

Natural Language Processing Tutorial for Deep Learning Researchers

License: MIT | Stargazers: 0 | Issues: 0

learn_python3_spider

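A Python web crawler tutorial series for learning crawling from 0 to 1. It covers browser and mobile app packet capture (e.g. fiddler, mitmproxy); the modules crawlers rely on, such as requests, beautifulSoup, selenium, appium, and scrapy; IP proxies; CAPTCHA recognition; using MySQL and MongoDB from Python; multithreaded and multiprocess crawlers; reverse engineering CSS-based crawler encryption; JS reverse engineering; distributed crawlers; and hands-on crawler project examples.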

License: MIT | Stargazers: 0 | Issues: 0

GeoNER

BiLSTM-CRF for CNER

Stargazers: 0 | Issues: 0

ChineseSemanticKB

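ChineseSemanticKB (Chinese semantic knowledge base): 12 categories of million-scale common semantic dictionaries for Chinese language processing, including a 340,000-entry abstract semantics lexicon, a 340,000-entry antonym lexicon, and a 430,000-entry synonym lexicon. Supports application scenarios such as sentence expansion, rewriting, and event abstraction and generalization.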

Stargazers: 0 | Issues: 0

KnowledgeGraphData

Open-source download of a 140-million-scale Chinese knowledge graph, the largest to date

Stargazers: 0 | Issues: 0

pnlp

NLP pre-/post-processing tools.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Deep-Learning-Interview-Book

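Deep learning interview handbook (covering mathematics, machine learning, deep learning, computer vision, natural language processing, SLAM, and related areas)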

Stargazers: 0 | Issues: 0

PaddleNLP

👑 Easy-to-use and powerful NLP library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis and 🖼 Diffusion AIGC system etc.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

RapidOCR

A cross platform OCR Library based on PaddleOCR & OnnxRuntime & OpenVINO.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

OpenKE

An Open-Source Package for Knowledge Embedding (KE)

Stargazers: 0 | Issues: 0

nlp-paper

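Papers in natural language processing (with reading notes), along with model reproductions and data processing (code provided in both TensorFlow and PyTorch versions)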

License: Apache-2.0 | Stargazers: 0 | Issues: 0

text_analysis_tools

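Chinese text analysis toolkit (including text classification, text clustering, text similarity, keyword extraction, key phrase extraction, sentiment analysis, text error correction, text summarization, topic keywords, synonyms and near-synonyms, and event triple extraction)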

License: Apache-2.0 | Stargazers: 0 | Issues: 0

albert_zh

A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS; large-scale Chinese pretrained ALBERT models

Stargazers: 0 | Issues: 0

PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

License: Apache-2.0 | Stargazers: 0 | Issues: 0

dh2loop

A package to extract information from drillholes to feed 3D modelling packages

License: MIT | Stargazers: 0 | Issues: 0

pkuseg-python

The pkuseg toolkit for multi-domain Chinese word segmentation

License: MIT | Stargazers: 0 | Issues: 0

Takin

A Python toolkit for file processing, text cleaning, and data splitting.

License: MIT | Stargazers: 0 | Issues: 0

nlp_tutorial

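A powerful getting-started guide to NLP, including a roundup of SOTA models for each task (text classification, text matching, sequence labeling, text generation, language modeling), plus code and tips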

Stargazers: 0 | Issues: 0

awesome-python-cn

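A comprehensive Chinese-language list of Python resources, including web frameworks, web crawlers, template engines, databases, data visualization, image processing, and more, maintained and updated by the WeChat official account teams of 「开源前哨」 and 「Python开发者」.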

License: NOASSERTION | Stargazers: 0 | Issues: 0

NLP_ability

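A summary of the knowledge NLP engineers need to build up, including interview questions, fundamentals, engineering skills, and more, to strengthen core competitiveness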

Stargazers: 0 | Issues: 0

WordSeg

A PyTorch implementation of a BiLSTM / BERT / RoBERTa (+ BiLSTM + CRF) model for Chinese Word Segmentation.

Stargazers: 0 | Issues: 0