Grey Murav (gremur)


Geek Repo

Company: Data Diggers

Location: Russia


Grey Murav's starred repositories

spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Language: Python | License: MIT | Stargazers: 29875 | Issues: 561 | Issues: 5649

AI-Expert-Roadmap

Roadmap to becoming an Artificial Intelligence Expert in 2022

Language: JavaScript | License: MIT | Stargazers: 29100 | Issues: 965 | Issues: 64

gensim

Topic Modelling for Humans

Language: Python | License: LGPL-2.1 | Stargazers: 15627 | Issues: 429 | Issues: 1849

sentence-transformers

State-of-the-Art Text Embeddings

Language: Python | License: Apache-2.0 | Stargazers: 15095 | Issues: 141 | Issues: 2160

icecream

🍦 Never use print() to debug again.

Language: Python | License: MIT | Stargazers: 9019 | Issues: 51 | Issues: 135

pattern

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

Language: Python | License: BSD-3-Clause | Stargazers: 8733 | Issues: 543 | Issues: 206

curlconverter

Transpile curl commands into Python, JavaScript and 27 other languages

Language: TypeScript | License: MIT | Stargazers: 7472 | Issues: 74 | Issues: 314

WantWords

An open-source online reverse dictionary.

Parsr

Transforms PDF, Documents and Images into Enriched Structured Data

Language: JavaScript | License: Apache-2.0 | Stargazers: 5789 | Issues: 81 | Issues: 163

flashtext

Extract Keywords from sentence or Replace keywords in sentences.

Language: Python | License: MIT | Stargazers: 5588 | Issues: 142 | Issues: 113

trafilatura

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Language: Python | License: Apache-2.0 | Stargazers: 3560 | Issues: 30 | Issues: 372

furl

🌐 URL parsing and manipulation made easy.

Language: Python | License: NOASSERTION | Stargazers: 2634 | Issues: 37 | Issues: 118

universal-data-tool

Collaborate & label any type of data, images, text, or documents, in an easy web interface or desktop app.

Language: JavaScript | License: MIT | Stargazers: 1949 | Issues: 37 | Issues: 273

prettytable

Display tabular data in a visually appealing ASCII table format

Language: Python | License: NOASSERTION | Stargazers: 1373 | Issues: 24 | Issues: 137

contextualized-topic-models

A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).

Language: Python | License: MIT | Stargazers: 1197 | Issues: 17 | Issues: 109

booknlp

BookNLP, a natural language processing pipeline for books

Language: Python | License: MIT | Stargazers: 787 | Issues: 23 | Issues: 24

streamlit_freecodecamp

Build 12 Data Apps in Python with Streamlit

Language: Jupyter Notebook | Stargazers: 592 | Issues: 9 | Issues: 10

MarkTool

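DoTAT is a web-based, domain-oriented, general-purpose text annotation tool. It supports large-scale entity annotation, relation annotation, event annotation, text classification, automatic annotation based on dictionary matching and regular-expression matching, and standard-name annotation for normalization, as well as iterative annotation, nested entity annotation, and nested event annotation. Annotation guidelines are customizable and, within the same task type, can be "created once and reused many times". Hierarchical entity sets expand the range of entity types, and a new, efficient annotation method improves user experience and annotation efficiency. The tool also adds a review stage that can check the consistency of multiple annotators' results, merge them automatically, and adjust them manually, improving annotation accuracy.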

Language: Vue | License: Apache-2.0 | Stargazers: 592 | Issues: 13 | Issues: 18

st-annotated-text

A simple component to display annotated text in Streamlit apps.

Language: Python | License: Apache-2.0 | Stargazers: 518 | Issues: 11 | Issues: 30

snorkel-tutorials

A collection of tutorials for Snorkel

Language: Python | License: Apache-2.0 | Stargazers: 389 | Issues: 20 | Issues: 53

preprocessor

Elegant and Easy Tweet Preprocessing in Python

Language: Python | License: GPL-3.0 | Stargazers: 305 | Issues: 10 | Issues: 40

pyate

PYthon Automated Term Extraction

Language: HTML | License: MIT | Stargazers: 304 | Issues: 15 | Issues: 33

imagetagger

An open source online platform for collaborative image labeling

Language: HTML | License: MIT | Stargazers: 265 | Issues: 21 | Issues: 135

ru_number_to_text

Converts a number to text, taking plural forms into account (amount in words, Python). Russian language.

Language: Python | License: Apache-2.0 | Stargazers: 137 | Issues: 11 | Issues: 5

spacy-annotator

spaCy NER annotator using ipywidgets

OpenLabeler

OpenLabeler is an open source desktop application for annotating objects for AI applications

Language: Java | License: Apache-2.0 | Stargazers: 113 | Issues: 7 | Issues: 22

ru_punct

A neural network for restoring punctuation in Russian text.

corpuscula

Toolkit that simplifies corpus processing

Language: Python | License: BSD-3-Clause | Stargazers: 3 | Issues: 2 | Issues: 0