Ludwing's starred repositories

build-your-own-x

Master programming by recreating your favorite technologies from scratch.

llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

Language: LLVM | License: NOASSERTION | Stargazers: 28577 | Issues: 585 | Issues: 77042

pydantic

Data validation using Python type hints

Language: Python | License: MIT | Stargazers: 20841 | Issues: 117 | Issues: 4563

clash-rules

🦄️ 🎃 👻 Clash Premium rule sets (RULE-SET), compatible with ClashX Pro, Clash for Windows, and other clients based on the Clash Premium core.

PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.

Language: Python | License: Apache-2.0 | Stargazers: 12048 | Issues: 105 | Issues: 3596

grammars-v4

Grammars written for ANTLR v4; expectation that the grammars are free of actions.

Language: ANTLR | License: MIT | Stargazers: 10145 | Issues: 227 | Issues: 1447

doccano

Open source annotation tool for machine learning practitioners.

Language: Python | License: MIT | Stargazers: 9501 | Issues: 133 | Issues: 1524

cozo

A transactional, relational-graph-vector database that uses Datalog for query. The hippocampus for AI!

Language: Rust | License: MPL-2.0 | Stargazers: 3375 | Issues: 42 | Issues: 145

roapi

Create full-fledged APIs for slowly moving datasets without writing a single line of code.

Language: Rust | License: Apache-2.0 | Stargazers: 3198 | Issues: 43 | Issues: 157

pypika

PyPika is a Python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.

Language: Python | License: Apache-2.0 | Stargazers: 2507 | Issues: 36 | Issues: 433

projects

🪐 End-to-end NLP workflows from prototype to production

Language: Python | License: MIT | Stargazers: 1315 | Issues: 32 | Issues: 0

souffle

Soufflé is a variant of Datalog for tool designers crafting analyses in Horn clauses. Soufflé synthesizes a native parallel C++ program from a logic specification.

Language: C++ | License: UPL-1.0 | Stargazers: 915 | Issues: 41 | Issues: 855

text_analysis_tools

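Chinese text analysis toolkit (including text classification, text clustering, text similarity, keyword extraction, key phrase extraction, sentiment analysis, text error correction, text summarization, topic keywords, synonyms and near-synonyms, and event triple extraction).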

Language: Python | License: Apache-2.0 | Stargazers: 675 | Issues: 8 | Issues: 5

MarkTool

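DoTAT is a web-based, domain-oriented, general-purpose text annotation tool. It supports large-scale entity annotation, relation annotation, event annotation, text classification, automatic annotation based on dictionary matching and regular-expression matching, and standard-name annotation for normalization, as well as iterative annotation, nested entity annotation, and nested event annotation. Annotation schemas are customizable and, within the same task type, can be created once and reused many times. Hierarchical entity sets expand the number of entity types, and a newly designed, efficient annotation method improves user experience and annotation efficiency. The tool also adds a review stage in which the results of multiple annotators can be checked for consistency, merged automatically, and adjusted manually, improving the accuracy of the annotation results.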

Language: Vue | License: Apache-2.0 | Stargazers: 592 | Issues: 13 | Issues: 18

graphql-compiler

Turn complex GraphQL queries into optimized database queries.

Language: Python | License: Apache-2.0 | Stargazers: 552 | Issues: 24 | Issues: 166

rule-engine

A lightweight, optionally typed expression language with a custom grammar for matching arbitrary Python objects.

Language: Python | License: BSD-3-Clause | Stargazers: 461 | Issues: 7 | Issues: 66

OmniEvent

A comprehensive, unified and modular event extraction toolkit.

Language: Python | License: MIT | Stargazers: 342 | Issues: 10 | Issues: 29

problog

ProbLog is a Probabilistic Logic Programming Language for logic programs with probabilities.

awesome-ontology

A curated list of ontology things

Web-crawler

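A survey of drug data websites. Uses a web crawler to scrape drug data from Yaoyuan (药源网) and build a drug database containing more than 100,000 records on Chinese patent medicines and chemical drugs. Scrapes drug data from the China Food and Drug Administration to correct the Yaoyuan data. Uses Selenium and other tools to get around anti-scraping measures, and scrapes ICD-10 and other data for research use.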

my-bookshelf

Collection of books/papers that I've read/I'm going to read/I would remember that they exist/It is unlikely that I'll read/I'll never read.

Language: HTML | License: MIT | Stargazers: 74 | Issues: 6 | Issues: 0

radb

RA (radb): A relational algebra interpreter over relational databases

Language: Python | License: NOASSERTION | Stargazers: 62 | Issues: 10 | Issues: 6

Elasticsearch-7.0-Cookbook

Elasticsearch 7.0 Cookbook, Fourth Edition, published by Packt Publishing

Language: Shell | License: MIT | Stargazers: 54 | Issues: 6 | Issues: 2

nmpa-data

Drug data from China's National Medical Products Administration (NMPA)

NER-RE

A Named Entity Recognition + Entity Linker + Relation Extraction pipeline built using spaCy v3. Given a text, the pipeline extracts the entities it was trained on, disambiguates each entity to its normalized form through an Entity Linker connected to a Knowledge Base, and assigns a relation between the entities, if any.

Jena-Fuseki-Reasoner-Inference

A test project enabling inference in Apache Jena Fuseki 2.4.1 with Jena Rules, RDFS (Entailment Regimes) and OWL

ensemble

Combine models, easily. 🚀

Language: Python | License: MIT | Stargazers: 23 | Issues: 2 | Issues: 19

Prescription-understanding

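Understands irregular prescription text using regular expressions and Aho-Corasick automaton multi-pattern matching, identifying target content such as drug names, total dosage, and usage and dosage instructions.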

Language: Python | Stargazers: 15 | Issues: 1 | Issues: 0

automatic-api

A list of software that turns your database into a REST/GraphQL API

Language: Go | Stargazers: 3 | Issues: 4 | Issues: 0