Ethan Wang's starred repositories

tesseract

Tesseract Open Source OCR Engine (main repository)

Language: C++ | License: Apache-2.0 | Stargazers: 62058 | Issues: 1689 | Issues: 2651

styleguide

Style guides for Google-originated open-source projects

Language: HTML | License: Apache-2.0 | Stargazers: 37421 | Issues: 1277 | Issues: 339

HanLP

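Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing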

Language: Python | License: Apache-2.0 | Stargazers: 33816 | Issues: 1139 | Issues: 1409

faiss

A library for efficient similarity search and clustering of dense vectors.

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Language: Python | License: Apache-2.0 | Stargazers: 25970 | Issues: 201 | Issues: 4158

minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Language: Python | License: MIT | Stargazers: 20076 | Issues: 257 | Issues: 72

crewAI

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.

Language: Python | License: MIT | Stargazers: 17135 | Issues: 206 | Issues: 623

Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Language: Python | License: Apache-2.0 | Stargazers: 13879 | Issues: 103 | Issues: 1051

pytorch-deep-learning

Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.

Language: Jupyter Notebook | License: MIT | Stargazers: 10862 | Issues: 112 | Issues: 204

GPTCache

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.

Language: Python | License: MIT | Stargazers: 7201 | Issues: 58 | Issues: 172

pytesseract

A Python wrapper for Google Tesseract

Language: Python | License: Apache-2.0 | Stargazers: 5831 | Issues: 110 | Issues: 362

curriculum

📚Open Source Curriculum for CNCF Certification Courses

torchtune

PyTorch native finetuning library

Language: Python | License: BSD-3-Clause | Stargazers: 4240 | Issues: 47 | Issues: 673

dlrm

An implementation of a deep learning recommendation model (DLRM)

Language: Python | License: MIT | Stargazers: 3751 | Issues: 107 | Issues: 213

winutils

Windows binaries for Hadoop versions (built from the git commit ID used for the ASF release)

Language: Shell | License: Apache-2.0 | Stargazers: 2537 | Issues: 162 | Issues: 0

know-your-http-well

HTTP headers, media-types, methods, relations and status codes, all summarized and linking to their specification.

Language: Emacs Lisp | License: Unlicense | Stargazers: 2396 | Issues: 73 | Issues: 37

winutils

winutils.exe, hadoop.dll, and hdfs.dll binaries for Hadoop on Windows

mteb

MTEB: Massive Text Embedding Benchmark

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1918 | Issues: 15 | Issues: 444

underthesea

Underthesea - Vietnamese NLP Toolkit

Language: Python | License: GPL-3.0 | Stargazers: 1399 | Issues: 78 | Issues: 254

pyspark-style-guide

This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.

Language: Python | License: MIT | Stargazers: 1046 | Issues: 237 | Issues: 5

pythainlp

Thai Natural Language Processing in Python.

Language: Python | License: Apache-2.0 | Stargazers: 983 | Issues: 47 | Issues: 356

ckip-transformers

CKIP Transformers

Language: Python | License: GPL-3.0 | Stargazers: 691 | Issues: 14 | Issues: 32

jobs

List of Go language job openings in Taiwan

PhoBERT

PhoBERT: Pre-trained language models for Vietnamese (EMNLP-2020 Findings)

fipy

FiPy is a Finite Volume PDE solver written in Python

Language: Python | License: NOASSERTION | Stargazers: 509 | Issues: 30 | Issues: 780

LLM-FineTuning-Large-Language-Models

LLM (Large Language Model) FineTuning

Language: Jupyter Notebook | Stargazers: 460 | Issues: 9 | Issues: 2

pyvi

Python Vietnamese Core NLP Toolkit

Language: Jupyter Notebook | License: MIT | Stargazers: 245 | Issues: 12 | Issues: 11
Language: Python | License: NOASSERTION | Stargazers: 225 | Issues: 7 | Issues: 5