yichen2017

yichen2017's starred repositories

excalidraw

Virtual whiteboard for sketching hand-drawn like diagrams

Language: TypeScript | License: MIT | Stargazers: 78345 | Issues: 397 | Issues: 3405

milvus

A cloud-native vector database, storage for next generation AI applications

Language: Go | License: Apache-2.0 | Stargazers: 28684 | Issues: 274 | Issues: 11369

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 28383 | Issues: 187 | Issues: 4469

xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

Language: C++ | License: Apache-2.0 | Stargazers: 25921 | Issues: 912 | Issues: 5235

Dive-into-DL-PyTorch

This project converts the MXNet implementations in the original book Dive into Deep Learning (《动手学深度学习》) to PyTorch.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 18080 | Issues: 387 | Issues: 149

marker

Convert PDF to markdown quickly with high accuracy

Language: Python | License: GPL-3.0 | Stargazers: 15154 | Issues: 63 | Issues: 183

sentence-transformers

Multilingual Sentence & Image Embeddings with BERT

Language: Python | License: Apache-2.0 | Stargazers: 14581 | Issues: 135 | Issues: 2079

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | License: Apache-2.0 | Stargazers: 13345 | Issues: 87 | Issues: 853

leedl-tutorial

Hung-yi Lee Deep Learning Tutorial (recommended by Professor Hung-yi Lee 👍). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 11503 | Issues: 264 | Issues: 82

metersphere

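MeterSphere is a next-generation open-source continuous testing tool that makes software testing simpler and more efficient, so that testing is no longer the bottleneck in continuous delivery.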

Language: Java | License: GPL-3.0 | Stargazers: 11324 | Issues: 186 | Issues: 9692

surya

OCR, layout analysis, reading order, line detection in 90+ languages

Language: Python | License: GPL-3.0 | Stargazers: 9410 | Issues: 78 | Issues: 110

nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents

Language: Python | License: MIT | Stargazers: 8570 | Issues: 66 | Issues: 200

pdfplumber

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

Language: Python | License: MIT | Stargazers: 6141 | Issues: 91 | Issues: 543

Lhy_Machine_Learning

Slides and assignments for Hung-yi Lee's Spring 2021/2022/2023 machine learning courses

Language: Jupyter Notebook | Stargazers: 5906 | Issues: 48 | Issues: 13

donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Language: Python | License: MIT | Stargazers: 5589 | Issues: 46 | Issues: 291

sahi

Framework agnostic sliced/tiled inference + interactive ui + error analysis plots

Language: Python | License: MIT | Stargazers: 3854 | Issues: 41 | Issues: 0

PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Language: Python | License: Apache-2.0 | Stargazers: 3749 | Issues: 25 | Issues: 57

pdf2docx

Open source Python library for converting PDF to DOCX.

Language: Python | License: AGPL-3.0 | Stargazers: 2383 | Issues: 24 | Issues: 241

MuseTalk

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

Language: Python | License: NOASSERTION | Stargazers: 2096 | Issues: 45 | Issues: 144

Pix2Text

An Open-Source Python3 tool for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.

Language: Jupyter Notebook | License: MIT | Stargazers: 1656 | Issues: 14 | Issues: 70

meet-libai

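Li Bai, an outstanding poet of the Tang dynasty, holds an important place in literary history. In recent years, with the rapid development of digital technology and artificial intelligence, the ways in which traditional culture is popularized and promoted have also faced innovation and change. Although research on Li Bai's poetry is already quite deep both in China and abroad, digital and AI-assisted popularization still falls short. This project therefore aims to build a Li Bai knowledge graph and combine it with a large language model to train a specialized AI agent, promoting Li Bai's culture in the form of a generative conversational application.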

Language: Python | License: GPL-3.0 | Stargazers: 1015 | Issues: 5 | Issues: 3

AutoCoder

A new model designed for the code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024) and GPT-4o.

Language: Python | License: Apache-2.0 | Stargazers: 772 | Issues: 14 | Issues: 12

DocumentLayoutAnalysis

Document layout analysis resources and repositories for development with PdfPig.

haupt

Lineage metadata API, artifacts streams, sandbox, API, and spaces for Polyaxon

Language: Python | License: AGPL-3.0 | Stargazers: 453 | Issues: 37 | Issues: 0

MegaParse

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

Language: Python | License: Apache-2.0 | Stargazers: 365 | Issues: 4 | Issues: 9

PointTransformerV2

[NeurIPS'22] An official PyTorch implementation of PTv2.

point2cad

Code for "Point2CAD: Reverse Engineering CAD Models from 3D Point Clouds"

Language: Python | License: Apache-2.0 | Stargazers: 218 | Issues: 12 | Issues: 10

RapidStructure

Layout analysis | Table recognition | Document orientation classification

Language: Python | License: Apache-2.0 | Stargazers: 165 | Issues: 6 | Issues: 14

Docs2KG

Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models

Language: Python | License: LGPL-2.1 | Stargazers: 162 | Issues: 4 | Issues: 32