lazyc81

lazyc81's starred repositories

MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Language: Python | License: AGPL-3.0 | Stargazers: 2550 | Issues: 0

pandoc-latex-template

A pandoc LaTeX template to convert markdown files to PDF or LaTeX.

Language: TeX | License: BSD-3-Clause | Stargazers: 5988 | Issues: 0

img2latex-mathpix

Mathpix has changed their billing policy and no longer has free monthly API requests. This repo is now archived and will not receive any updates for the foreseeable future.

Language: Java | License: Apache-2.0 | Stargazers: 1380 | Issues: 0

Pix2Text

An Open-Source Python3 tool for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.

Language: Jupyter Notebook | License: MIT | Stargazers: 1644 | Issues: 0

open-parse

Improved file parsing for LLMs

Language: Python | License: MIT | Stargazers: 2189 | Issues: 0

rebased

Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models"

Language: Python | License: Apache-2.0 | Stargazers: 137 | Issues: 0

zzzArchived_arxiv-readability

Pilot project to render HTML5 from arXiv LaTeX sources

Language: Python | License: MIT | Stargazers: 109 | Issues: 0

synctex

Synchronization for TeX

Language: TeX | License: MIT | Stargazers: 59 | Issues: 0

ar5ivist

A turnkey command for converting a LaTeX source to ar5iv-style HTML

Language: Dockerfile | License: MIT | Stargazers: 51 | Issues: 0

Qwen-Agent

Agent framework and applications built upon Qwen2, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.

Language: Python | License: NOASSERTION | Stargazers: 2765 | Issues: 0

MobileAgent

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

Language: Python | License: MIT | Stargazers: 2460 | Issues: 0

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language: Python | License: Apache-2.0 | Stargazers: 1169 | Issues: 0

based

Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"

Language: Python | License: Apache-2.0 | Stargazers: 196 | Issues: 0

pdftotree

🌲 A tool for converting PDF into hOCR with text, tables, and figures being recognized and preserved.

Language: Python | License: MIT | Stargazers: 416 | Issues: 0

Yi

A series of large language models trained from scratch by developers @01-ai

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7516 | Issues: 0

Qwen2

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Language: Shell | Stargazers: 6537 | Issues: 0

txtai

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Language: Python | License: Apache-2.0 | Stargazers: 8155 | Issues: 0

deepdoctection

A Repo For Document AI

Language: Python | License: Apache-2.0 | Stargazers: 2401 | Issues: 0

nlm-ingestor

This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats.

Language: Python | License: Apache-2.0 | Stargazers: 960 | Issues: 0

llmsherpa

Developer APIs to Accelerate LLM Projects

Language: Jupyter Notebook | License: MIT | Stargazers: 1218 | Issues: 0

Parsr

Transforms PDF, Documents and Images into Enriched Structured Data

Language: JavaScript | License: Apache-2.0 | Stargazers: 5725 | Issues: 0

marker

Convert PDF to markdown quickly with high accuracy

Language: Python | License: GPL-3.0 | Stargazers: 14857 | Issues: 0

unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Language: HTML | License: Apache-2.0 | Stargazers: 7805 | Issues: 0

mega

Sequence modeling with Mega.

Language: Python | License: MIT | Stargazers: 296 | Issues: 0

Language: Python | Stargazers: 65 | Issues: 0

Language: Python | Stargazers: 28 | Issues: 0

gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Language: Python | License: Apache-2.0 | Stargazers: 6732 | Issues: 0

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python | License: NOASSERTION | Stargazers: 8106 | Issues: 0

papermage

Library supporting NLP and CV research on scientific papers

Language: Python | License: Apache-2.0 | Stargazers: 652 | Issues: 0

awesome_LLMs_interview_notes

LLMs interview notes and answers: this repository mainly records interview questions and reference answers for large language model (LLM) algorithm engineers.

License: MIT | Stargazers: 1096 | Issues: 0