tsinghuald

tsinghuald's starred repositories

asitop

Perf monitoring CLI tool for Apple Silicon

Language: Python | License: MIT | Stargazers: 3278 | Issues: 0

MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Language: Python | License: AGPL-3.0 | Stargazers: 9359 | Issues: 0

Stirling-PDF

#1 Locally hosted web application that allows you to perform various operations on PDF files

Language: Java | License: GPL-3.0 | Stargazers: 38825 | Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4396 | Issues: 0

omniparse

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks

Language: Python | License: GPL-3.0 | Stargazers: 4762 | Issues: 0

phonemizer

Simple text to phones converter for multiple languages

Language: Python | License: GPL-3.0 | Stargazers: 1179 | Issues: 0

podcast-namespace

A holistic RSS namespace for podcasting

Language: HTML | License: CC0-1.0 | Stargazers: 376 | Issues: 0

aimoneyhunter

The Ultimate Guide to Making Money with AI Side Hustles: a collection of AI side projects showing how to use AI to earn extra income. Check out the English version for more insights.

Stargazers: 12787 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-2-Clause | Stargazers: 10584 | Issues: 0

aTrain

A GUI tool for offline transcription of speech recordings, including speaker diarization, utilizing state-of-the-art machine learning models.

Language: Python | License: NOASSERTION | Stargazers: 275 | Issues: 0

more-ane-transformers

Run transformers (incl. LLMs) on the Apple Neural Engine.

Language: Python | Stargazers: 51 | Issues: 0

coreml-llm-cli

CLI to demonstrate running a large language model (LLM) on Apple Neural Engine.

Language: Swift | Stargazers: 38 | Issues: 0

Metal-Guide

Metal Guide

Language: Swift | Stargazers: 76 | Issues: 0

metal-flash-attention

FlashAttention (Metal Port)

Language: Swift | License: MIT | Stargazers: 340 | Issues: 0

open-source-mac-os-apps

🚀 Awesome list of open source applications for macOS. https://t.me/s/opensourcemacosapps

License: CC0-1.0 | Stargazers: 40838 | Issues: 0

piper

A fast, local neural text to speech system

Language: C++ | License: MIT | Stargazers: 5540 | Issues: 0

unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | License: Apache-2.0 | Stargazers: 14593 | Issues: 0

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | License: Apache-2.0 | Stargazers: 15004 | Issues: 0

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | Stargazers: 29549 | Issues: 0

whisper-plus

WhisperPlus: Faster, Smarter, and More Capable 🚀

Language: Python | License: Apache-2.0 | Stargazers: 1625 | Issues: 0

neural-engine

Everything we actually know about the Apple Neural Engine (ANE)

License: MIT | Stargazers: 1985 | Issues: 0

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 34992 | Issues: 0

tinydiarize

Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens

Language: Python | License: MIT | Stargazers: 407 | Issues: 0

buzz

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

Language: Python | License: MIT | Stargazers: 11665 | Issues: 0

WhisperKit

On-device Speech Recognition for Apple Silicon

Language: Swift | License: MIT | Stargazers: 3060 | Issues: 0

whisper.coreml

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 16 | Issues: 0

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.

Language: Python | License: MIT | Stargazers: 5158 | Issues: 0

llm-action

This project aims to share the technical principles behind large models along with practical experience.

Language: HTML | License: Apache-2.0 | Stargazers: 8674 | Issues: 0

marker

Convert PDF to markdown quickly with high accuracy

Language: Python | License: GPL-3.0 | Stargazers: 15815 | Issues: 0

PPO-for-Beginners

A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8

Language: Python | License: MIT | Stargazers: 705 | Issues: 0