Heng-Shiou Sheu (Heng-xiu)

Heng-xiu

Company: @FCU-IoT-Homework

Location: Taiwan

Home Page: http://hengxiuxu.blogspot.tw/

Heng-Shiou Sheu's starred repositories

flores

The FLORES+ Machine Translation Benchmark

Language: TeX | License: CC-BY-SA-4.0 | Stargazers: 65 | Issues: 0
Language: Python | Stargazers: 9 | Issues: 0

LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 18583 | Issues: 0

Stirling-PDF

#1 Locally hosted web application that allows you to perform various operations on PDF files

Language: Java | License: GPL-3.0 | Stargazers: 28385 | Issues: 0

mtdata

A tool that locates, downloads, and extracts machine translation corpora

Language: Python | License: Apache-2.0 | Stargazers: 139 | Issues: 0

Chinese_spelling_Correction

Chinese Grammar Error and Spelling Error Correction System

Language: Jupyter Notebook | Stargazers: 4 | Issues: 0

AdvancedLiterateMachinery

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

Language: C++ | License: Apache-2.0 | Stargazers: 1068 | Issues: 0

Large_dataset_translator

Translate large datasets into any language with the Google Translate API and multithreaded processing; no API key required.

Language: Python | License: Apache-2.0 | Stargazers: 41 | Issues: 0

Adaptive-MT-LLM-Fine-tuning

Fine-tuning Mistral LLM for Adaptive Machine Translation

Language: Jupyter Notebook | Stargazers: 44 | Issues: 0

Whisper

High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model

Language: C++ | License: MPL-2.0 | Stargazers: 7464 | Issues: 0

mt-metrics-eval

Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.

Language: Python | License: Apache-2.0 | Stargazers: 72 | Issues: 0

bitextor

Bitextor generates translation memories from multilingual websites

Language: Python | License: GPL-3.0 | Stargazers: 283 | Issues: 0

VoiceStreamAI

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS

Language: Python | License: MIT | Stargazers: 508 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 540 | Issues: 0

Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

Language: Jupyter Notebook | License: MIT | Stargazers: 8231 | Issues: 0

transformers_tasks

NLP algorithms with the transformers library. Supports text classification, text generation, information extraction, text matching, RLHF, SFT, etc.

Language: Jupyter Notebook | Stargazers: 2018 | Issues: 0

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python | License: MIT | Stargazers: 8492 | Issues: 0

aya-annotations-ui

Web UI & Backend for Data Annotations in Aya

Language: Python | License: Apache-2.0 | Stargazers: 25 | Issues: 0

HanLP

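Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and keyphrase extraction, automatic summarization, text classification and clustering, pinyin and Simplified/Traditional Chinese conversion, natural language processing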

Language: Python | License: Apache-2.0 | Stargazers: 32717 | Issues: 0

instructor

Structured outputs for LLMs

Language: Python | License: MIT | Stargazers: 5989 | Issues: 0

how-to-train-tokenizer

How to train an LLM tokenizer

Language: Python | Stargazers: 102 | Issues: 0

MoneyPrinter

Automate Creation of YouTube Shorts using MoviePy.

Language: Python | License: MIT | Stargazers: 9748 | Issues: 0

ALMA

State-of-the-art LLM-based translation models.

Language: Ruby | License: MIT | Stargazers: 332 | Issues: 0

OneRingTranslator

Simple REST service for translating text, with plugin support and automatic calculation of BLEU/COMET translation-quality metrics.

Language: Python | License: MIT | Stargazers: 97 | Issues: 0

self-translate

Do Multilingual Language Models Think Better in English?

Language: Jupyter Notebook | License: MIT | Stargazers: 35 | Issues: 0

List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words

List of Dirty, Naughty, Obscene, and Otherwise Bad Words

License: CC-BY-4.0 | Stargazers: 2791 | Issues: 0

inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Language: Python | License: Apache-2.0 | Stargazers: 3016 | Issues: 0

xorbits

Scalable Python data science and machine learning, in an API-compatible and lightning-fast way.

Language: Python | License: Apache-2.0 | Stargazers: 1033 | Issues: 0

augmentoolkit

Convert Compute And Books Into Instruct-Tuning Datasets

Language: Python | License: MIT | Stargazers: 500 | Issues: 0

llama.cpp

LLM inference in C/C++

Language: C++ | License: MIT | Stargazers: 59711 | Issues: 0