Jonathan Zhouhan LIN (hantek)

Company: Facebook; University de Montreal, Harbin Institute of Technology

Location: Menlo Park, California, US

Home Page: hantek.github.io


Organizations
mila-iqia

Jonathan Zhouhan LIN's starred repositories

COAT

A commonsense reasoning dataset on the physical commonsense affordances of objects.

Stargazers: 8 | Issues: 0

one-api

OpenAI key management & redistribution system, supporting Azure, Anthropic Claude, Google PaLM 2 & Gemini, Zhipu ChatGLM, Baidu ERNIE Bot, iFlytek Spark, Alibaba Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. It can be used to manage and redistribute keys, ships as a single executable with a prebuilt Docker image, deploys in one click, and works out of the box. It exposes a single API for all LLMs and features an English UI.

Language: JavaScript | License: MIT | Stargazers: 18335 | Issues: 0
License: Apache-2.0 | Stargazers: 226 | Issues: 0

ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines

Language: Python | License: Apache-2.0 | Stargazers: 6837 | Issues: 0

video-subtitle-extractor

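A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. No third-party API is required; text recognition runs locally. It is a deep-learning-based subtitle extraction framework that covers subtitle region detection and subtitle content extraction.

Language: Python | License: Apache-2.0 | Stargazers: 5855 | Issues: 0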

unarXive

A data set based on all arXiv publications, pre-processed for NLP, including structured full text and a citation network

Language: Python | License: MIT | Stargazers: 258 | Issues: 0

k2

Code and datasets for the paper "K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization" (WSDM 2024)

Language: Python | License: Apache-2.0 | Stargazers: 168 | Issues: 0

faiss

A library for efficient similarity search and clustering of dense vectors.

Language: C++ | License: MIT | Stargazers: 30845 | Issues: 0

SnakeGame

A Snake game in C++ and Qt for cs1605 homework

Language: C++ | Stargazers: 2 | Issues: 0

long_tail_knowledge

Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"

Language: Python | Stargazers: 72 | Issues: 0

EasySpider

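A visual no-code/code-free web crawler/spider. EasySpider (易采集) is a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Alias: ServiceWrapper, an intelligent service wrapping system for web applications.

Language: JavaScript | License: NOASSERTION | Stargazers: 34835 | Issues: 0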

ProteinMPNN

Code for the ProteinMPNN paper

Language: Jupyter Notebook | License: MIT | Stargazers: 968 | Issues: 0

CBook-150K

MD5 links for a Chinese book corpus

Language: Python | License: Apache-2.0 | Stargazers: 210 | Issues: 0

JARVIS

JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

Language: Python | License: MIT | Stargazers: 23596 | Issues: 0

Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment

Language: Python | License: Apache-2.0 | Stargazers: 18255 | Issues: 0

wikiextractor

A tool for extracting plain text from Wikipedia dumps

Language: Python | License: AGPL-3.0 | Stargazers: 3742 | Issues: 0

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python | License: MIT | Stargazers: 10461 | Issues: 0

BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

Language: HTML | License: Apache-2.0 | Stargazers: 7851 | Issues: 0

MNBVC

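MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus that aims to match the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) internet writing. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.

License: MIT | Stargazers: 3427 | Issues: 0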

GLM-130B

GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

Language: Python | License: Apache-2.0 | Stargazers: 7656 | Issues: 0

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | Stargazers: 38715 | Issues: 0

SurviveSJTUManual

An update of the 2008 edition of the Shanghai Jiao Tong University Survival Manual (《上海交通大学生存手册》). The GitBook is published at https://survivesjtu.gitbook.io/survivesjtumanual/

Stargazers: 3871 | Issues: 0

HFL-Anthology

A collection of resources from the Joint Laboratory of HIT and iFLYTEK Research (HFL)

Language: Markdown | License: CC-BY-SA-4.0 | Stargazers: 361 | Issues: 0

vimrc

The ultimate Vim configuration (vimrc)

Language: Vim Script | License: MIT | Stargazers: 30638 | Issues: 0

AVSU-VIPL

Collection of works from VIPL-AVSU

Stargazers: 40 | Issues: 0

prefix-beam-search

Code for prefix beam search tutorial by @labodk

Language: Python | Stargazers: 185 | Issues: 0

ROCm

AMD ROCm™ Software - GitHub Home

Language: Shell | License: MIT | Stargazers: 4550 | Issues: 0

Artificial-Intelligence-Terminology-Database

A comprehensive mapping database of English to Chinese technical vocabulary in the artificial intelligence domain

License: NOASSERTION | Stargazers: 1897 | Issues: 0

slp2-pdf

Speech and Language Processing, 2nd Edition in PDF format

Stargazers: 422 | Issues: 0