Xin Zhang (izhx)

Company: PhD student @ HITsz

Location: China

Home Page: izhx.github.io

Twitter: @xinzhangai

Xin Zhang's starred repositories

al-folio

A beautiful, simple, clean, and responsive Jekyll theme for academics

Language: HTML | License: MIT | Stargazers: 9730 | Issues: 24 | Issues: 519

mistral-src

Reference implementation of Mistral AI 7B v0.1 model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 8772 | Issues: 116 | Issues: 115

TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Language: Python | License: Apache-2.0 | Stargazers: 7212 | Issues: 113 | Issues: 148

outlines

Structured Text Generation

Language: Python | License: Apache-2.0 | Stargazers: 6805 | Issues: 45 | Issues: 491

hatch

Modern, extensible Python project management

Language: Python | License: MIT | Stargazers: 5610 | Issues: 50 | Issues: 656

composer

Supercharge Your Model Training

Language: Python | License: Apache-2.0 | Stargazers: 5056 | Issues: 51 | Issues: 529

agents

An Open-source Framework for Autonomous Language Agents

Language: Python | License: Apache-2.0 | Stargazers: 4660 | Issues: 59 | Issues: 68

OpenAgents

OpenAgents: An Open Platform for Language Agents in the Wild

Language: Python | License: Apache-2.0 | Stargazers: 3708 | Issues: 40 | Issues: 97

Alpaca-CoT

We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and integrate as many LLM-related technologies as possible. We have built a fine-tuning platform that makes it easy for researchers to get started with and use large models, and we welcome open-source enthusiasts to open any meaningful PR!

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2509 | Issues: 37 | Issues: 98

text-embeddings-inference

A blazing fast inference solution for text embeddings models

Language: Rust | License: Apache-2.0 | Stargazers: 2207 | Issues: 27 | Issues: 177

yarn

YaRN: Efficient Context Window Extension of Large Language Models

Language: Python | License: MIT | Stargazers: 1233 | Issues: 14 | Issues: 54

streaming

A Data Streaming Library for Efficient Neural Network Training

Language: Python | License: Apache-2.0 | Stargazers: 1000 | Issues: 20 | Issues: 138

anserini

Anserini is a Lucene toolkit for reproducible information retrieval research

Language: Java | License: Apache-2.0 | Stargazers: 993 | Issues: 41 | Issues: 603

agent-protocol

Common interface for interacting with AI agents. The protocol is tech stack agnostic - you can use it with any framework for building agents.

Language: Python | License: MIT | Stargazers: 834 | Issues: 11 | Issues: 39

nlp-phd-global-equality

A repo for open resources & information for people to succeed in PhD in CS & career in AI / NLP

Generative-AI

[TPAMI 2023] Multimodal Image Synthesis and Editing: The Generative AI Era

vec2text

Utilities for decoding deep representations (like sentence embeddings) back to text

Language: Python | License: NOASSERTION | Stargazers: 632 | Issues: 13 | Issues: 38

fairseq2

FAIR Sequence Modeling Toolkit 2

Language: Python | License: MIT | Stargazers: 609 | Issues: 18 | Issues: 88

examples

Fast and flexible reference benchmarks

Language: Shell | License: Apache-2.0 | Stargazers: 423 | Issues: 16 | Issues: 39

GPT-Fathom

GPT-Fathom is an open-source and reproducible LLM evaluation suite, benchmarking 10+ leading open-source and closed-source LLMs as well as OpenAI's earlier models on 20+ curated benchmarks under aligned settings.

Language: Python | License: MIT | Stargazers: 344 | Issues: 1 | Issues: 6

DreamLLM

[ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation

Language: Python | License: Apache-2.0 | Stargazers: 341 | Issues: 18 | Issues: 20

rerope

Rectified Rotary Position Embeddings

Gentopia

Build Hierarchical Autonomous Agents through Config. Collaborative Growth of Specialized Agents.

Language: Python | License: MIT | Stargazers: 282 | Issues: 2 | Issues: 5

CapsFusion

[CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale

FlagAttention

A collection of memory efficient attention operators implemented in the Triton language.

Language: Python | License: NOASSERTION | Stargazers: 171 | Issues: 6 | Issues: 2

multires-conv

Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)

Language: Python | License: MIT | Stargazers: 118 | Issues: 6 | Issues: 4

swim-ir

SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.

ChatIR

Official repository of "Chatting Makes Perfect: Chat-based Image Retrieval"

Language: Python | License: MIT | Stargazers: 16 | Issues: 3 | Issues: 4

Visual-C3

Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters

CLEME

The repository of EMNLP 2023 "CLEME: Debiasing Multi-reference Evaluation for Grammatical Error Correction"

Language: Python | License: Apache-2.0 | Stargazers: 8 | Issues: 0 | Issues: 0