Qingyun Wang (EagleW)

Company: University of Illinois at Urbana-Champaign

Location: Champaign, Illinois

Home Page: EagleW.github.io

Twitter: @eagle_hz

Qingyun Wang's starred repositories

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 33957 | Issues: 341 | Issues: 2650

dspy

DSPy: The framework for programming—not prompting—foundation models

Language: Python | License: MIT | Stargazers: 14662 | Issues: 129 | Issues: 605

MemGPT

Create LLM agents with long-term memory and custom tools 📚🦙

Language: Python | License: Apache-2.0 | Stargazers: 10888 | Issues: 112 | Issues: 663

llama-recipes

Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & question answering. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Llama2 for WhatsApp & Messenger

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 7850 | Issues: 68 | Issues: 227

exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

Language: Python | License: MIT | Stargazers: 3278 | Issues: 33 | Issues: 366

Promptify

Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engineering, LLMs and other latest research

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3132 | Issues: 47 | Issues: 67

esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins

Language: Python | License: MIT | Stargazers: 3023 | Issues: 65 | Issues: 318

meditron

Meditron is a suite of open-source medical Large Language Models (LLMs).

Language: Python | License: Apache-2.0 | Stargazers: 1779 | Issues: 30 | Issues: 29

ProtTrans

ProtTrans is providing state of the art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit and hundreds of Google TPUs using Transformers Models.

Language: Jupyter Notebook | License: AFL-3.0 | Stargazers: 1061 | Issues: 32 | Issues: 145

lingua-py

The most accurate natural language detection library for Python, suitable for short text and mixed-language text

Language: Python | License: Apache-2.0 | Stargazers: 1031 | Issues: 12 | Issues: 77

dolma

Data and tools for generating and inspecting OLMo pre-training data.

Language: Python | License: Apache-2.0 | Stargazers: 857 | Issues: 17 | Issues: 66

KnowledgeEditingPapers

[Knowledge Editing] Must-read Papers on Knowledge Editing for Large Language Models.

progen

Official release of the ProGen models

Language: Python | License: BSD-3-Clause | Stargazers: 590 | Issues: 18 | Issues: 43

Megatron-LLM

distributed trainer for LLMs

Language: Python | License: NOASSERTION | Stargazers: 503 | Issues: 18 | Issues: 57

gt4sd-core

GT4SD, an open-source library to accelerate hypothesis generation in the scientific discovery process.

Language: Jupyter Notebook | License: MIT | Stargazers: 325 | Issues: 17 | Issues: 99

UltraFeedback

A large-scale, fine-grained, diverse preference dataset (and models).

Language: Python | License: MIT | Stargazers: 284 | Issues: 10 | Issues: 13

hetionet

Hetionet: an integrative network of disease

MetaICL

An original implementation of "MetaICL: Learning to Learn In Context" by Sewon Min, Mike Lewis, Luke Zettlemoyer and Hannaneh Hajishirzi

Language: Python | License: NOASSERTION | Stargazers: 245 | Issues: 10 | Issues: 21

enzynet

EnzyNet: enzyme classification using 3D convolutional neural networks on spatial representation

Language: Python | License: MIT | Stargazers: 199 | Issues: 15 | Issues: 15

tart

Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al.

Language: Python | License: NOASSERTION | Stargazers: 156 | Issues: 8 | Issues: 12

gpqa

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

Language: Jupyter Notebook | License: MIT | Stargazers: 117 | Issues: 4 | Issues: 12

Generative_KG_Construction_Papers

[EMNLP 2022] Generative Knowledge Graph Construction: A Review

License: MIT | Stargazers: 101 | Issues: 6 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 71 | Issues: 5 | Issues: 4
Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 48 | Issues: 6 | Issues: 6

CODA-19

This is the GitHub repo of "CODA-19: Using a Non-Expert Crowd to Annotate Research Aspects on 10,000+ Abstracts in the COVID-19 Open Research Dataset" (https://arxiv.org/abs/2005.02367)

Language: Python | Stargazers: 36 | Issues: 5 | Issues: 0

csfaculty.github.io

Interview questions for Computer Science faculty jobs

enzyme-datasets

Enzyme datasets used to benchmark enzyme-substrate promiscuity models

Language: Python | Stargazers: 28 | Issues: 5 | Issues: 0

Megatron-LLM

distributed trainer for LLMs

Language: Python | License: NOASSERTION | Stargazers: 3 | Issues: 0 | Issues: 0

rufes

Scripts for supporting TAC KBP Recognizing Ultra Fine-grained EntitieS Task (RUFES)

Language: Python | Stargazers: 2 | Issues: 0 | Issues: 0