Patrick Barker (pbarker)

Location: Boulder, CO

Organizations
aunum

Patrick Barker's starred repositories

BitNet

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch

Language: Python · License: MIT · Stars: 1501 · Issues: 0
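
The paper's core idea is replacing full-precision linear weights with 1-bit (sign) weights plus a shared full-precision scaling factor, trained via a straight-through estimator. A minimal illustrative sketch of that idea in PyTorch (not the repository's actual code, which also quantizes activations):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BitLinearSketch(nn.Linear):
        """Linear layer whose weights are binarized to +/-1 at forward time."""
        def forward(self, x):
            w = self.weight
            scale = w.abs().mean()  # shared full-precision scaling factor
            # Straight-through estimator: forward uses sign(w) * scale,
            # backward passes gradients through to the full-precision weights.
            w_q = w + (torch.sign(w) * scale - w).detach()
            return F.linear(x, w_q, self.bias)

    layer = BitLinearSketch(512, 512)
    y = layer(torch.randn(2, 512))  # drop-in replacement for nn.Linear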

LLM4Teach

Python code implementing LLM4Teach, a policy-distillation approach for teaching reinforcement learning agents with a Large Language Model

Language: Python · Stars: 20 · Issues: 0

rund

OCI Container Runtime for Darwin

Language: Go · License: Apache-2.0 · Stars: 441 · Issues: 0

anole

Anole: An Open, Autoregressive, and Native Multimodal Model for Interleaved Image-Text Generation

Language: Python · Stars: 602 · Issues: 0

chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Language: Python · License: NOASSERTION · Stars: 1688 · Issues: 0

sqlite-vec

A vector search SQLite extension that runs anywhere!

Language: C · License: Apache-2.0 · Stars: 3277 · Issues: 0
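
Typical use from Python follows the project's documented pattern: load the extension into a SQLite connection, create a vec0 virtual table, and run KNN queries with MATCH. A minimal sketch (table name and vector values are made up for illustration):

    import sqlite3
    import sqlite_vec
    from sqlite_vec import serialize_float32

    db = sqlite3.connect(":memory:")
    db.enable_load_extension(True)
    sqlite_vec.load(db)  # load the compiled extension into this connection
    db.enable_load_extension(False)

    # Virtual table holding 4-dimensional float32 vectors
    db.execute("CREATE VIRTUAL TABLE items USING vec0(embedding float[4])")
    db.execute(
        "INSERT INTO items(rowid, embedding) VALUES (1, ?)",
        (serialize_float32([0.1, 0.2, 0.3, 0.4]),),
    )

    # KNN query: nearest rows to the query vector, closest first
    rows = db.execute(
        "SELECT rowid, distance FROM items "
        "WHERE embedding MATCH ? ORDER BY distance LIMIT 3",
        (serialize_float32([0.1, 0.2, 0.3, 0.4]),),
    ).fetchall()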

sqlite-vss

A SQLite extension for efficient vector search, based on Faiss!

Language: C++ · License: MIT · Stars: 1656 · Issues: 0

pyvecdb

A Python library for efficient similarity search using high-dimensional vectors.

Language: Python · Stars: 2 · Issues: 0

bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

Language: Python · License: MIT · Stars: 5919 · Issues: 0
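
In practice this library is most often reached through Hugging Face transformers, which uses it as the backend for 4-bit and 8-bit model loading. A minimal sketch (the model id is just an example):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # NormalFloat4 weight quantization
        bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls computed in bf16
    )
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",             # example model id
        quantization_config=bnb_config,
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")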

agenata

Build Web Datasets with Ease

Language: JavaScript · Stars: 32 · Issues: 0

ml-4m

4M: Massively Multimodal Masked Modeling

Language: Python · License: Apache-2.0 · Stars: 1507 · Issues: 0

Phi3-Vision-Finetune

An open-source implementation for fine-tuning Phi-3-Vision-128k-Instruct by Microsoft.

Language: Python · License: Apache-2.0 · Stars: 36 · Issues: 0

lumos

Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs"

Language: Python · License: MIT · Stars: 433 · Issues: 0

digirl

Official repo for the paper "DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning".

Language: Python · License: Apache-2.0 · Stars: 186 · Issues: 0

Semantic-SAM

[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"

Language: Python · Stars: 2208 · Issues: 0

open_clip

An open source implementation of CLIP.

Language: Python · License: NOASSERTION · Stars: 9554 · Issues: 0
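
Zero-shot image/text scoring follows the project's README pattern: build the model plus preprocessing transforms, encode both modalities, and compare normalized embeddings. A minimal sketch (the image file is hypothetical):

    import torch
    from PIL import Image
    import open_clip

    model, _, preprocess = open_clip.create_model_and_transforms(
        "ViT-B-32", pretrained="laion2b_s34b_b79k")
    tokenizer = open_clip.get_tokenizer("ViT-B-32")

    image = preprocess(Image.open("cat.jpg")).unsqueeze(0)  # hypothetical file
    text = tokenizer(["a cat", "a dog"])

    with torch.no_grad():
        img_feat = model.encode_image(image)
        txt_feat = model.encode_text(text)
        img_feat /= img_feat.norm(dim=-1, keepdim=True)
        txt_feat /= txt_feat.norm(dim=-1, keepdim=True)
        probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)  # label probabilities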

ruff

An extremely fast Python linter and code formatter, written in Rust.

Language: Rust · License: MIT · Stars: 29945 · Issues: 0

awesome-ai-agents

A list of AI autonomous agents

License: NOASSERTION · Stars: 9248 · Issues: 0

surya

OCR, layout analysis, reading order, line detection in 90+ languages

Language: Python · License: GPL-3.0 · Stars: 9565 · Issues: 0

datamodel-code-generator

Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.

Language: Python · License: MIT · Stars: 2547 · Issues: 0
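
Besides the datamodel-codegen CLI, the package exposes a Python generate() API. A minimal sketch with a toy JSON Schema, following the project's "use as module" docs (treat the details as approximate):

    from pathlib import Path
    from datamodel_code_generator import InputFileType, generate

    json_schema = """{
      "type": "object",
      "properties": {"name": {"type": "string"}, "age": {"type": "integer"}}
    }"""

    out = Path("model.py")
    generate(
        json_schema,
        input_file_type=InputFileType.JsonSchema,
        input_filename="example.json",  # label recorded in the generated header
        output=out,
    )
    print(out.read_text())  # emits a Pydantic model for the schema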

ms-swift

Use PEFT or full-parameter training to fine-tune 300+ LLMs or 60+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Language: Python · License: Apache-2.0 · Stars: 2953 · Issues: 0

screen2words

The dataset includes screen summaries that describe the functionality of Android app screenshots. It is used for training and evaluating the screen2words models (our paper, accepted at UIST'21, will be linked soon).

Stars: 43 · Issues: 0

widget-caption

The dataset includes widget captions that describe UI elements' functionality. It is used for training and evaluating the widget captioning model (see the EMNLP'20 paper: https://arxiv.org/abs/2010.04295).

Stars: 16 · Issues: 0

taperception

This repository contains the datasets that were used for the research described in "Predicting and Explaining Mobile UI Tappability with Vision Modeling and Saliency Analysis" by Eldon Schoop, Xin Zhou, Gang Li, Zhourong Chen, Bjoern Hartmann and Yang Li, which is to appear in CHI 2022.

Stars: 5 · Issues: 0

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python · License: Apache-2.0 · Stars: 3136 · Issues: 0

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Language: Jupyter Notebook · License: NOASSERTION · Stars: 7347 · Issues: 0