EPIC Data Lab (ucbepic)

Organization data from GitHub: https://github.com/ucbepic

Effective Programming Interaction and Computation with Data

Location: United States of America

Home Page: https://epic.berkeley.edu

GitHub: @ucbepic

Twitter: @ucbepic

EPIC Data Lab's repositories

docetl

A system for agentic LLM-powered data processing and ETL

Language: Python | License: MIT | Stargazers: 3035 | Issues: 26 | Issues: 134

TWIX

TWIX is an open-source data extraction tool that reconstructs structured data from documents at scale, accurately and at low cost, by inferring the shared underlying visual template across documents

BARGAIN

Low-Cost LLM-Powered Data Processing with Theoretical Guarantees

Language: Python | License: MIT | Stargazers: 28 | Issues: 0 | Issues: 0

data-agent-benchmark-study

Welcoming contributions from practitioners building AI/data systems - share your real-world problems, document where current tools fail, and help improve the benchmark taxonomy across the enterprise data categories.

Language: Python | Stargazers: 12 | Issues: 0 | Issues: 0

pdf_parser

Parse PDFs using computer vision, layout analysis, and other state-of-the-art document intelligence techniques. Web app implemented in Flask/Jinja2, with inference and training pipelines managed by FlorDB.

Language: JavaScript | License: Apache-2.0 | Stargazers: 9 | Issues: 10 | Issues: 0

docetl-examples

Examples of docetl pipelines

Language: Python | Stargazers: 2 | Issues: 8 | Issues: 0

ml_tutorial

Introduction to FlorDB with PyTorch and TensorFlow

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 6 | Issues: 0