Derek Willis (dwillis)

Company: @openelections

Home Page: http://thescoop.org/

Twitter: @derekwillis

Organizations
unitedstates

Derek Willis's starred repositories

generative-ai-for-beginners

18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

Language: Jupyter Notebook | License: MIT | Stargazers: 56853 | Issues: 483 | Issues: 98

mlx

MLX: An array framework for Apple silicon

surya

OCR, layout analysis, reading order, line detection in 90+ languages

Language: Python | License: GPL-3.0 | Stargazers: 9241 | Issues: 79 | Issues: 100

llama-fs

A self-organizing file system with llama 3

Language: Jupyter Notebook | License: MIT | Stargazers: 4623 | Issues: 32 | Issues: 37

newscatcher

Programmatically collect normalized news from (almost) any website.

Language: Python | License: MIT | Stargazers: 2918 | Issues: 71 | Issues: 21

secret-llama

Fully private LLM chatbot that runs entirely in the browser with no server needed. Supports Mistral and Llama 3.

Language: TypeScript | License: Apache-2.0 | Stargazers: 2361 | Issues: 15 | Issues: 24

sparrow

Data processing with ML and LLM

Language: Python | License: GPL-3.0 | Stargazers: 2335 | Issues: 36 | Issues: 53

open-parse

Improved file parsing for LLMs

Language: Python | License: MIT | Stargazers: 2145 | Issues: 12 | Issues: 26

news-please

news-please - an integrated web crawler and information extractor for news that just works

Language: Python | License: Apache-2.0 | Stargazers: 1996 | Issues: 53 | Issues: 180

thepipe

Extract markdown and images from URLs, PDFs, docs, slides, and more, ready for multimodal LLMs. ⚡

Language: Python | License: MIT | Stargazers: 822 | Issues: 8 | Issues: 17

databonsai

clean & curate your data with LLMs.

Language: Python | License: MIT | Stargazers: 441 | Issues: 2 | Issues: 2

croissant

Croissant is a high-level format for machine learning datasets that brings together four rich layers.

Language: Python | License: Apache-2.0 | Stargazers: 355 | Issues: 22 | Issues: 217

warc-gpt

WARC + AI - Experimental Retrieval Augmented Generation Pipeline for Web Archive Collections.

Language: Python | License: MIT | Stargazers: 215 | Issues: 12 | Issues: 3

topfew

Finds the field values (or combinations of values) which appear most often in a stream of records.

Language: Go | License: GPL-3.0 | Stargazers: 180 | Issues: 5 | Issues: 15

flask-muck

🧹 Flask REST framework for generating CRUD APIs and OpenAPI specs in the SQLAlchemy, Marshmallow/Pydantic application stack.

Language: Python | License: MIT | Stargazers: 155 | Issues: 3 | Issues: 4

DatawRappr

R-Package to connect to the Datawrapper-API

Language: R | License: MIT | Stargazers: 84 | Issues: 4 | Issues: 53

censusdis

censusdis is a package for discovering, loading, and analyzing U.S. Census demographic data and for computing diversity, integration, and segregation metrics on it. It is designed to be intuitive and Pythonic, while giving users access to the full collection of data and maps the U.S. Census publishes via its APIs.

Language: Python | License: NOASSERTION | Stargazers: 53 | Issues: 3 | Issues: 65

textstem

Tools for fast text stemming & lemmatization

2024-openai-gpt-hiring-racial-discrimination

Data and materials to reproduce Bloomberg's investigation into racial and gender bias in OpenAI's GPT

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 37 | Issues: 1 | Issues: 0

whisper-audio-transcriber

Whisper Audio Transcriber: Streamlined tool for converting audio to text using the powerful Whisper ASR model. User-friendly and efficient.

Language: Jupyter Notebook | Stargazers: 16 | Issues: 1 | Issues: 0

congress

Access the Congress.gov API

Language: R | License: NOASSERTION | Stargazers: 11 | Issues: 2 | Issues: 18

course-materials

This is the course repository for the Spring 2024 iteration of MACS 30123 "Large-Scale Computing for the Social Sciences" at the University of Chicago.

Language: Jupyter Notebook | License: MIT | Stargazers: 9 | Issues: 0 | Issues: 0

plotting-county-election-results

🇺🇸🏁 Draw a beautiful county-level election results map with only a few lines of code

Language: R | Stargazers: 9 | Issues: 3 | Issues: 0

tulsa-1921-data

Data files associated with our story on the 1921 race massacre in Tulsa, Oklahoma.

License: NOASSERTION | Stargazers: 9 | Issues: 0 | Issues: 0

stats-notes

Notes for teaching statistics

Language: TeX | Stargazers: 6 | Issues: 2 | Issues: 0

data-institute-2023

Materials used to teach the 2023 Data Institute.

Stargazers: 4 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 3 | Issues: 0 | Issues: 0

cpsvote

R interface for the Current Population Survey (CPS) Voting and Registration Supplement

Language: HTML | License: NOASSERTION | Stargazers: 3 | Issues: 1 | Issues: 27
Language: Jupyter Notebook | Stargazers: 3 | Issues: 0 | Issues: 0

rpvnearme

Replication for rpvnearme.org, a project of the Election Law Clinic at Harvard Law School

Language: R | License: NOASSERTION | Stargazers: 2 | Issues: 0 | Issues: 0