scroobiustrip

Location: London, UK

Home Page: https://www.oliverhayman.com

scroobiustrip's starred repositories

uv

An extremely fast Python package installer and resolver, written in Rust.

Language: Rust | License: Apache-2.0 | Stargazers: 15730 | Issues: 35 | Issues: 2271

ml-engineering

Machine Learning Engineering Open Book

Language: Python | License: CC-BY-SA-4.0 | Stargazers: 10376 | Issues: 105 | Issues: 18

awesome-cold-showers

For when people get too hyped up about things

datatrove

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Language: Python | License: Apache-2.0 | Stargazers: 1833 | Issues: 43 | Issues: 106

fklearn

fklearn: Functional Machine Learning

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1497 | Issues: 101 | Issues: 51

GLiNER

Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024

Language: Python | License: Apache-2.0 | Stargazers: 1097 | Issues: 13 | Issues: 104

lingua-py

The most accurate natural language detection library for Python, suitable for short text and mixed-language text

Language: Python | License: Apache-2.0 | Stargazers: 1059 | Issues: 12 | Issues: 77

clean-text

🧹 Python package for text cleaning

Language: Python | License: NOASSERTION | Stargazers: 941 | Issues: 14 | Issues: 29

text-clustering

Easily embed, cluster and semantically label text datasets

Language: Python | License: Apache-2.0 | Stargazers: 405 | Issues: 33 | Issues: 5

model

The Clay Foundation Model (in development)

Language: Python | License: Apache-2.0 | Stargazers: 279 | Issues: 20 | Issues: 120

looper

A resource list for causality in statistics, data science and physics

pycon2024

Tutorial Materials for "The Fundamentals of Modern Deep Learning with PyTorch" workshop at PyCon 2024

Language: Jupyter Notebook | License: MIT | Stargazers: 220 | Issues: 11 | Issues: 0

chars2vec

Character-based word embeddings model based on RNN for handling real world texts

Language: Python | License: Apache-2.0 | Stargazers: 172 | Issues: 5 | Issues: 12

sage

SAGE: Spelling correction, corruption and evaluation for multiple languages

Language: Jupyter Notebook | License: MIT | Stargazers: 120 | Issues: 9 | Issues: 1

ding_causalInference_python

Python implementation of Peng Ding's "First Course in Causal Inference"

Language: Jupyter Notebook | Stargazers: 112 | Issues: 2 | Issues: 0

python-Levenshtein

The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity

spelling

This is a neural spell checker

Backfill-GA4-to-BigQuery

The "Backfill-GA4-to-BigQuery" repository offers a way for users to backfill their GA4 data into BigQuery. This is useful for those who need historical data from the start of their GA4 property, as GA4 data is typically only available in BigQuery after the two services are linked. The solution provides a complete backfill of data to BigQuery.

feedx

Transparent, robust and trustworthy A/B experimentation for Shopping feeds.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 33 | Issues: 4 | Issues: 1

DBT-GA4

A DBT example project demonstrating data modelling transformations for the standard-format Google Analytics 4 BigQuery Export

Language: Dockerfile | License: MIT | Stargazers: 27 | Issues: 0 | Issues: 0

prob-epi

Course materials for "Bayesian Modelling and Probabilistic Programming" with NumPyro, initially created for the "AI for Science" MSc at the African Institute for Mathematical Sciences (AIMS)

Language: Jupyter Notebook | License: MIT | Stargazers: 25 | Issues: 3 | Issues: 2

causy

Causal discovery made easy.

Language: Python | License: MIT | Stargazers: 21 | Issues: 3 | Issues: 21

stancon2023

Materials for StanCon 2023

Language: Jupyter Notebook | Stargazers: 21 | Issues: 11 | Issues: 2

Language: Python | License: MIT | Stargazers: 12 | Issues: 0 | Issues: 0

intro-to-ml-with-time-series-workshop-2023

Introduction to Machine Learning with Time Series workshop

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 9 | Issues: 0 | Issues: 0

textpp-ptbr

Common Text Pre-Processing for Portuguese

Language: Python | License: MIT | Stargazers: 6 | Issues: 0 | Issues: 0

numpyro-sts

Extends numpyro with distributions for structural timeseries

Language: Python | Stargazers: 3 | Issues: 0 | Issues: 0

m6_slides

My M6 presentation slides

Stargazers: 2 | Issues: 0 | Issues: 0

convpybayes

A Bayesian version of the markovianhq/convpy library for lagged conversion rate estimation.

Language: Python | Stargazers: 2 | Issues: 1 | Issues: 0

fast_edit_distance

A quick implementation of edit distance with improved runtime.

Language: C | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0