There are 15 repositories under the pydata topic.
Koalas: pandas API on Apache Spark
STUMPY is a powerful and scalable Python library for modern time series analysis
Extract data from a wide range of Internet sources into a pandas DataFrame.
A distributed task scheduler for Dask
Clean APIs for data cleaning. Python implementation of R package Janitor
A clean, three-column Sphinx theme with Bootstrap for the PyData community
PyData, The Complete Works of
A consistent table management library in Python
Resources for Advancing into Analytics: From Excel to R and Python by George Mount (O'Reilly Media, 2021)
Notebooks for the Seattle PyData 2017 talk on Scattertext
Social network analysis code examples for PyCon 2019 talk
Machine learning with scikit-learn tutorial at PyData Chicago 2016
Introduction to Machine Learning with Time Series at PyData Festival Amsterdam 2020
Python library for GraphBLAS: high-performance sparse linear algebra for scalable graph analytics
Repo for my talk at the PyData Berlin 2017 conference
Graph algorithms written in GraphBLAS
Introduction to sktime at PyData Global 2021
Data and tooling to compare the API surfaces of various array libraries.
WORK UNDER RESTRUCTURING
A `select` accessor for easier subsetting of pandas DataFrames and Series
Accompanying notebook and sources to "A Guide to Pseudolabelling: How to get a Kaggle medal with only one model" (Dec. 2020 PyData Boston-Cambridge Keynote)
This is the code and presentation for my PyData2017 talk "Reverse Image Search Using Out-of-the-box Machine Learning Libraries"
Slides and notebooks for my tutorial at PyData London 2018
Material for working alongside my workshop session at PyData Berlin 2018
Speaker slides from monthly meetups and conferences
PyData Global Workshop: Jupyter Notebooks in VS Code
@matthewbrems and I presented "Recreating, Understanding, and Visualizing FiveThirtyEight's Elections Forecast" at PyData DC 2018
Scrapy Project Sample for Baseball (NPB)