PyData Berlin 2016 Materials

Keynotes

Olivier Grisel, Evolution of the PyData ecosystem

Julia Evans, How to trick a neural network

http://jvns.ca/blog/2016/05/21/a-few-notes-from-my-pydata-berlin-keynote/

Wes McKinney, Python Data Ecosystem: Thoughts on Building for the Future

http://de.slideshare.net/wesm/python-data-ecosystem-thoughts-on-building-for-the-future

Regular

Daniel Kirsch, Functional Programming in Python

https://github.com/kirel/functional-python

Trent McConaghy, BigchainDB: a Scalable Blockchain Database, in Python

https://github.com/bigchaindb/bigchaindb

David Higgins, Introduction to Julia for Python programmers

https://github.com/daveh19/pydataberlin2016

Katharina Rasch, What every Data Scientist should know about data anonymization

https://github.com/krasch/presentations/blob/master/pydata_Berlin_2016.pdf

Alexander Sibiryakov, Frontera: open source, large scale web crawling framework

https://github.com/scrapinghub/frontera

Thomas Reineking, Plumbing in Python: Pipelines for Data Science Applications

Yamal: not yet open-sourced

Ryan Henderson, image-match: a Python library for searching for similar images in large corpora

https://github.com/ascribe/image-match

Ian Ozsvald, Statistically Solving Sneezes and Sniffles (a work in progress)

Felix Biessmann, Predicting Political Views From Text

https://github.com/felixbiessmann/

Jie Bao, ExpAn - A Python Library for A/B Testing Analysis

Anne Matthies, Zero-Administration Data Pipelines using AWS Simple Workflow

https://github.com/babbel/floto

Daniel Moisset, Bridging the gap: from Data Science to service

https://github.com/machinalis/slides/tree/master/data-science-to-service

Katharine Jarmul, Holy D@t*! How to Deal with Imperfect, Unclean Datasets

https://docs.google.com/presentation/d/1G-lgHKTdrqeeJhcvVmd7C9gOIfTRe429zhBN6lmKKzA/

Nora Neumann, Usable A/B testing – A Bayesian approach

https://speakerdeck.com/nneu/b-testing-a-bayesian-approach

Frank Kaufer, Building a polyglot Data Science Platform on Big Data systems

Anton Dubrau, Using small data in the client instead of big data in the cloud

Nils Magnus, Dealing with TBytes of Data in Realtime

Abhishek Thakur, Classifying Search Queries without User Click Data

Nathan Epstein, Machine Learning at Scale

Angelos Kapsimanis, The Simple Leads To The Spectacular

Edouard Fouché, Accelerating Python Analytics by In-Database Processing

Jessica Palmer, Python and TouchDesigner for Interactive Experiments

Maciej Gryka, Removing Soft Shadows with Hard Data

Andreas Lattner, Setting up predictive analytics services with Palladium

Martina Pugliese, Spotting trends and tailoring recommendations: PySpark on Big Data in fashion

Andrej Warkentin, Visualizing FragDenStaat.de

James Powell, The kwarg problem

Moritz Neeb, Bayesian Optimization and its application to Neural Networks

Kashif Rasul, What's new in Deep Learning?

Jakob van Santen, The IceCube data pipeline from the South Pole to publication

Matthew Honnibal, Designing spaCy: A high-performance natural language processing (NLP) library written in Cython

Valentine Gogichashvili, Data Integration in the World of Microservices

Michelle Tran, Chain, Loop & Group: How Celery Empowered our Data Scientists to Take Control of our Data Pipeline

Idai Guertel, Artificial Body Representation in Robots, Expectation and Surprise

Robert Meyer, pypet: A Python Toolkit for Simulations and Numerical Experiments

Ronert Obst and Dat Tran, PySpark in Practice

Juha Suomalainen, Visualizing research data: Challenges of combining different datasources

Danny Bickson, Python based predictive analytics with GraphLab Create

Jose Quesada, A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and cons

Fang Xu, Connecting Keywords to Knowledge Base Using Search Keywords and Wikidata

Delia Rusu, Estimating stock price correlations using Wikipedia

Dr. Markus Abel, Python Learns to Control Complex Systems

Tutorials

Frank Gerhardt, Using Spark - with PySpark

https://gitlab.com/gerhardt.io/pyspark-workshop

Lukasz Czarnecki, Brand recognition in real-life photos using deep learning

http://de.slideshare.net/ukaszCzarnecki/brand-recognition-in-reallife-photos-using-deep-learning-lukasz-czarnecki-pydata-berlin-2016/

Lev Konstantinovskiy, Practical Word2vec in Gensim

Shoaib Burq, Which city is the cultural capital of Europe? An introduction to Apache PySpark for GeoAnalytics

Lightning Talks

Oliver Zeigermann

https://djcordhose.github.io/big-data-visualization/2016_pydata_berlin_lightning.html#/

Piotr Migdał, Teaching machine learning

Mentioned tools:

PyBuilder: Tired of writing setup.py? http://pybuilder.github.io/
Sputnik: Package manager for data https://github.com/spacy-io/sputnik

About

A collection of pointers to slides and repositories from speakers at PyData Berlin 2016.