tddg / ds5110-spring23

DS5110: Big Data Systems

Welcome to the graduate course on Big Data Systems. Scalable big data systems are a central part of modern data science. This course covers the design and use of parallel dataflow systems (MapReduce/Hadoop and Spark), scalable and parallel Python analytics frameworks, cloud data systems (cloud storage and cloud-native data processing), and machine learning systems. A major component of the course is hands-on programming with scalable analytics tools and cloud resources such as Google Cloud and Microsoft Azure.

About

License: MIT License


Languages

Jupyter Notebook 45.5%, HTML 28.2%, SCSS 20.1%, PHP 3.0%, Liquid 2.3%, Ruby 0.9%