tddg / ds5110-cs5501-spring24

DS5110/CS5501: Big Data Systems

Welcome to the graduate course on Big Data Systems. Scalable big data systems are a central part of modern data science. This course covers the design and use of parallel dataflow systems (MapReduce/Hadoop and Spark), scalable and parallel Python analytics frameworks, cloud data systems (cloud storage and cloud-native data processing), and machine learning systems. A major component of the course is hands-on programming with scalable analytics tools and cloud resources such as Amazon Web Services and Google Cloud.

About

License: MIT License


Languages

Jupyter Notebook 95.5%
HTML 1.8%
SCSS 1.3%
Python 1.1%
PHP 0.2%
Liquid 0.1%
Ruby 0.1%