wangzhanxd / pyspark-algorithms

PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2

Source Code for PySpark Algorithms Book

Unlock the power of big data with the PySpark Algorithms book


PySpark Algorithms Book:

Author: Mahmoud Parsian (mahmoud.parsian@yahoo.com)

Publication date: August 2019


About PySpark Algorithms Book

  • This book is about PySpark (Python API for Spark)
  • Introductory book on how to solve data problems using PySpark
  • Learn how to use mappers, filters, and reducers
  • Learn how to partition data for fast queries
  • Learn how to use the mapPartitions() transformation
  • Learn how to use reduceByKey(), groupByKey(), and combineByKey() transformations
  • Learn how to use Spark's transformations and actions for solving real problems
  • Learn how to use RDDs and DataFrames
  • Learn how to read/write data from many data sources
  • Learn how to use logistic regression
  • Learn how to use Spark's reduction transformations
  • Learn how to use GraphFrames
  • Learn how to use Motifs in GraphFrames
  • Learn how to use Monoids in MapReduce algorithms
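
The monoid item above can be illustrated without a Spark cluster. The following is a minimal plain-Python sketch, not code from the book: it assumes the data is already split into partitions (the `partitions` list, the helper names `to_pair`, `combine`, and `mean_by_key`, and the sample values are all hypothetical). It shows why a per-key mean is usually computed by reducing `(sum, count)` pairs, which combine associatively with an identity element, rather than by averaging partial averages. In PySpark, a function shaped like `combine` is the kind of two-argument function one would pass to `reduceByKey()`.

```python
from collections import defaultdict

# Identity element of the (sum, count) monoid: combining with it changes nothing.
IDENTITY = (0, 0)


def to_pair(value):
    """Map a raw value into the monoid as (sum, count)."""
    return (value, 1)


def combine(a, b):
    """Associative combine step: add sums and add counts."""
    return (a[0] + b[0], a[1] + b[1])


def mean_by_key(partitions):
    """Compute the per-key mean by reducing (sum, count) pairs.

    Each partition is reduced locally first, then the partial results are
    merged. Because combine() is associative, the grouping does not matter.
    """
    merged = defaultdict(lambda: IDENTITY)
    for partition in partitions:
        local = defaultdict(lambda: IDENTITY)
        for key, value in partition:
            local[key] = combine(local[key], to_pair(value))
        for key, pair in local.items():
            merged[key] = combine(merged[key], pair)
    return {key: total / count for key, (total, count) in merged.items()}


# Hypothetical sample data, split into two partitions.
partitions = [
    [("a", 1), ("a", 2), ("b", 10)],
    [("a", 6), ("b", 20)],
]

result = mean_by_key(partitions)

# Averaging the per-partition averages for key "a" gives a different
# (wrong) answer, because the mean of means is not a monoid operation.
naive_a = ((1 + 2) / 2 + 6 / 1) / 2
```

Here `result` holds the correct means per key, while `naive_a` differs from `result["a"]` because the two partitions contribute different numbers of values for key `"a"`.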


Software


Table of Contents

chap01: Introduction to PySpark
chap02: Hello World
chap03: Data Abstractions
chap04: Getting Started -- Sample Chapter
chap05: Transformations in Spark
chap06: Reductions in Spark
chap07: DataFrames and SQL
chap08: Spark DataSources
chap09: Logistic Regression
chap10: Movie Recommendations
chap11: Graph Algorithms
chap12: Design Patterns and Monoids

Appendix A: How To Install Spark
Appendix B: How to Use Lambda Expressions
Appendix C: Questions And Answers (50+ QA)


Future chapters:

chap13: FP-Growth
chap14: LDA
chap15: Linear Regression


License: Other


Languages

Python 85.0%, Shell 15.0%