Learning PySpark
Books
d) PySpark Cookbook
Certification
Databricks Certified Developer
Note: According to Databricks, this certification will no longer be available after 31 Oct 2019.
Databricks Certified Associate
According to the Databricks portal, this certification is coming soon.
Git repositories, books
Spark Internals by Jerry Lead
gitbook by Jacek Laskowski
Another gitbook
Advice on certification
Tutorials by Mahmoud Parsian
Talks by Daniel Abadi
RDD
See RDD notes
See A primer on Lambda
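As a quick illustration of why lambdas matter for RDD work, here is a minimal sketch in plain Python (no Spark required). The names `numbers` and `square` are hypothetical, and the built-in `map`, `filter`, and `functools.reduce` only stand in for the RDD methods of the same names, which take functions as arguments in a similar way.

```python
from functools import reduce

# Hypothetical sample data; in Spark this would be the contents of an RDD.
numbers = [1, 2, 3, 4, 5]

# A lambda is an anonymous, single-expression function.
square = lambda x: x * x

# Apply a function to every element (analogous to an RDD map).
squares = list(map(square, numbers))

# Keep only the elements that satisfy a predicate (analogous to an RDD filter).
evens = list(filter(lambda x: x % 2 == 0, numbers))

# Combine all elements into a single value (analogous to an RDD reduce).
total = reduce(lambda a, b: a + b, numbers)

print(squares, evens, total)
```

Unlike these built-ins, RDD transformations such as map and filter are lazy and only run when an action such as reduce or collect is called.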
Dataframes
See Dataframe notes
Spark Internals, architecture, tuning
See architecture
Spark SQL
See spark-sql
Spark Streaming
See spark-streaming
GraphX
Machine Learning
Machine Learning - Feature Engineering
Scala
Python
Other resources
Sequence file
HDFS
External Spark packages
Blogs: http://blog.madhukaraphatak.com/
http://www.cs.sfu.ca/CourseCentral/732/ggbaker/content/spark.html
https://console.bluemix.net/docs/services/AnalyticsforApacheSpark/using_spark-submit.html#running-a-spark-application-using-the-spark-submit-sh-script
https://developer.ibm.com/clouddataservices/docs/analytics-engine/get-started/
Questions/Comments
Please email me at: kanchan.tewary@gmail.com