Repositories under the sparkr topic.
A curated list of awesome Apache Spark packages and resources.
Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. :zap:
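A single-node setup like the one this repo describes can be sketched with a short compose file. This is illustrative only and assumes the community jupyter/all-spark-notebook image (which bundles Spark with PySpark, SparkR, and Scala kernels), not necessarily the image the repo itself uses:

```yaml
# Illustrative sketch: JupyterLab + Spark in one container (not the repo's own config)
services:
  spark-lab:
    image: jupyter/all-spark-notebook   # bundles Spark, PySpark, SparkR, Scala
    ports:
      - "8888:8888"                     # JupyterLab UI
      - "4040:4040"                     # Spark application UI
    environment:
      - JUPYTER_ENABLE_LAB=yes          # start JupyterLab instead of the classic notebook
```

Running `docker compose up` then exposes JupyterLab on port 8888 with Spark kernels preinstalled.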
R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Real-world Spark pipelines examples
Azure Databricks - Advent of 2020 Blogposts
SparkR workshop for the Jornadas de Usuarios de R (Spanish R Users Conference)
Analyzing the safety (311) dataset published by Azure Open Datasets for Chicago, Boston, and New York City using SparkR, SparkSQL, and Azure Databricks, with visualization via ggplot2 and leaflet. The focus is on descriptive analytics, visualization, clustering, time series forecasting, and anomaly detection.
Practice and workshop materials on Big Data and cloud computing using Docker containers and OpenNebula: HDFS, Hadoop, and Spark + R
Slides and lab material for the talk "R for HPC and Big Data" at http://rsummer.data-analysis.at
Fit a Cubist regression model on StackOverflow data and make predictions in a distributed manner with SparkR
Docker images for testing SparkR builds
A curated list of essential cheatsheets for data analysis, visualization and machine learning using R or Python
Big Data workshop with Apache Spark + R from the Databricks cloud
R workloads running at scale on Google Cloud
Course material for the "Encounters with Big Data" course delivered by the UK Data Service at the 2017 Big Data and Analytics Summer School.
This repository contains intermediate-level code for cleaning, exploratory analysis, handling missing data points, outlier detection, and various visualization techniques using the graphics, ggplot2, tidycharts, and ggExtra packages. One part of the script also introduces the SparkR package, which provides a light-weight frontend for using Apache Spark from R. Don't be shy to fork and contribute.
Multiple-Node Standalone Spark with R and Python
BI and Big Data analytics with SparkR, using supervised and unsupervised machine learning techniques. The project's aim is to apply a supervised and an unsupervised machine learning technique to a dataset, test different models/scenarios, interpret the results, make predictions for each model, and visualise the results.
A demonstration of using Spark to explore large datasets with PySpark and SparkR. The files cover loading data, data exploration, and clustering the words of Shakespeare's works.
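The repo's pipeline runs on PySpark/SparkR, but the word-frequency step such an exploration typically starts from can be sketched in plain Python. The function name and sample text below are illustrative, not taken from the repo:

```python
from collections import Counter

def word_frequencies(lines):
    """flatMap-style tokenization followed by a reduceByKey-style count,
    mirroring the classic Spark word-count pipeline in plain Python."""
    counts = Counter()
    for line in lines:
        counts.update(
            word.lower().strip(".,;:!?'\"") for word in line.split() if word
        )
    return counts

# Tiny illustrative corpus
sample = ["To be, or not to be:", "that is the question."]
freq = word_frequencies(sample)
print(freq["to"], freq["be"])  # 2 2
```

In Spark the same shape becomes `flatMap` over lines, `map` to `(word, 1)` pairs, and `reduceByKey` to sum counts across the cluster.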
Data analysis and Model building on large datasets using Hive and Spark
Extra docker images from rocker/tidyverse