spark-streaming

There are 35 repositories under the spark-streaming topic.

Angel-ML / angel
A Flexible and Powerful Parameter Server for large-scale machine learning
machine-learning parameter-server spark scala model high-dimensional online-learning spark-streaming
Language:Java 6780
lw-lin / CoolplaySpark
Coolplay Spark: Spark source code analysis, Spark libraries, and more
apache-spark spark spark-streaming sparkcore structured-streaming
Language:Scala 3487
CodeRayZhang / Movie_Recommend
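A Spark-based movie recommendation system, comprising a web crawler project, a website, an admin back-end system, and the Spark recommendation system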
hadoop hive mysql nginx scala scrapy spark-mllib spark-streaming ssm-maven
Language:Java 2977
dotnet / spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
spark csharp dotnet analytics bigdata spark-streaming spark-sql machine-learning fsharp dotnet-core dotnet-standard streaming apache-spark tpcds tpch azure hdinsight databricks emr microsoft
Language:C# 2085
jacksu / utils4s
A collection of test cases and related reference material gathered while using Scala and Spark
spark scala-demo scala-spark json4s breeze scala akka spark-streaming
Language:Scala 1087
edp963 / wormhole
Wormhole is a SPaaS (Stream Processing as a Service) Platform
wormhole stream-processing spark-streaming
Language:JavaScript 979
microsoft / Mobius
C# and F# language binding and extensions to Apache Spark
spark apache-spark rdd dataframe dstream dataset streaming csharp mobius kafka-streaming spark-streaming fsharp bigdata mapreduce eventhubs near-real-time
Language:C# 940
cdapio / cdap
An open source framework for building data analytic applications.
cdap dataset integration java java-8 mapreduce middleware platform python spark spark-streaming unified
Language:Java 783
lw-lin / streaming-readings
Papers and readings related to streaming systems
stream-processing streaming flink spark-streaming storm heron dataflow drizzle millwheel s4 apache-spark streaming-engine spe stream-processing-engine
735
Stratio / sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
streaming-data scala stratio spark streaming spark-streaming olap kafka hdfs workflow sparta analytics real-time sparksql stratio-sparta lambda triggers acctbl-compae
Language:Scala 528
spirom / LearningSpark
Scala examples for learning to use Spark
scala spark spark-streaming sparkcore sparksql
Language:Scala 445
databrickslabs / dbldatagen
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines
data-generation databricks datagen datageneration datagenerator delta-live-tables deltalake faker pyspark python spark spark-streaming synthetic-data
Language:Python 432
harbby / sylph
Stream computing platform for bigdata
java sylph flink spark-streaming big-data sql streamsql
Language:Java 407
databrickslabs / dqx
Databricks framework to validate Data Quality of pySpark DataFrames
data-profiling data-quality data-quality-checks data-quality-monitoring databricks dlt spark spark-streaming
Language:Python 331
pathwaycom / pathway-benchmarks
Benchmarks for data processing systems: Pathway, Spark, Flink, Kafka Streams
benchmark-framework flink latency pathway spark-streaming streaming streaming-data kafka-streams pagerank wordcount
Language:Python 309
microsoft / data-accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
spark spark-streaming spark-sql sparksql streaming-data streaming servicefabric nodejs docker hdinsight cosmosdb react azure apache-spark iothub eventhub big-data iot kafka kafka-streams
Language:C# 306
paypal / gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
aerospike big-data cassandra data-api elasticsearch gimel hbase jdbc kafka paypal pyspark python restapi scala spark spark-streaming streaming-sql teradata
Language:Scala 246
Azure / azure-event-hubs-spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
spark spark-streaming azure scala eventhubs real-time streaming continuous apache apache-spark microsoft event-hubs connector databricks stream structured-streaming bigdata ingestion kafka
Language:Scala 238
mkuthan / example-spark
Spark, Spark Streaming and Spark SQL unit testing strategies
spark spark-streaming testing
Language:Scala 216
Chabane / bigdata-playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
docker spark-sql scala kafka hbase parquet avro nodejs angular graphql mongodb machine-learning big-data hadoop apache-spark apache-flink spark-streaming twitter-api python kops
Language:TypeScript 209
spirom / spark-streaming-with-kafka
Self-contained examples of Apache Spark streaming integrated with Apache Kafka.
kafka scala spark spark-streaming
Language:Scala 199
bluishglc / bdp
A prototype big data platform project; the source code for the book Big Data Platform Architecture and Prototype
bigdata prototype quickstart spark spark-streaming spark-sql demo oozie redis kafka spark-demo spark-streaming-examples sqoop sqoop-import sparksql spark-examples middle-end middle-office
Language:Java 198
hortonworks / streamline
StreamLine - Streaming Analytics
streaming real-time storm spark-streaming kafka kafka-streams flink
Language:Java 165
awantik / pyspark-learning
Updated repository
pyspark spark spark-streaming
Language:Jupyter Notebook 157
P7h / Spark-MLlib-Twitter-Sentiment-Analysis
Analyze and visualize Twitter Sentiment on a world map using Spark MLlib
spark spark-streaming spark-mllib scala twitter-sentiment-analysis visualization machine-learning naive-bayes-classification
Language:Scala 140
qubole / kinesis-sql
Kinesis Connector for Structured Streaming
spark structured-streaming kinesis real-time-processing spark-structured-streaming spark-streaming
Language:Scala 137
mkuthan / example-spark-kafka
Apache Spark and Apache Kafka integration example
spark spark-streaming kafka
Language:Scala 124
LearningJournal / Spark-Streaming-In-Python
Apache Spark 3 - Structured Streaming Course Material
apache-spark big-data bigdata data-lake pyspark python spark-sql spark-streaming
Language:Python 123
wangj1106 / recommendMoteur
Movie recommendation system and movie recommendation engine, built with Spark
recommendation-engine recommender-system recommendation movies als spark spark-streaming spark-sql flume kafka
Language:Scala 117
abulbasar / pyspark-examples
Code examples on Apache Spark using python
apache spark spark-streaming python
Language:Jupyter Notebook 108
martandsingh / ApacheSpark
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies.
apachespark data-analysis data-engineering database databricks datalake deltalake etl-pipeline hadoop hive spark spark-sql spark-streaming timetravel etl pyspark sql
Language:Python 103
snowch / movie-recommender-demo
This project walks through how you can create recommendations using Apache Spark machine learning. There are a number of Jupyter notebooks that you can run on IBM Data Science Experience, and there is a live demo of a movie recommendation web application you can interact with. The demo also uses IBM Message Hub (Kafka) to push application events to a topic, where they are consumed by a Spark Streaming job running on IBM BigInsights (Hadoop).
cloudant python-flask-application spark spark-streaming hadoop kafka messagehub dsx notebook machine-learning alternating-least-squares bluemix redis collaborative-filtering biginsights ibm-bluemix bokeh hive ibm-biginsights jupyter-notebook
Language:Jupyter Notebook 98
huangyueranbbc / Spark_ALS
Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming
spark spark-mllib als spark-streaming spark-streaming-als
Language:Java 96
chermenin / spark-states
Custom state store providers for Apache Spark
spark spark-streaming spark-structured-streaming structured-streaming state-store state stateful apache apache-spark
Language:Scala 92
huangyueranbbc / SparkDemo
Complete Spark example code (Java, Scala)
spark hadoop bigdata operator sparkjava spark-sql spark-streaming sparkfun-products sparkline sparkp
Language:Java 85
jmcmt87 / spark_app_twitter
A data engineering project (Twitter monitor app)
kafka mongodb pyspark s3 spark-streaming altair pandas python
Language:Python 85
