MohammadRuhulAmin / Apache-Kafka

Apache Kafka: a distributed message streaming platform

Apache Kafka

What is Apache Kafka?
=> Apache Kafka is a distributed message streaming platform that uses a publish-subscribe mechanism to stream records. It was originally developed at LinkedIn and later donated to the Apache Software Foundation. It is open source and is used by many large tech companies such as LinkedIn, Walmart, Netflix, Uber, and Airbnb.


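What is a stream?
=> A stream is a continuous flow of data.


What is a record?
=> In Kafka, a record is a single unit of data (a message). A record typically carries a key, a value, and a timestamp.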


What is a centralized database?
=> A database where all the data is stored in one place. If that database fails, all the data can become unavailable or be lost.


What is a distributed database?
=> Instead of putting everything in one database, the data is stored in multiple locations. If one location fails, the data in the other locations remains safe.

It has two types:
1. Replication: the full entity is copied to multiple locations.
2. Fragmentation: the entity is broken into parts, and each part is stored in a different location. The data can be divided equally or randomly.
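
The two types can be sketched with a toy model. This is only an illustration: each "location" is a plain Python list, and the names `replicate` and `fragment` are made up for this example.

```python
# Toy model: each "location" is a Python list standing in for a database node.
records = ["r1", "r2", "r3", "r4", "r5", "r6"]
locations = 3


def replicate(records, locations):
    """Replication: every location stores a full copy of the data."""
    return [list(records) for _ in range(locations)]


def fragment(records, locations):
    """Fragmentation: each record is stored in exactly one location (round-robin here)."""
    nodes = [[] for _ in range(locations)]
    for i, record in enumerate(records):
        nodes[i % locations].append(record)
    return nodes


replicated = replicate(records, locations)
fragmented = fragment(records, locations)

print(replicated[0])  # ['r1', 'r2', 'r3', 'r4', 'r5', 'r6']
print(fragmented)     # [['r1', 'r4'], ['r2', 'r5'], ['r3', 'r6']]
```

With replication, losing one location loses nothing. With fragmentation, losing one location loses only the part stored there.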


What is a messaging system?
=> A messaging system is responsible for transferring data from one application to another, so the applications can focus on the data without getting bogged down in how it is transmitted and shared. In a messaging system, the terms record, data, and message mean the same thing.

It has two types:
a. Point-to-point messaging system
* Messages are persisted in a queue.
* A particular message can be consumed by at most one receiver.
* There is no time dependency for the receiver to receive the message.
* When the receiver receives the message, it sends an acknowledgement back to the sender.
b. Publish-subscribe messaging system
* Messages are persisted in a topic.
* A particular message can be consumed by any number of consumers.
* There is a time dependency for the consumer to consume the message.
* When the subscriber receives the message, it does not send an acknowledgement to the publisher.
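
The difference between the two types can be sketched in a few lines. This is a simplified in-memory model, not a real messaging API: `PointToPointQueue` and `PubSubTopic` are hypothetical names, and acknowledgements and time limits are left out.

```python
from collections import deque


class PointToPointQueue:
    """Each message is delivered to at most one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Popping removes the message, so no other receiver can get it.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Messages stay in the topic; each subscriber keeps its own read position."""

    def __init__(self):
        self._log = []
        self._positions = {}

    def publish(self, message):
        self._log.append(message)

    def poll(self, subscriber):
        # Return everything this subscriber has not read yet.
        position = self._positions.get(subscriber, 0)
        unread = self._log[position:]
        self._positions[subscriber] = len(self._log)
        return unread


queue = PointToPointQueue()
queue.send("order-1")
first = queue.receive()   # 'order-1'
second = queue.receive()  # None, the message is already gone

topic = PubSubTopic()
topic.publish("order-1")
seen_by_a = topic.poll("a")  # ['order-1']
seen_by_b = topic.poll("b")  # ['order-1'], the same message again
```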


What is a topic?
=> A topic is like a queue with some additional features. Messages are not deleted from a topic when they are consumed; subscribers can read them whenever they need to, within a limited retention period. Kafka is a publish-subscribe messaging system.
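
A rough sketch of the "limited time interval" idea, assuming a made-up `RetainedTopic` class and explicit numeric timestamps so the result is predictable. In real Kafka the retention period is a topic setting (for example `retention.ms`); this toy only mimics the behavior.

```python
class RetainedTopic:
    """Toy topic that keeps messages until they are older than retention_seconds."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._log = []  # list of (timestamp, message)

    def publish(self, timestamp, message):
        self._log.append((timestamp, message))

    def read(self, now):
        # Reading does not delete anything; only age removes messages.
        self._log = [(t, m) for t, m in self._log
                     if now - t <= self.retention_seconds]
        return [m for _, m in self._log]


topic = RetainedTopic(retention_seconds=60)
topic.publish(0, "a")
topic.publish(50, "b")

at_55 = topic.read(now=55)    # ['a', 'b']
again = topic.read(now=55)    # ['a', 'b'], reading twice is fine
at_100 = topic.read(now=100)  # ['b'], 'a' is older than 60 seconds
```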

A stream of messages belonging to a particular category is called a topic. Like a table in a database, a topic is uniquely identified by its name. We can create as many topics as we want.
A topic has two parts:

  1. Partition
    (The messages of a topic are split into partitions. Each partition is ordered and immutable, and each message in a partition has a unique ID called its offset.) Producer applications write messages to partitions, and consumers read messages from partitions.
  2. Replication
    (A replica is a backup copy of a partition.) Producer applications do not write messages to these backup replicas, and consumers do not read from them by default; a backup replica only copies the data of the partition it backs up.
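
Partitions and offsets can be sketched as follows. This is a minimal in-memory model under stated assumptions: `PartitionedTopic` is a hypothetical class, and CRC32 is used only as a stand-in for choosing a partition from a key (Kafka's own default partitioner uses a different hash).

```python
import zlib


class PartitionedTopic:
    """Toy topic: a list of append-only partitions, each with its own offsets."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        """Append to the partition chosen from the key; return (partition, offset)."""
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        log = self.partitions[partition]
        log.append(value)
        return partition, len(log) - 1

    def consume(self, partition, offset):
        """Read all messages in one partition starting at the given offset."""
        return self.partitions[partition][offset:]


topic = PartitionedTopic("orders", num_partitions=3)
p1, o1 = topic.produce("customer-42", "created")
p2, o2 = topic.produce("customer-42", "paid")

# The same key lands in the same partition, and offsets grow by one per message.
print(p1 == p2, o1, o2)      # True 0 1
print(topic.consume(p1, 1))  # ['paid']
```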

What is a broker?
=> A broker is a software process that manages topics. It keeps track of which consumer application has read how many messages from which partition of which topic. A broker is also known as a Kafka server.

What is a Kafka cluster?
=> A set of brokers that communicate with each other to perform management and maintenance tasks is collectively known as a Kafka cluster.


Kafka Architecture

Describe the Kafka architecture
=> A Kafka cluster is made up of brokers. Each broker hosts topics, and each topic is split into partitions. The Kafka cluster is managed by a ZooKeeper cluster, which consists of ZooKeeper nodes.

There are producer applications. Some producers produce messages at the topic level, and others produce
messages at the partition level.

There are also consumer applications. They consume messages at the topic level or the partition level.
Consumers are organized into consumer groups: each consumer application belongs to a consumer group.
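
One way to picture a consumer group is as a set of consumers that split the partitions of a topic among themselves, so each partition is read by exactly one consumer in the group. The sketch below assumes a simple round-robin split and a made-up helper `assign_partitions`; Kafka's real assignment strategies are more involved.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer in the group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        consumer = consumers[i % len(consumers)]
        assignment[consumer].append(partition)
    return assignment


group = assign_partitions(partitions=[0, 1, 2, 3], consumers=["c1", "c2", "c3"])
print(group)  # {'c1': [0, 3], 'c2': [1], 'c3': [2]}
```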


Describe ZooKeeper
=> The ZooKeeper cluster manages the Kafka cluster and coordinates with each broker (as mentioned above, a broker is a software process that manages topics). ZooKeeper keeps all the metadata as key-value pairs. The metadata includes:
1. Configuration information
2. Health status of each broker

A set of ZooKeeper nodes working together to manage other distributed systems is known as a ZooKeeper cluster or "ZooKeeper ensemble".
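
The key-value metadata idea can be pictured with a plain dictionary. This is only a sketch: the keys are loosely modeled on ZooKeeper's path-style keys, the hosts and values are invented for illustration, and `live_brokers` is a hypothetical helper, not a ZooKeeper or Kafka call.

```python
# Toy metadata store: path-like keys mapped to values (all values are made up).
metadata = {
    "/brokers/ids/0": {"host": "broker-0.example", "port": 9092},
    "/brokers/ids/1": {"host": "broker-1.example", "port": 9092},
    "/config/topics/orders": {"retention.ms": "604800000"},
}


def live_brokers(store):
    """A broker counts as healthy here if its entry is present in the store."""
    prefix = "/brokers/ids/"
    return sorted(int(path[len(prefix):]) for path in store
                  if path.startswith(prefix))


before = live_brokers(metadata)  # [0, 1]
del metadata["/brokers/ids/1"]   # broker 1 goes away, so its entry disappears
after = live_brokers(metadata)   # [0]
```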
