snithish / kafka-perf

Kafka Checks

  1. Clone the repository and change into its directory
git clone https://github.com/snithish/kafka-perf.git
cd kafka-perf
  1. Create a Python virtual environment
python3 -m venv venv
  1. Activate the virtual environment
source venv/bin/activate
  1. Install dependencies
pip install -r requirements.txt
  1. Download a large dataset; we use the NYC Taxi dataset (about 1 GB)
wget https://nyc-tlc.s3.amazonaws.com/trip+data/fhvhv_tripdata_2022-02.csv
  1. Use Docker Compose to bring up Kafka
docker-compose up -d --force-recreate
  1. Use the command below to create the quickstart topic
docker exec --interactive --tty broker \
kafka-topics --bootstrap-server broker:9092 \
             --create \
             --topic quickstart
  1. Run the producer script; edit the script to change how many rows are sent to Kafka
python3 producer.py
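
The last step mentions editing the script to control how many rows reach Kafka. The sketch below shows one way such a row limit could work; it is not the repository's actual producer.py. The function names (iter_messages, produce, run_against_broker), the JSON encoding of each row, the use of the kafka-python client, and the localhost:9092 address are all assumptions made for illustration.

```python
import csv
import json
from typing import Callable, Iterable, Iterator


def iter_messages(lines: Iterable[str], max_rows: int) -> Iterator[bytes]:
    """Yield at most max_rows CSV rows, each encoded as a UTF-8 JSON payload."""
    reader = csv.DictReader(lines)  # the first line is treated as the header
    for count, row in enumerate(reader):
        if count >= max_rows:
            break
        yield json.dumps(row).encode("utf-8")


def produce(lines: Iterable[str], send: Callable[[bytes], object],
            max_rows: int = 1000) -> int:
    """Pass each payload to send() and return how many rows were sent."""
    sent = 0
    for payload in iter_messages(lines, max_rows):
        send(payload)
        sent += 1
    return sent


def run_against_broker(path: str = "fhvhv_tripdata_2022-02.csv",
                       topic: str = "quickstart",
                       bootstrap: str = "localhost:9092",  # assumed address
                       max_rows: int = 1000) -> int:
    """Hypothetical wiring to a real broker using the kafka-python client."""
    from kafka import KafkaProducer  # assumption: kafka-python is installed

    producer = KafkaProducer(bootstrap_servers=bootstrap)
    with open(path, newline="") as f:
        sent = produce(f, lambda p: producer.send(topic, p), max_rows)
    producer.flush()  # block until buffered messages are delivered
    return sent
```

Raising max_rows in a call such as run_against_broker(max_rows=100000) would send more of the dataset. Keeping the send callable separate from the row iteration means the row limit can be checked without a running broker.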
