yashmayya / kafka-connect-hackernews

A Kafka connector that reads items from Hacker News and streams them into Kafka. Because why not ¯\_(ツ)_/¯

Overview

This source connector reads items (stories, comments, jobs, Ask HNs, polls) from Hacker News via the official Hacker News API (https://github.com/HackerNews/API). Items are read serially by id, starting from initial.start.item (defaults to 1). Currently, only a single connector task is supported.
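To make the read path concrete, here is a minimal sketch of fetching one item by id over HTTP. It assumes the item endpoint documented by the Hacker News API (https://hacker-news.firebaseio.com/v0/item/<id>.json); the class HackerNewsItemFetcher and its methods are hypothetical names chosen for illustration and are not taken from this connector's source.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for illustration; not part of the connector's code.
final class HackerNewsItemFetcher {
    // Base URL as documented by the Hacker News API.
    static final String BASE_URL = "https://hacker-news.firebaseio.com/v0";

    private final HttpClient client = HttpClient.newHttpClient();

    // Builds the URI for a single item, e.g. .../v0/item/1.json
    static URI itemUri(long itemId) {
        return URI.create(BASE_URL + "/item/" + itemId + ".json");
    }

    // Fetches the raw JSON body for one item id.
    String fetchItem(long itemId) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(itemUri(itemId)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException(
                    "Unexpected status " + response.statusCode() + " for item " + itemId);
        }
        return response.body();
    }
}
```

Reading serially then amounts to calling fetchItem with increasing ids, one poll at a time.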

Installation

Run mvn clean package from the repository root, then unzip the archive created in target/components/packages/ into any directory on your Connect worker's plugin path.

Configuration

The connector supports the following configuration properties:

| Name | Description | Type | Importance |
|------|-------------|------|------------|
| kafka.topic | Topic to write to | String | High |
| poll.interval.ms | Interval between polls (ms) | Long | High |
| initial.start.item | Hacker News item id to start reading from | Long | Medium |
| max.items | Maximum number of items to read from Hacker News; a value less than 1 means unlimited | Long | Medium |
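The interaction between initial.start.item and max.items can be sketched as a simple id sequence. This is one possible reading of the descriptions above, not the connector's actual implementation; the class ItemIdSequence is a hypothetical name.

```java
import java.util.OptionalLong;

// Hypothetical sketch of the id sequence implied by initial.start.item and max.items.
final class ItemIdSequence {
    private final long maxItems; // values less than 1 are treated as "no limit"
    private long nextId;
    private long emitted = 0;

    ItemIdSequence(long initialStartItem, long maxItems) {
        this.nextId = initialStartItem;
        this.maxItems = maxItems;
    }

    // Returns the next id to read, or empty once max.items ids have been handed out.
    OptionalLong next() {
        if (maxItems >= 1 && emitted >= maxItems) {
            return OptionalLong.empty();
        }
        emitted++;
        return OptionalLong.of(nextId++);
    }
}
```

With initial.start.item set to 5 and max.items set to 2, this sequence yields ids 5 and 6 and then stops; with max.items set to 0 it never stops on its own.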

An example configuration for this connector:

{
  "name": "HN",
  "connector.class": "com.github.yashmayya.kafka.connect.hackernews.HackerNewsSourceConnector",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter.schemas.enable": "false",
  "kafka.topic": "hn-items",
  "poll.interval.ms": "100",
  "initial.start.item": "1"
}
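One way to apply a flat configuration like the one above is the Kafka Connect REST API's PUT /connectors/{name}/config endpoint, which creates the connector or updates it if it already exists. The sketch below only builds the request. The worker address http://localhost:8083 used in the usage note is an assumption (8083 is the default REST port), and ConnectorConfigRequest is a hypothetical helper name.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper for illustration; not part of the connector's code.
final class ConnectorConfigRequest {
    // Builds a PUT request that creates or updates the named connector's config.
    static HttpRequest putConfig(String workerUrl, String connectorName, String configJson) {
        URI uri = URI.create(workerUrl + "/connectors/" + connectorName + "/config");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(configJson))
                .build();
    }
}
```

For example, ConnectorConfigRequest.putConfig("http://localhost:8083", "HN", json) targets http://localhost:8083/connectors/HN/config, and the request can then be sent with java.net.http.HttpClient.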

TODO

  • Implement offset tracking and recovery
  • Support dynamic reloading of max item id so that the connector can run forever
  • Add support for schemas

License: Do What The F*ck You Want To Public License


Languages

Language: Java 100.0%