mikeoertli / mikeroweservice

Mike O's Mike Rowe Service microservice - a "learning playground" - i.e. a demo/sample project using Spring Boot, Kafka, GraphQL, Elasticsearch, REST, SQL and NoSQL, and a microservice architecture that is deployed with Docker (Compose).

👷 Mike's Mike Rowe Service Microservice 👷

Mike Rowe (source: thepodcastplayground.com)

In order to learn more about microservices, I figured...
What better way than with a Mike Rowe Service?

🙊😜

Overview

A demo/sample project using Spring Boot, Kafka, REST, SQL, NoSQL, GraphQL, Gradle multimodule builds, etc., in a microservice architecture.

This is not affiliated with Mike Rowe, Dirty Jobs, 'The Way I Heard It', or any other Mike-Rowe-based entity/product in ANY way.

🚨 Setting Expectations 🚨

  1. 🤓 This repo and its contents exist solely for the purpose of 🔬🧠 learning/playing with a few technologies and concepts (in no particular order):
    1. Microservice Architecture (see: Architecture)
    2. Spring Boot (throughout)
    3. REST APIs (see: Client API)
    4. SQL with Spring Boot JPA (details TBD, maybe Postgres? Might also remove this module.)
    5. NoSQL (in this case, MongoDB)
    6. GraphQL (see: GraphQL Adapter)
    7. Elasticsearch (see: Transcript Service)
    8. Kafka (Streams) (see: Sentiment Analysis Stream Processor)
    9. A monorepo
    10. Gradle multimodule builds, and more.
  2. 🚧 This is nowhere near complete, not even in the "do an end-to-end 'hello world' test" sense.
  3. 🤨 There is a disconnect between the current code structure and the diagram below. Very much a work in progress!
  4. 🐣 There is no plan to fully implement each of the modules/services.
  5. 🥸 Data will be canned and mocked in most cases.
  6. 🐳 For now, this uses docker-compose.
    1. There is a tentative future plan to deploy this with K8s since I'd like to get more hands-on with that too, but no ETA on that.
  7. 🚨 This was created to learn about these technologies/concepts. ⚠️ DO NOT consider this as a reference for how to do anything the "right" way! ⚠️
  8. 🙈 Apologies in advance for the lack of tests and Javadocs; I plan to add both. The plan was to get an outline with stubbed modules in place first.
  9. 👋 Please don't hesitate to reach out with suggestions etc.

Essentially, having lacked any production microservice experience prior to this repo's inception, I wanted a "playground" in which to explore/learn a bunch of concepts and technologies.

🛠👷🏻🧱 Architecture

This is my "thinking out loud" diagram for what this could look like.

This is NOT FINAL, nor is it a reflection of the code as it currently exists.

There are several notes embedded in the diagram which describe nuances, considerations, or theoretical "what if" ideas.

Microservice Architecture Diagram

🕓 Timestamps

All timestamps are either Unix epoch milliseconds (indicated by the variable/field name) or ISO-8601 strings (e.g., 2012-04-23T18:25:43.511Z).
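
As a minimal sketch of converting between the two timestamp formats, the standard java.time API is enough. The class and method names here are made up for illustration and are not part of this repo.

```java
import java.time.Instant;

// Hypothetical helper, not part of this repo: converts between the two
// timestamp representations mentioned above.
final class TimestampExample {

    // Unix epoch milliseconds -> ISO-8601 string in UTC.
    static String toIso8601(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    // ISO-8601 instant string -> Unix epoch milliseconds.
    static long toEpochMillis(String iso8601) {
        return Instant.parse(iso8601).toEpochMilli();
    }

    public static void main(String[] args) {
        long millis = toEpochMillis("2012-04-23T18:25:43.511Z");
        System.out.println(millis);
        System.out.println(toIso8601(millis));
    }
}
```

Sticking to Instant keeps everything in UTC, which avoids time zone surprises when values cross service boundaries.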

📂 Directory Structure

These are subject to change. This is a preliminary structure based on some early plans and has yet to be adapted to a (slightly) more well-considered design.

├── README.md
├── build.gradle
├── client-api
├── doc
│   └── architecture-summary.png
├── docker
│   └── mapped-volumes
├── docker-compose.yml
├── libs
│   ├── common
│   ├── db-adapter (removed)
│   ├── graphql-adapter
│   ├── kafka-adapter
│   ├── model
│   └── mongo-adapter
├── services
│   ├── imdb-service
│   ├── notification-service
│   ├── podcast-service
│   ├── transcript-service
│   └── twitter-service
├── settings.gradle
└── stream-processors
    └── sentiment-processor

🧑‍💻 Client API 🧑‍💻

There is a client API that is the "main" REST API for accessing this Mike Rowe "Service" backend.

The exact endpoints are TBD for now, but some ideas include:

  • Get the link to the latest podcast that mentions <topic>.
  • Retrieve the last tweet from Mike that has more than <#> likes.
  • Get the highest rated episode of Dirty Jobs since <date>.

When deployed locally, the endpoints will be exposed at: localhost:8080/api/query.

☝️ TIP: You can discover the defined endpoints with the following command:

grep -RE --include "*.java" '@RequestMapping|@GetMapping|@PostMapping|@PutMapping|@PatchMapping|@DeleteMapping' .
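
If grep isn't handy, roughly the same search can be sketched in plain Java. This is only an illustration: the class name is made up, and it matches the same Spring mapping annotations line by line without understanding the code itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of this repo.
final class EndpointScanner {

    // Same annotations as the grep pattern above.
    private static final Pattern MAPPING =
            Pattern.compile("@(Request|Get|Post|Put|Patch|Delete)Mapping");

    // True if the line contains one of the mapping annotations.
    static boolean isMappingLine(String line) {
        return MAPPING.matcher(line).find();
    }

    // Walks the tree under root and collects "file:line: text" for each hit.
    static List<String> findMappings(Path root) throws IOException {
        List<Path> javaFiles;
        try (Stream<Path> paths = Files.walk(root)) {
            javaFiles = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".java"))
                    .sorted()
                    .collect(Collectors.toList());
        }
        List<String> hits = new ArrayList<>();
        for (Path file : javaFiles) {
            List<String> lines = Files.readAllLines(file);
            for (int i = 0; i < lines.size(); i++) {
                if (isMappingLine(lines.get(i))) {
                    hits.add(file + ":" + (i + 1) + ": " + lines.get(i).trim());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        findMappings(root).forEach(System.out::println);
    }
}
```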

📚 Libraries 📚

🀝 Common

This library contains common utilities/tools that are shared between modules. These can be included with:

api project(":common")

📤 Data Model

This library contains POJO structures that serve as a common data model between services.

api project(":model")

The data model uses code generation courtesy of the Netflix DGS CodeGen plugin for GraphQL. This lets us define the schema in the widely used, flexible GraphQL style and share the resulting POJOs between Kafka, GraphQL, and other modules.

🧩 MongoDB Adapter

This includes the Data Model library.

api project(":mongo-adapter")

An API wrapper around MongoDB that makes it easier for services to use MongoDB without each one setting up its own database access. This would also make it much easier to replace Mongo with an alternative solution at some point.
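
To illustrate why a wrapper makes the backing store replaceable, here is a minimal sketch. The DocumentStore interface and its in-memory implementation are hypothetical and do not reflect the actual mongo-adapter API. Services would depend only on the interface, and a MongoDB-backed class would be one implementation of it.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical storage abstraction; the real mongo-adapter API may differ.
interface DocumentStore<T> {
    void save(String id, T document);

    Optional<T> findById(String id);
}

// In-memory stand-in so the sketch runs without a database.
// A MongoDB-backed implementation would implement the same interface.
final class InMemoryDocumentStore<T> implements DocumentStore<T> {
    private final Map<String, T> documents = new ConcurrentHashMap<>();

    @Override
    public void save(String id, T document) {
        documents.put(id, document);
    }

    @Override
    public Optional<T> findById(String id) {
        return Optional.ofNullable(documents.get(id));
    }
}

final class DocumentStoreExample {
    public static void main(String[] args) {
        DocumentStore<String> store = new InMemoryDocumentStore<>();
        store.save("episode-1", "example transcript text");
        System.out.println(store.findById("episode-1").orElse("not found"));
        System.out.println(store.findById("episode-2").orElse("not found"));
    }
}
```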

🦦 Kafka Adapter

This includes the Data Model and Common libraries.

Services can include this with:

api project(":kafka-adapter")

This is a library that makes it exceedingly easy to integrate with Kafka or Kafka Streams as a publisher or receiver of Kafka data.

In order to leverage the common kafka configuration, topics list, etc., this annotation must be present on the class that is annotated with @Configuration:

@PropertySource("classpath:kafka-application.properties")

Kafka Topics

Kafka topics are defined centrally in libs/kafka-adapter/src/main/resources/kafka-application.properties. For example:

#
# Topics
#
kafka.topic.notification=notification
kafka.topic.transcript=transcript
kafka.topic.sentiment=sentiment
kafka.topic.media=media
kafka.topic.news=news
kafka.topic.social=social
kafka.topic.company=company
kafka.topic.person=person
kafka.topic.subject=subject
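
In a Spring application these keys would normally be resolved through the @PropertySource mentioned above (for example with @Value placeholders). As a dependency-free sketch of what the file contains, the plain-JDK snippet below parses the same key format with java.util.Properties. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of this repo.
final class KafkaTopicsExample {
    static final String TOPIC_PREFIX = "kafka.topic.";

    // Parses properties-file text into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns logical name -> topic name for every kafka.topic.* key.
    static Map<String, String> topics(Properties props) {
        Map<String, String> result = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(TOPIC_PREFIX)) {
                result.put(key.substring(TOPIC_PREFIX.length()), props.getProperty(key));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "# Topics",
                "kafka.topic.notification=notification",
                "kafka.topic.transcript=transcript",
                "kafka.topic.sentiment=sentiment");
        topics(parse(sample)).forEach((name, topic) -> System.out.println(name + " -> " + topic));
    }
}
```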

📈 GraphQL Adapter

This includes the Kafka Adapter library (which, in turn, includes the Data Model and Common libraries).

api project(":graphql-adapter")

There are numerous resources for GraphQL with Java and Spring Boot. I seem to have approached this at an inflection point between the pre-official and post-official Spring Boot GraphQL eras: right now, the official support requires a pre-release version of Spring Boot. At the time of writing, I have opted for the "old" approach.

Note that the Data Model module houses the POJOs generated by the DGS CodeGen plugin.

This project does leverage the Netflix DGS GraphQL library. There is a great Getting Started Guide on their site.

The official GraphQL Schema resources are really useful.

The GraphQL Schema is defined in: libs/model/src/main/resources/schema/schema.graphqls. This schema defines the data structures and query API for the GraphQL functionality.

Because the POJOs are generated in the Data Model module while the GraphQL logic lives in the GraphQL module, auto-generated and example code other than the data model POJOs may need to be moved or adjusted by hand, so that part of the build is less streamlined. Thankfully, that logic rarely changes.

For example, the query APIs look like this:

type Query {
    latestPodcastMentioningTopic(topic: String!): PodcastEpisode
    mostPopularPodcastTopics(numMostPopular: Int): [Topic!]!
    podcastTranscriptByEpisodeNumber(episodeNumber: Int!): Transcript
    podcastByEpisodeNumber(episodeNumber: Int!): PodcastEpisode
    televisionTranscript(showName: String!, seasonNumber: Int!, episodeNumber: Int!): Transcript
    televisionEpisode(showName: String!, seasonNumber: Int!, episodeNumber: Int!): TelevisionEpisode
    mostPopularTelevisionEpisode(showName: String!, seasonNumber: Int): TelevisionEpisode
    mostPopularTweetSince(numDays: Int): SocialMediaPost
    mostRecentTweetWithNumLikes(numLikes: Int): SocialMediaPost
    mostPopularSocialMediaPostSince(numDays: Int): SocialMediaPost
    mostPopularMovies(numMovies: Int): [Movie!]
}
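
As a rough sketch of how a client might call one of these queries, the snippet below builds the JSON body of a GraphQL-over-HTTP request for latestPodcastMentioningTopic. It selects only __typename because the fields of PodcastEpisode are not listed here. The class and method names are hypothetical, and the hand-rolled JSON escaping is only there to keep the example dependency-free.

```java
// Hypothetical helper, not part of this repo.
final class GraphQlRequestExample {

    // Escapes a value for embedding inside a JSON string literal.
    static String jsonEscape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use a 4-digit hex escape.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds the request body with the topic passed as a GraphQL variable.
    static String latestPodcastRequest(String topic) {
        String query = "query($topic: String!) { "
                + "latestPodcastMentioningTopic(topic: $topic) { __typename } }";
        return "{\"query\":\"" + jsonEscape(query) + "\","
                + "\"variables\":{\"topic\":\"" + jsonEscape(topic) + "\"}}";
    }

    public static void main(String[] args) {
        System.out.println(latestPodcastRequest("dirty jobs"));
    }
}
```

Passing the topic as a variable rather than concatenating it into the query text keeps user input out of the query string itself.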

GraphQL Libraries

There are 3 primary choices, and things are especially confusing with Spring Boot's GraphQL starter being in a pre-release phase at the time of writing. I won't summarize them here, but there are some great resources that cover the library options. I especially liked this one by Soham Dasgupta. This one from codingconcepts.com is also helpful.

Schema-First

I have decided to take a "schema-first" approach, which means defining the .graphqls first and then generating POJOs from this schema.

For POJO generation, this project uses the DGS CodeGen Plugin. There is a guide for Getting Started with DGS CodeGen that is worth reviewing too.

🧠 Modules and Services 🧠

These are services which process incoming data and requests in various ways in order to produce results, filter data, generate transcripts, and more.

🎬 Movies and TV Service (IMDB Service)

Retrieves data about TV and movies.

This was called the "IMDB" service, but is likely going to use more "open" alternatives like The Movie Database (TMDB) or Open Movie Database (OMDB).

📧 Notification Service

Based on the current rules/configuration, publish notifications for certain events.

For example, send a push notification or SMS each time a podcast episode is posted or each time any of Mike Rowe's content mentions the state of Colorado.

Right now, this is a placeholder. Eventually this could be a single service interface in front of several services behind the curtain, such as Twilio (SMS), email (... maybe?), IFTTT, SimplePush, AWS SNS, and others.

🎧 Podcast Service

Retrieves episodes of Mike Rowe's podcast The Way I Heard It.

Uses RSS processing with the Feed Adapter Spring Integration (which uses ROME under the hood).
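
To show the kind of data that RSS processing yields, here is a minimal sketch that pulls item titles out of an RSS document. It deliberately uses the JDK's built-in DOM parser rather than ROME or the Spring Integration feed adapter, so it is not how this service works internally. The class name is made up.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of this repo.
final class RssTitlesExample {

    // Returns the <title> of every <item> in the given RSS XML.
    static List<String> itemTitles(String rssXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations as a guard against XXE attacks.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));

            List<String> titles = new ArrayList<>();
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                NodeList titleNodes = item.getElementsByTagName("title");
                if (titleNodes.getLength() > 0) {
                    titles.add(titleNodes.item(0).getTextContent().trim());
                }
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse RSS XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up feed content, for illustration only.
        String sample = "<rss version=\"2.0\"><channel><title>Sample Feed</title>"
                + "<item><title>Sample Episode One</title></item>"
                + "<item><title>Sample Episode Two</title></item>"
                + "</channel></rss>";
        itemTitles(sample).forEach(System.out::println);
    }
}
```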

Feeds media to the Transcript Service when a transcript does not yet exist in the cache.

Brainstorm: extract topics, people, etc. and queue up OSINT "jobs" on each.

💬 Transcript Service

Retrieves existing transcripts or fetches them using a service like Descript. Note that right now, this just generates mock data in lieu of actually integrating with Descript or similar.

Uses Elasticsearch to store transcripts.

📝 Development Note

To start a simple Elasticsearch container (stand-alone, independent of this project):

docker run -d --name elasticsearch -p 9200:9200 -e "discovery.type=single-node" elasticsearch:7.6.2

🐥 Twitter Service

This is a service that leverages the Twitter4J library to monitor Mike Rowe's Twitter account as well as topics related to Mike Rowe. Data is ingested and published to a Kafka topic.

Monitored topics can be configured in the properties file for this service.

🦦 Kafka Stream Processors 🦦

Stream processors use Kafka Streams to process data.

👍👎 Sentiment Analysis Stream Processor

Placeholder/future.

The general idea is to create a Kafka Streams processor that takes content that mentions Mike Rowe and generates a sentiment score.
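
Since this module is still a placeholder, the following is only a toy sketch of what "generate a sentiment score" could mean: a naive word-list scorer in plain Java. The class name, word lists, and scoring formula are all made up for illustration. In a Kafka Streams topology, a function like this could be applied to each record value (for example via KStream#mapValues).

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical toy scorer, not part of this repo and not a real sentiment model.
final class NaiveSentimentScorer {

    // Tiny made-up word lists; a real processor would use a proper model or lexicon.
    private static final Set<String> POSITIVE = Set.of("great", "love", "inspiring", "good");
    private static final Set<String> NEGATIVE = Set.of("bad", "boring", "hate", "awful");

    // Returns (positive - negative) / matched words, in [-1, 1]; 0 if nothing matches.
    static double score(String text) {
        int positive = 0;
        int negative = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (POSITIVE.contains(token)) {
                positive++;
            } else if (NEGATIVE.contains(token)) {
                negative++;
            }
        }
        int matched = positive + negative;
        return matched == 0 ? 0.0 : (double) (positive - negative) / matched;
    }

    public static void main(String[] args) {
        System.out.println(score("I love Mike Rowe, great show"));
        System.out.println(score("That episode was boring"));
    }
}
```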

✅ Key Items on TODO List ✅

The "TODO" list here is endless, but a few focal points include:

