Welcome!

This is Lhotse, a starter kit for writing event sourced application backends following domain driven design principles. It is based on Spring Boot, Axon and Keycloak.

Whether you're starting a new project or refactoring an existing one, you should consider this project if you're seeking:

horizontal scalability with distributed command and event processing
crypto-shredding support for your event log to address privacy regulations such as the GDPR
deduplicating filestore abstractions for a variety of backing stores such as S3 buckets and Mongo GridFS
SSO, role based authorisation and federated identity management

... without the time and effort involved in starting a new project from scratch.

The only end user functionality provided out of the box is basic support for creating organisations and users. The sample code demonstrates end-to-end command handling and event processing flows from API endpoints down to projections.

Tooling
Features
Project Info

Tooling

This project uses Java 17.

Docker is used for container tooling.

Project Lombok greatly reduces the need for hand cranking tedious boilerplate code.

The build system is Gradle.

IntelliJ configuration

The Lombok plugin is required for IntelliJ, else the code generated by the Lombok annotations will not be visible (and the project will be littered with red squiggle line errors).

Building

To build the entire application, including running unit and functional tests:

./gradlew build

(Note that the functional tests use the same port number for the embedded database as the containerised database. If tests fail, try running docker-compose down first, and free up the port used by the Keycloak test server as well.)

Start up containers for Postgres and Keycloak:

docker-compose up

This project uses Keycloak for authentication and session management. docker-compose up will run the Keycloak container and expose it on the KEYCLOAK_SERVER_PORT specified in the .env file.

Running

Bring up the dependencies: docker compose up

Run the application server using Gradle: ./gradlew bootRun

To create a docker image:

./gradlew bootBuildImage

To run the application server container with a TTY attached, allocating 2GiB memory and applying the prod Spring profile:

docker run -t -m 2G --network host -e "SPRING_PROFILES_ACTIVE=prod" your.organisation.here/lhotse:$BUILD_VERSION

To see all available Gradle tasks for this project:

./gradlew tasks

To run the OWASP dependency check plugin, which will generate a report at build/reports/dependency-check-report.html:

./gradlew dependencyCheckAggregate

To run the dependencies license plugin, which will generate a report at build/reports/dependency-license/index.html:

./gradlew generateLicenseReport

Semantic versioning

Semantic versioning is automatically applied using git tags. Simply create a tag of, say, 1.2.0 and Gradle will build a JAR package (or a docker container, if you wish) tagged with version 1.2.0-0 where -0 is the number of commits since the 1.2.0 tag was applied. Each subsequent commit will increment the version (1.2.0-1, 1.2.0-2, ... 1.2.0-N) until the next release is tagged.
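The version scheme described above can be illustrated with a small sketch. It assumes the version is derived from the output of `git describe --tags --long`, which prints the tag, the number of commits since the tag and an abbreviated commit hash. The class `GitVersion` is hypothetical and is not part of the starter kit; the build's actual mechanism may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the starter kit. */
final class GitVersion {
    // `git describe --tags --long` prints <tag>-<commits since tag>-g<abbreviated hash>
    private static final Pattern DESCRIBE = Pattern.compile("^(.+)-(\\d+)-g[0-9a-f]+$");

    private GitVersion() {}

    /** Turns e.g. "1.2.0-3-gabc1234" into "1.2.0-3". */
    static String fromDescribe(String describeOutput) {
        Matcher m = DESCRIBE.matcher(describeOutput.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected git describe output: " + describeOutput);
        }
        return m.group(1) + "-" + m.group(2);
    }
}
```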

Jupyter notebook

A Jupyter notebook can be found in 'doc/notebook'. It acts as an interactive reference for the API endpoints and should be part of your development workflow.

Jupyter notebook can be run as a Docker container:

docker-compose -f doc/notebook/docker-compose.yml up

Copy the notebook URL, including its token, from the notebook container logs.
Paste the URL into a browser and access the example flows from work/starter-example.

Swagger documentation

Swagger API documentation is automatically generated by springdoc-openapi.

API documentation is accessible when running the application locally by visiting Swagger UI. Default credentials for logging in as an administrator can be found in application.properties along with the client ID and client secret.

Functional tests generate a Swagger JSON API definition at ./launcher/build/web-app-api.json

Code style

Spotless is run as part of the Gradle build to ensure consistency. Spotless has been configured to use the Eclipse formatter using the rules in 'build-config/eclipse-formatter-config.xml'. This file can be imported into IntelliJ.

Build checks will fail if the code is inconsistent with the standard. Automatic formatting can be applied by running:

./gradlew spotlessApply

Note that Spotless will not enforce joining of manually wrapped lines in order to keep builder pattern function chaining clean.

PMD and Checkstyle quality checks are automatically applied to all subprojects.

Features

Axon: DDD and event sourcing

Previously known as Axon Framework, Axon is a framework for implementing domain driven design using event sourced aggregates and CQRS.

DDD is, at its core, about linguistics. Establishing a ubiquitous language helps identify sources of overlap or tension in conceptual understanding that may be indicative of a separation of concerns in a system. Rather than attempting to model a domain in intricate detail inside a common model, DDD places great emphasis on identifying these boundaries in order to define bounded contexts. Bounded contexts reduce the complexity of the system and help avoid anemic domain models, which arise when complex domain logic slowly migrates out of the domain model into helper classes on the periphery as the system evolves.

Event sourcing captures the activities of a business in an event log, an append-only history of every important business action that has ever been taken by users or by the system itself. Events are mapped to an arbitrary number of projections for use by the query side of the system. Being able to replay events offers several significant benefits:

Projections can be optimised for reading by denormalising data.
Events can be upcasted. That is, events are marked with a revision that allows them to be transformed to an updated version of the same event. This protects developers from creating significant errors in users' data due to, for example, accidentally transposing two fields within a command or event.
Projections can be updated with new information that was either captured by or derived from events. New business requirements can be met and projections generated such that historical user actions can be incorporated as new features are rolled out.

Axon provides the event sourcing framework. User actions of interest to a business (typically anything that modifies data) are dispatched to command handlers that, after validation, emit events.
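The replay and upcasting ideas above can be sketched in plain Java without Axon. Everything here is hypothetical and for illustration only: the `StoredEvent` record, the `UserCreated` event type, its field names and the assumption that revision 1 of the event had its two name fields transposed. Axon's own upcaster and event handler APIs look different.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stored event: a type name, a revision and a flat payload. */
record StoredEvent(String type, int revision, Map<String, String> payload) {}

final class ReplaySketch {
    private ReplaySketch() {}

    /** Upcasts revision 1 "UserCreated" events, assumed to have transposed name fields. */
    static StoredEvent upcast(StoredEvent event) {
        if (event.type().equals("UserCreated") && event.revision() == 1) {
            Map<String, String> fixed = new LinkedHashMap<>(event.payload());
            fixed.put("firstName", event.payload().get("lastName"));
            fixed.put("lastName", event.payload().get("firstName"));
            return new StoredEvent(event.type(), 2, fixed);
        }
        return event; // already at the latest revision
    }

    /** Rebuilds a denormalised read model (user id -> display name) from the whole log. */
    static Map<String, String> replay(List<StoredEvent> eventLog) {
        Map<String, String> displayNames = new LinkedHashMap<>();
        for (StoredEvent stored : eventLog) {
            StoredEvent event = upcast(stored);
            if (event.type().equals("UserCreated")) {
                displayNames.put(event.payload().get("userId"),
                        event.payload().get("firstName") + " " + event.payload().get("lastName"));
            }
        }
        return displayNames;
    }
}
```

The stored events are never modified. The correction lives in the upcaster, so rebuilding the projection from the log yields repaired data.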

Command validation

Commands represent user actions that may be rejected by the system. Events, however, represent historical events and cannot be rejected (though, they can be upcasted or ignored depending on circumstances). It is therefore vital that robust command validation is performed to protect the integrity of the system.

If events are ever emitted in error then this creates a situation that should only be addressed by generating events countering the erroneous ones. This, naturally, comes with a significant cost in terms of implementation and validation overhead.

There is a philosophical argument for defining aggregates such that all information required to validate commands is held by an aggregate in memory. In practice, however, more natural aggregates can be formed by allowing some validation to be based on projections. We also know from experience that some validation will be shared among multiple aggregates. The amount of testing required to verify all possible command failure situations tends to grow non-linearly as the number of checks that are performed inside an aggregate grows.

We have addressed this in the axon-support and command-validation-support modules through the introduction of marker interfaces that map commands to dedicated command validators. Validators extract common checks or checks based on projections and allow them to be tested independently. Aggregate tests just need to ensure that a failure in the validator fails command validation. Since this design opens up the possibility of a validator being missed, reflection is used to detect validators at application start up and register them with a command interceptor that is triggered prior to the command handler method being called. This significantly reduces testing effort and human error.
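A minimal sketch of the marker interface idea follows. It is not the starter kit's code: the names `EmailValidatableCommand`, `CommandValidator`, `EmailValidator`, `ValidatingInterceptor` and `CreateUserCommand` are all hypothetical, validators are registered by hand rather than discovered by reflection, and the interceptor is a plain class rather than an Axon interceptor.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical marker interface: commands carrying an email opt in to email validation. */
interface EmailValidatableCommand {
    String email();
}

/** Hypothetical validator contract, keyed by the marker interface it handles. */
interface CommandValidator<T> {
    Class<T> markerType();

    void validate(T command);
}

final class EmailValidator implements CommandValidator<EmailValidatableCommand> {
    @Override
    public Class<EmailValidatableCommand> markerType() {
        return EmailValidatableCommand.class;
    }

    @Override
    public void validate(EmailValidatableCommand command) {
        if (command.email() == null || !command.email().contains("@")) {
            throw new IllegalArgumentException("Invalid email: " + command.email());
        }
    }
}

/** Stands in for a command interceptor: runs every matching validator before the handler. */
final class ValidatingInterceptor {
    private final List<CommandValidator<?>> validators = new ArrayList<>();

    void register(CommandValidator<?> validator) {
        validators.add(validator);
    }

    void beforeHandle(Object command) {
        for (CommandValidator<?> validator : validators) {
            if (validator.markerType().isInstance(command)) {
                validateWith(validator, command);
            }
        }
    }

    // Helper that captures the wildcard so the command can be cast safely.
    private static <T> void validateWith(CommandValidator<T> validator, Object command) {
        validator.validate(validator.markerType().cast(command));
    }
}

/** Hypothetical command that opts in simply by implementing the marker interface. */
record CreateUserCommand(String email) implements EmailValidatableCommand {}
```

Because the check lives in `EmailValidator`, it can be unit tested once. An aggregate test then only needs to show that a validator failure rejects the command.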

Event processing

Axon provides two types of event processors, subscribing and tracking.

Subscribing processors execute on the same thread that is publishing the event. This allows command dispatching to wait until the event has been both appended to the event store and all event handling is completed. Commands are queued for processing on a FIFO basis. It is important, therefore, not to use subscribing processors for long-running tasks.

Tracking event processors (TEPs), in contrast, execute in their own thread, monitoring the event store for new events. TEPs track their progress consuming events using tracking tokens persisted in the database. TEPs hold ownership of the tokens, preventing multiple application instances from concurrently performing the same processing. Token ownership passes to another application instance in the event that a token owner is shut down or restarted.
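The token ownership behaviour can be sketched roughly as follows. This is a simplified, in-memory illustration and not how Axon implements it: `TrackingToken` and `TrackingProcessor` are hypothetical classes, the token is an index into a list rather than a database row, and polling is driven by the caller instead of a dedicated thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical token: records its owner and the index of the next event to process. */
final class TrackingToken {
    private String owner; // null when unclaimed
    private int nextIndex;

    synchronized boolean claim(String node) {
        if (owner == null || owner.equals(node)) {
            owner = node;
            return true;
        }
        return false;
    }

    synchronized void release(String node) {
        if (node.equals(owner)) {
            owner = null;
        }
    }

    synchronized int nextIndex() {
        return nextIndex;
    }

    synchronized void advanceTo(int index) {
        nextIndex = index;
    }
}

final class TrackingProcessor {
    private final String node;
    private final TrackingToken token;
    final List<String> handled = new ArrayList<>();

    TrackingProcessor(String node, TrackingToken token) {
        this.node = node;
        this.token = token;
    }

    /** Handles any events past the token position; returns how many were handled. */
    int poll(List<String> eventLog) {
        if (!token.claim(node)) {
            return 0; // another node owns the token, so this node stays idle
        }
        int start = token.nextIndex();
        for (int i = start; i < eventLog.size(); i++) {
            handled.add(eventLog.get(i));
            token.advanceTo(i + 1); // progress is recorded after each event
        }
        return eventLog.size() - start;
    }

    void shutDown() {
        token.release(node);
    }
}
```

When the owning node shuts down and releases the token, another node can claim it and continue from the recorded position rather than from the start of the log.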

TEPs introduce additional complexity by not guaranteeing that projections will be up-to-date when an API call has ended. TEPs should, in our opinion, only be used for longer running processing, during replays and when preparing projections for new feature releases.

Axon also introduces the concept of processing groups as a way of segmenting and orchestrating event processing, ensuring that events are handled sequentially within a group. By default, Axon assigns each tracking event processor (TEP) to its own processing group, aiming to parallelise event processing as much as possible. We take a more conservative approach to make the system easier to reason about by defaulting to subscribing event processors and assigning them to a default processing group unless explicitly assigned elsewhere.

Event replays

Event replaying takes the system back to a previous point in time in order to apply a different interpretation of what it means to process an event.

The simplest way of executing a replay is to wipe all projections and then reapply every event ever emitted to rebuild using the latest logic. This is a valid approach for fledgling applications but may not be acceptable once the system has scaled up. More advanced approaches are made possible by assigning event processors to different processing groups and running a mixture of subscribing and tracking event processors. Advanced configuration opens up the possibility of:

Replaying events into a new projection database while continuing to project to an existing one, making the replay transparent to end users. The system can then be switched over to use the new projection while optionally continuing to maintain the old one.
Tracking event processors can be used to generate projections for new features that are not yet released to users until the projections are ready for use.
Processing groups allow replays to be limited to bounded contexts that are naturally isolated.

The starter kit comes with programmatic support for triggering replays. To perform a replay:

disconnect the application from load balancers
(if running more than a single node) shut down event processing using the axonserver-cli or the Axon dashboard
trigger a replay via a Spring Boot Actuator call to /actuator/replay
monitor the state of replay via a Spring actuator endpoint
reconnect the application to load balancers

Behind the scenes, replays are executed by:

shutting down tracking event processors (TEPs);
clearing the tracking tokens in the Axon database;
placing a marker event into the event log that will end replays;
notifying interested listeners that replays have completed; and
restarting TEP processing.
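The sequence above can be sketched in a single-threaded form. This is an illustration under stated assumptions rather than the starter kit's implementation: `ReplayCoordinator` and the `ReplayMarker:` event naming are hypothetical, the event log is an in-memory list of strings, and "shutting down" and "restarting" processing are implicit because nothing runs concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class ReplayCoordinator {
    static final String MARKER_PREFIX = "ReplayMarker:"; // hypothetical marker event naming

    private final List<String> eventLog;
    private final List<Runnable> completionListeners = new ArrayList<>();
    private int trackingToken;    // index of the next event to process
    private String pendingMarker; // null when no replay is in progress
    private int replayCount;

    ReplayCoordinator(List<String> eventLog) {
        this.eventLog = eventLog;
        this.trackingToken = eventLog.size(); // assume processing has caught up
    }

    void onReplayCompleted(Runnable listener) {
        completionListeners.add(listener);
    }

    boolean isReplaying() {
        return pendingMarker != null;
    }

    /** Clears the token, appends a marker event and then resumes processing. */
    void triggerReplay(Consumer<String> projection) {
        trackingToken = 0;
        pendingMarker = MARKER_PREFIX + (++replayCount);
        eventLog.add(pendingMarker);
        process(projection);
    }

    void process(Consumer<String> projection) {
        while (trackingToken < eventLog.size()) {
            String event = eventLog.get(trackingToken++);
            if (event.startsWith(MARKER_PREFIX)) {
                if (event.equals(pendingMarker)) {
                    pendingMarker = null; // the marker for this replay ends it
                    completionListeners.forEach(Runnable::run);
                }
                continue; // markers, including old ones, are never projected
            }
            projection.accept(event);
        }
    }
}
```

Giving each marker a unique suffix means markers left in the log by earlier replays cannot end a later replay prematurely.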

Crypto shredding

Crypto shredding is a technique for disabling access to sensitive information by discarding encryption keys. You might use this at the behest of a user or when retention is no longer justified in order to comply with the European Union's General Data Protection Regulation (GDPR) without compromising the append-only nature of your event log.

Documentation in the crypto shredding repository explains how it works, its limitations and an important caveat.
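The core idea can be shown with the JDK's built-in cryptography alone. The sketch below is not the crypto shredding library's implementation or API: `CryptoShredder` is a hypothetical class, keys are held in an in-memory map rather than a database, and the choice of AES-GCM with one key per owner is an assumption made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class CryptoShredder {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;

    private final Map<String, SecretKey> keysByOwner = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    /** Encrypts a value with the owner's key, creating the key on first use. Output is IV || ciphertext. */
    byte[] encrypt(String ownerId, String plaintext) {
        try {
            SecretKey key = keysByOwner.computeIfAbsent(ownerId, id -> newKey());
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv); // a fresh IV for every encryption
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] stored = new byte[IV_BYTES + ciphertext.length];
            System.arraycopy(iv, 0, stored, 0, IV_BYTES);
            System.arraycopy(ciphertext, 0, stored, IV_BYTES, ciphertext.length);
            return stored;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the plaintext, or empty once the owner's key has been shredded. */
    Optional<String> decrypt(String ownerId, byte[] stored) {
        SecretKey key = keysByOwner.get(ownerId);
        if (key == null) {
            return Optional.empty();
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            byte[] plaintext = cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
            return Optional.of(new String(plaintext, StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Discards the key. Events stay in the log but their encrypted fields become unreadable. */
    void shred(String ownerId) {
        keysByOwner.remove(ownerId);
    }

    private static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The encrypted bytes stored in an event are never rewritten or deleted. Only the key is removed, which is what keeps the event log append-only.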

Security and access control

We are using Keycloak to manage user authentication and session management. Authorisation is handled by the application itself.

Keycloak has the following three main concepts.

Realms secure and manage security metadata for a set of users, applications and clients. By default, Keycloak provides a master realm which is best used only for superuser administration. We create a separate realm, default, for managing our application.
Clients are the applications on whose behalf Keycloak is authenticating users. By default, Keycloak will provide us with a few clients, but using a separate client is best practice. We have set up a default client for the default realm.
Roles identify a type or category of user. Roles can be specific to a client or apply to an entire realm.

The official documentation goes into more detail.

Endpoint access control

Controller endpoints are secured based on user roles, properties and entity permissions. Annotations are used to configure the access control for each handler method. To reduce repetition and improve readability, a few meta-annotations are provided for common security configurations, e.g. AdminOrAdminOfTargetOrganization. There are situations where user roles and properties are not sufficient to determine access. This is where the entity permission check comes in. An entity in this case is a representation of a domain object in the application layer. It corresponds to at least one persistable object. For example, one Organization entity corresponds to one PersistableOrganization. Put simply, in the event sourcing context an entity can be considered a projection.

The entity permission check is specified within the security annotation and takes the form hasPermission(#entityId, 'EntityClassName', 'permissionType'). This expression is evaluated by EntityPermissionEvaluator, which in turn delegates to the corresponding permission check method of the entity, where customized permission requirements can be implemented. This workflow is made possible by: a) having all entity classes implement the Identifiable interface and b) having a ReadService for each Identifiable entity. The Identifiable interface provides default permission checks that reject everything and can be overridden by implementing entities. The ReadService provides a way to load an entity by its simple class name. To help manage a growing number of ReadService implementations, the starter kit provides a ReadServiceProvider bean which collects all ReadService beans when the application context starts.
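The delegation chain described above might look roughly like the following. The class names mirror those mentioned in the text, but the method names (`canRead`, `canUpdate`, `findById`, `register`, `forName`), the permission type strings and the use of a plain user ID instead of a Spring Security authentication object are all assumptions made for this sketch. The real signatures in the starter kit may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Sketch of the Identifiable contract: every permission check rejects unless overridden. */
interface Identifiable {
    String getId();

    default boolean canRead(String userId) {
        return false;
    }

    default boolean canUpdate(String userId) {
        return false;
    }
}

/** Sketch of a ReadService: loads entities of one type by id. */
interface ReadService<T extends Identifiable> {
    Optional<T> findById(String id);
}

/** Hypothetical entity: its owner may read it, while update keeps the reject-all default. */
record Organization(String id, String ownerId) implements Identifiable {
    @Override
    public String getId() {
        return id;
    }

    @Override
    public boolean canRead(String userId) {
        return ownerId.equals(userId);
    }
}

/** Sketch of ReadServiceProvider: services are looked up by the entity's simple class name. */
final class ReadServiceProvider {
    private final Map<String, ReadService<? extends Identifiable>> services = new HashMap<>();

    <T extends Identifiable> void register(Class<T> type, ReadService<T> service) {
        services.put(type.getSimpleName(), service);
    }

    Optional<ReadService<? extends Identifiable>> forName(String simpleClassName) {
        return Optional.ofNullable(services.get(simpleClassName));
    }
}

final class EntityPermissionEvaluator {
    private final ReadServiceProvider provider;

    EntityPermissionEvaluator(ReadServiceProvider provider) {
        this.provider = provider;
    }

    boolean hasPermission(String userId, String entityId, String entityClassName, String permissionType) {
        ReadService<? extends Identifiable> service = provider.forName(entityClassName).orElse(null);
        if (service == null) {
            return false; // unknown entity type
        }
        Optional<? extends Identifiable> found = service.findById(entityId);
        if (found.isEmpty()) {
            return false; // unknown entity
        }
        Identifiable entity = found.get();
        return switch (permissionType) {
            case "read" -> entity.canRead(userId);
            case "update" -> entity.canUpdate(userId);
            default -> false;
        };
    }
}
```

Unknown entity types, missing entities and unrecognised permission types all fall through to a rejection, matching the reject-by-default stance of the Identifiable interface.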

When adding new controllers and security configurations, it is important to refer to existing patterns and ensure consistency. This also applies to tests where fixtures are provided to support the necessary automagic behaviours.

Filestore support

The storage module implements two file stores: one is referred to as the permanent store, the other as the ephemeral store. The permanent file store is for storing critical files, such as user uploads, that cannot be recovered if lost. The ephemeral store is for non-critical files that can be regenerated by the system either dynamically or via an event replay.

Our file store implementation automatically deduplicates files. Storing a file whose contents match a previously stored file will return a (new) file identifier mapping to the original. The newly stored duplicate is then silently removed.
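A minimal in-memory sketch of this behaviour is shown below. It is illustrative only: `DeduplicatingFileStore` is a hypothetical class, and detecting duplicates with a SHA-256 content hash is an assumption, since the text does not say how the starter kit compares file contents. The sketch also skips writing the duplicate at all rather than storing and then removing it.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class DeduplicatingFileStore {
    private final Map<String, byte[]> blobsByHash = new HashMap<>(); // stands in for the backing store
    private final Map<UUID, String> hashByFileId = new HashMap<>();  // file id -> content hash

    /** Always returns a new file id; the contents are only kept once per distinct hash. */
    UUID store(byte[] contents) {
        String hash = sha256(contents);
        blobsByHash.putIfAbsent(hash, contents.clone());
        UUID fileId = UUID.randomUUID();
        hashByFileId.put(fileId, hash);
        return fileId;
    }

    byte[] read(UUID fileId) {
        String hash = hashByFileId.get(fileId);
        if (hash == null) {
            throw new IllegalArgumentException("Unknown file: " + fileId);
        }
        return blobsByHash.get(hash).clone();
    }

    int storedBlobCount() {
        return blobsByHash.size();
    }

    private static String sha256(byte[] contents) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(contents);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

Callers never need to know whether a file was deduplicated, because every store call hands back a distinct identifier that resolves to the same contents.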

File stores need a backing service such as a blob store or filesystem. This starter kit supports an in-memory filestore, Mongo GridFS and AWS S3.

Configuring the In-Memory Filestore

The in-memory filestore backend is intended only for development and testing. This filestore is not distributed so will not work well when running multiple instances of the application in HA mode. A locally hosted or AWS hosted S3 compatible filestore is a better bet in this instance.

application.filestore.backend=inMemory

Configuring Mongo GridFS

Set the application property:

application.filestore.backend=mongoGridFs

Configuring AWS S3

Set the following application properties:

application.filestore.backend=awsS3
application.filestore.awsS3.buckets.permanent=sample-bucket-permanent
application.filestore.awsS3.buckets.ephemeral=sample-bucket-ephemeral

We rely on DefaultAWSCredentialsProviderChain and DefaultAwsRegionProviderChain for fetching AWS credentials and the AWS region.

Media support

The media module adds support for managing image and video updates. It generates thumbnail images on the fly, caching them in the ephemeral file store for subsequent requests. Thumbnail sizes are limited to prevent the system from being overwhelmed by malicious requests.

Project Info

Maintainers

@sluehr, @ywangd

Contributing

We appreciate your help!

Open an issue or submit a pull request for an enhancement. You may want to view the project board or browse through the current open issues.

License

Talk to us at hi@everest.engineering.
