aws / aws-cdk-rfcs

RFCs for the AWS CDK

EventBridge Pipes L2 Construct

RaphaelManke opened this issue · comments

Description

Amazon EventBridge Pipes (User Guide, CloudFormation resource) enables connections between several AWS services, with filtering, transformation, and enrichment capabilities that make connecting AWS services much easier.

However, there is no L2 support for this new AWS feature yet.
This results in a user experience that could be improved to match the experience the AWS console offers.

The current L1 CFN constructs give the user no hint as to which services can be connected and what needs to be done, for example with regard to IAM permissions.
The AWS console provides a nice UI that is split into four phases:

[image: pipes overview]

The source, enrichment and target each have a dropdown list of possible options.
On top of that, the console creates an IAM policy that is needed to access all the configured sources.

The current L1 construct API has no type safety and gives the user no clue which sources, enrichments and targets can be used. On top of that, users have to create IAM roles and permissions themselves.

Example of an L1 construct connecting two SQS queues

const pipeRole = new Role(this, "PipeRole", {
  assumedBy: new ServicePrincipal("pipes.amazonaws.com"),
});

const sourceQueue = new Queue(this, "SourceQueue");
const targetQueue = new Queue(this, "TargetQueue");

const pipe = new CfnPipe(this, "MyPipe", {
  roleArn: pipeRole.roleArn,
  source: sourceQueue.queueArn,
  target: targetQueue.queueArn,
});

// The user has to wire up all IAM permissions manually.
sourceQueue.grantConsumeMessages(pipeRole);
targetQueue.grantSendMessages(pipeRole);

I'd suggest building an L2 construct that gives the user guidance on how to build pipes.

Possible class diagram

classDiagram 
  direction LR
  
  class Pipe {
    source PipeSource
    target PipeTarget
    filter? PipeFilter
    enrichment? PipeEnrichment
  }
  
  Pipe --> PipeSource
  Pipe --> PipeTarget
  Pipe --> PipeFilter
  Pipe --> PipeEnrichment
  
  PipeSource --> DynamoDBSource
  PipeSource --> KinesisSource
  PipeSource --> AmazonMqSource
  PipeSource --> AmazonMSKSource
  PipeSource --> SelfManagedKafkaSource
  PipeSource --> SqsSource
  
  PipeTarget --> ApiDestinationTarget
  PipeTarget --> ApiGatewayTarget
  PipeTarget --> BatchJobQueueTarget
  PipeTarget --> CloudwatchLoggroupTarget
  PipeTarget --> EcsTaskTarget
  PipeTarget --> AllOtherTarget
  
  
  class PipeFilter {
    fromObject()
    fromString()
  }
  
  class PipeEnrichment 
  
  PipeEnrichment --> ApiDestinationEnrichment
  PipeEnrichment --> ApiGatewayEnrichment
  PipeEnrichment --> LambdaEnrichment
  PipeEnrichment --> StepFunctionEnrichment

Example usage of the L2 construct

const sourceQueue = new Queue(this, "SourceQueue");
const targetQueue = new Queue(this, "TargetQueue");
const lambdaFunction = new NodejsFunction(this, "LambdaFunction");

const pipe = new Pipe(this, "MyPipe", {
  source: new SqsSource(sourceQueue),
  target: new SqsTarget(targetQueue, {
    inputTemplate: JSON.stringify({
      body: "<$.body>",
      messageId: "<$.messageId>",
      messageAttributes: "<$.messageAttributes>",
      nestedBody: {
        body: "<$.body>",
      },
    }),
  }),
  filter: PipeFilter.fromObject({ body: [{ prefix: "Test" }] }),
  enrichment: new LambdaEnrichment(lambdaFunction)
});

PoC implementation

https://github.com/RaphaelManke/aws-cdk-pipes-rfc-473

Roles

Role           | User
-------------- | -------------
Proposed by    | @RaphaelManke
Author(s)      | @RaphaelManke
API Bar Raiser | @mrgrain
Stakeholders   | @nikp

See RFC Process for details

Workflow

  • Tracking issue created (label: status/proposed)
  • API bar raiser assigned (ping us at #aws-cdk-rfcs if needed)
  • Kick off meeting
  • RFC pull request submitted (label: status/review)
  • Community reach out (via Slack and/or Twitter)
  • API signed-off (label api-approved applied to pull request)
  • Final comments period (label: status/final-comments-period)
  • Approved and merged (label: status/approved)
  • Execution plan submitted (label: status/planning)
  • Plan approved and merged (label: status/implementing)
  • Implementation complete (label: status/done)

The author is responsible for progressing the RFC according to this checklist, and
for applying the relevant labels to this issue so that the RFC table in the README
gets updated.

Pipe

AWS EventBridge Pipes is itself a fully managed service that does the heavy lifting of polling a source and filtering out payloads based on filter criteria. This reduces the number of target invocations and can reduce costs.
After filtering, the resulting events can be enriched in the enrichment phase of a Pipe. The result of the enrichment is then pushed to the target.
Before a payload is passed to the enrichment and the target, it can be transformed using an input transformation.
To give the EventBridge Pipe access to the services that are connected in a pipe, each Pipe assumes an IAM role. This role must have IAM policies attached to read from a source, invoke an enrichment service and finally push to a target service.

So a Pipe has the following components (shown in the diagram below).

Besides these core components, which are used while processing data, there are additional attributes that describe a Pipe:

  • Name
    • This is the (physical) identifier for the AWS resource, the actual Pipe. It is used in the ARN of the provisioned resource.
  • Description
    • This is a text field for humans to describe what the pipe does.
  • Tags
    • AWS tags for the resource
graph LR
classDef required fill:#00941b 
classDef optional fill:#5185f5

Source:::required
Filter:::optional
Enrichment_Input_Transformation[Input transformation]:::optional
Enrichment:::optional
Target_Input_Transformation[Input transformation]:::optional
Target:::required

Source --> Filter --> Enrichment_Input_Transformation --> Enrichment --> Target_Input_Transformation --> Target

Example implementation

interface PipeProps {
	readonly source: PipeSource;
	readonly target: PipeTarget;

	readonly filter?: PipeFilter;
	readonly enrichment?: PipeEnrichment;
	readonly role?: IRole; // role is optional; if not provided, a new role is created
	readonly description?: string;
	readonly tags?: Tags;
}

class Pipe extends Construct {
	readonly role: IRole;
	readonly source: PipeSource;
	readonly target: PipeTarget;

	readonly filter?: PipeFilter;
	readonly enrichment?: PipeEnrichment;
	readonly description?: string;
	readonly tags?: Tags;

	constructor(scope: Construct, id: string, props: PipeProps);
}

Open questions

  1. Should the input transformation be part of the PipeProps (alternative: a property of the PipeEnrichment and PipeTarget props)?
    1. Pro PipeProps:
      1. In case of a refactoring, for example replacing the target, the input transformation doesn't have to be touched/moved.
    2. Con PipeProps:
      1. An input transformation can occur twice in a Pipe definition. The naming needs to make clear for which phase the transformation is meant, e.g. EnrichmentInputTransformation and TargetInputTransformation.
      2. Setting an EnrichmentInputTransformation without a PipeEnrichment makes no sense and needs additional validation code. This can be omitted if the inputTransformation is a property of the PipeEnrichment or PipeTarget classes.
  2. Should the PipeFilter be part of the PipeSource property definition instead of an attribute on the Pipe class?
    1. Pro:
      1. The possible filter keys depend on the source.
      2. CloudFormation itself puts the FilterCriteria into the PipeSourceParameters.
    2. Con:
      1. To align with the AWS console, it should be on the same level as the source itself. Users who have tested pipes in the console can understand the API more easily.
      2. It would be more robust to future AWS changes, because the filter can always be defined based on the CloudFormation-generated type definitions and doesn't have to be explicitly built for a new source.

Source

A source is an AWS service that needs to be polled.
The following sources are possible:

  • DynamoDB stream
  • Kinesis stream
  • Amazon MQ broker
  • Amazon MSK topic
  • Self-managed Apache Kafka stream
  • SQS queue

The CfnPipe resource references the source only by its ARN. Right now there is no validation in the CDK framework that checks whether an ARN is valid or not.
To overcome this shortcoming, a PipeSource class representing a source is needed. This PipeSource is then implemented by all the supported sources.

export abstract class PipeSource {

	public readonly sourceArn: string;
	
	public readonly sourceParameters?:
	| CfnPipe.PipeSourceParametersProperty
	| IResolvable;
	
	constructor(
		sourceArn: string,
		props?: CfnPipe.PipeSourceParametersProperty,
	) {
		this.sourceArn = sourceArn;
		this.sourceParameters = props;
	}
	
	public abstract grantRead(grantee: IRole): void;
}

This PipeSource class has a sourceArn that is mapped to the CfnPipe sourceArn attribute.
The sourceParameters are the config options for the source. Depending on the source, these attributes are present under a different key. E.g. for an SQS queue the configuration attributes are:

{
	sqsQueueParameters : {...}
}

The specific source class implementation hides this detail from the user and provides an interface with only the configuration options that are possible for the specific source.

interface PipeSourceSqsQueueParametersProperty {
	readonly batchSize?: number;
	readonly maximumBatchingWindowInSeconds?: number;
}

This interface, for example, is provided by the CloudFormation specification and can be used as a base for possible configurations (additional validation can be added if useful).
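
For illustration, such validation could look like this (a minimal sketch; the concrete limits used here are assumptions, not values confirmed by the RFC, and would need to be checked against the SQS documentation):

function validateSqsSourceParameters(
	props: PipeSourceSqsQueueParametersProperty,
): void {
	// Assumed limits for illustration only: batch size 1-10,
	// batching window 0-300 seconds.
	const { batchSize, maximumBatchingWindowInSeconds } = props;
	if (batchSize !== undefined && (batchSize < 1 || batchSize > 10)) {
		throw new Error(`batchSize must be between 1 and 10, got ${batchSize}`);
	}
	if (
		maximumBatchingWindowInSeconds !== undefined &&
		(maximumBatchingWindowInSeconds < 0 || maximumBatchingWindowInSeconds > 300)
	) {
		throw new Error(
			`maximumBatchingWindowInSeconds must be between 0 and 300, got ${maximumBatchingWindowInSeconds}`,
		);
	}
}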

To be able to consume a source, the EventBridge Pipe uses an IAM role. This role needs a policy attached that allows reading from the source.
The grantRead method needs to be implemented for that purpose.
E.g. the SQS source can leverage the queue's L2 .grantConsumeMessages() method.

Example implementation

An example API for a source that polls an SQS queue could look like this:

export class SqsSource extends PipeSource {
	private queue: IQueue;

	constructor(queue: IQueue, props?: CfnPipe.PipeSourceSqsQueueParametersProperty) {
		super(queue.queueArn, { sqsQueueParameters: props });
		this.queue = queue;
	}

	public grantRead(grantee: IRole): void {
		this.queue.grantConsumeMessages(grantee);
	}
}

It takes an existing SQS queue and the polling properties that are possible for that kind of source, and implements a grantRead method which creates the required IAM policy for the Pipe role.
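
Usage could then look like this (a sketch based on the SqsSource class above; the parameter values are only examples):

const sourceQueue = new Queue(this, "SourceQueue");

// Poll up to 10 messages at a time, waiting at most 20 seconds per batch.
const source = new SqsSource(sourceQueue, {
	batchSize: 10,
	maximumBatchingWindowInSeconds: 20,
});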

Role

An IAM role is required that can be assumed by the pipes.amazonaws.com principal. This role needs IAM policies attached to read from a PipeSource, invoke a PipeEnrichment and push to a PipeTarget.
The user can bring their own role. If the user does not provide a role, a new one is created. In both cases the role should be exposed by the Pipe class, so that it is transparent to the user which role is used within the Pipe.
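
Inside the Pipe construct, the role handling could then be a simple fallback, with the grant methods of the source, enrichment and target wiring up the required policies (a sketch of the constructor body, not a final implementation):

// Use the role provided by the user, or create a new one.
const role =
	props.role ??
	new Role(this, "PipeRole", {
		assumedBy: new ServicePrincipal("pipes.amazonaws.com"),
	});

// Each stage attaches the policies it needs to the pipe role.
props.source.grantRead(role);
props.enrichment?.grantInvoke(role);
props.target.grantPush(role);

// Expose the role so it is transparent to the user which role is used.
this.role = role;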

Open questions

  1. How can we ensure that the Pipes service has access to encrypted sources and targets? The role or the pipes principal needs access to KMS.
  2. Can we allow IRole, or do we need to restrict it to Role only?
    1. We have to make sure the generated policies are attached to the role in both cases. If restricted to Role, this can easily be done by using the L2 construct methods of the role or of the source, enrichment or target, and passing the role along. If an IRole is provided, its policies cannot be extended.

Filter

A filter does pattern matching based on the incoming payload and the specified filter criteria. The matching works the same way as EventBridge patterns are matched.
The possible fields that can be used to filter incoming payloads depend on the source.

Example Implementation

The implementation is split into two types (see the usage sketch after the code below):

  1. Generic filter
    1. This filter is the basic class for defining a filter. It represents the CloudFormation filter specification 1:1.
  2. Source-specific filter
    1. This filter gives the user guidance on which attributes of the specific source a filter can be created. It then takes care of the actual data key, e.g. data, body, dynamodb (see docs).

interface IPipeFilterPattern {
	pattern: string;
}

class PipeGenericFilterPattern {
	static fromJson(patternObject: Record<string, any>) :IPipeFilterPattern {
		return { pattern: JSON.stringify(patternObject) };
	}
}

interface SqsMessageAttributes {
	messageId?: string;
	receiptHandle?: string;
	body?: any;
	attributes?: {
		ApproximateReceiveCount?: string;
		SentTimestamp?: string;
		SequenceNumber?: string;
		MessageGroupId?: string;
		SenderId?: string;
		MessageDeduplicationId?: string;
		ApproximateFirstReceiveTimestamp?: string;
	};
	messageAttributes?: any;
	md5OfBody?: string;
}

class PipeSqsFilterPattern extends PipeGenericFilterPattern {
	static fromSqsMessageAttributes(attributes: SqsMessageAttributes) :IPipeFilterPattern {
		return {
			pattern: JSON.stringify( attributes ),
		};

	}
}
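
Both filter variants could then be used like this (a usage sketch based on the classes above):

// Generic filter: the pattern can be copied 1:1 from the console's
// pattern sandbox, so new sources keep working.
const genericFilter = PipeGenericFilterPattern.fromJson({
	body: [{ prefix: "Test" }],
});

// Source-specific filter: the typed SQS message shape guides the user
// to the fields that exist on an SQS message.
const sqsFilter = PipeSqsFilterPattern.fromSqsMessageAttributes({
	body: [{ prefix: "Test" }],
});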

Target

A target is the end of the pipe. After the payload from the source is pulled, filtered and enriched, it is forwarded to the target.
For now, the following targets are supported:

  • API destination
  • API Gateway
  • Batch job queue
  • CloudWatch log group
  • ECS task
  • Event bus in the same account and Region
  • Firehose delivery stream
  • Inspector assessment template
  • Kinesis stream
  • Lambda function (SYNC or ASYNC)
  • Redshift cluster data API queries
  • SageMaker Pipeline
  • SNS topic
  • SQS queue
  • Step Functions state machine
    • Express workflows (ASYNC)
    • Standard workflows (SYNC or ASYNC)

The CfnPipe resource references the target only by its ARN. Right now there is no validation in the CDK framework that checks whether an ARN is valid or not.
To overcome this shortcoming, a PipeTarget class representing a target is needed. This PipeTarget is then implemented by all the supported targets.

The implementation is then similar to the Source implementation:

Example implementation

interface IPipeTarget {
	targetArn: string;
	targetParameters: CfnPipe.PipeTargetParametersProperty;
	
	grantPush(grantee: IRole): void;
}

export interface SqsTargetProps {
	queue: IQueue;
	sqsQueueParameters?: CfnPipe.PipeTargetSqsQueueParametersProperty;
}

export class SqsTarget implements IPipeTarget {
	private queue: IQueue;
	targetArn: string;
	targetParameters: CfnPipe.PipeTargetParametersProperty;

	constructor(props: SqsTargetProps) {
		this.queue = props.queue;
		this.targetArn = props.queue.queueArn;
		this.targetParameters = { sqsQueueParameters: props.sqsQueueParameters };
	}
	
	public grantPush(grantee: IRole): void {
		this.queue.grantSendMessages(grantee);
	}
}
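
Usage then mirrors the source side (a sketch; the messageGroupId parameter assumes a FIFO queue and mirrors the CloudFormation target parameters):

const targetQueue = new Queue(this, "TargetQueue");

// Plain target without extra parameters.
const target = new SqsTarget({ queue: targetQueue });

// Target with pass-through CloudFormation parameters, e.g. for FIFO queues.
const fifoTarget = new SqsTarget({
	queue: targetQueue,
	sqsQueueParameters: { messageGroupId: "orders" },
});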

Enrichment

In the enrichment step, the filtered payloads can be used to invoke one of the following services:

  • API destination
  • Amazon API Gateway
  • Lambda function
  • Step Functions state machine
    • only Express workflows

The invocation is a synchronous call to the service. The result of the enrichment step can then be combined with the filtered payload and forwarded to the target.
The enrichment has two main properties for all types of supported services:

  • enrichment ARN
  • input transformation

The enrichment ARN is the ARN of the AWS resource that should be invoked. The role must have access to invoke this ARN.
The input transformation is used to map values from the filter step output to the input of the enrichment step.
For API destination and API Gateway enrichments, additional request parameters such as headers and query params can be set. These properties can either be static, or dynamic based on the payload from the previous step or extracted from the input transformation.

Example implementation

export abstract class PipeEnrichment {
	public readonly enrichmentArn: string;
	public enrichmentParameters: CfnPipe.PipeEnrichmentParametersProperty;
	
	constructor( enrichmentArn: string, props: CfnPipe.PipeEnrichmentParametersProperty) {
		this.enrichmentParameters = props;
		this.enrichmentArn = enrichmentArn;
	}
	
	abstract grantInvoke(grantee: IRole): void;
}

export class LambdaEnrichment extends PipeEnrichment {
	private lambda : IFunction;
	
	constructor(lambda: IFunction, props: { inputTransformation?: PipeInputTransformation } = {}) {
		super(lambda.functionArn, { inputTemplate: props.inputTransformation?.inputTemplate });
		this.lambda = lambda;	
	}
	
	grantInvoke(grantee: IRole): void {
		this.lambda.grantInvoke(grantee);
	}
}
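
A usage sketch (PipeInputTransformation is described in the next section; the JSON path used here is only an example):

const enrichmentFunction = new NodejsFunction(this, "EnrichmentFunction");

// Invoke the Lambda with only the order id extracted from the SQS body.
const enrichment = new LambdaEnrichment(enrichmentFunction, {
	inputTransformation: PipeInputTransformation.fromJson({
		orderId: "<$.body.orderId>",
	}),
});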

Input Transformation

Input transformations are used to transform or extend payloads into a desired structure. This transformation mechanism can be used prior to the enrichment or target step.

There are two types of mappings. Both types can contain either static values or values from the output of the previous step. Additionally, there are a few values that come from the pipe itself (see the reservedVariables enum):

  • string
    • static
    • dynamic
  • json
    • static
    • dynamic

Example implementation

enum reservedVariables {
	PIPES_ARN = '<aws.pipes.pipe-arn>',
	PIPES_NAME = '<aws.pipes.pipe-name>',
	PIPES_TARGET_ARN = '<aws.pipes.target-arn>',
	PIPE_EVENT_INGESTION_TIME = '<aws.pipes.event.ingestion-time>',
	PIPE_EVENT = '<aws.pipes.event>',
	PIPE_EVENT_JSON = '<aws.pipes.event.json>'
}

type StaticString = string;
type JsonPath = `<$.${string}>`;
type KeyValue = Record<string, string | reservedVariables>;
type StaticJsonFlat = Record<string, StaticString| JsonPath | KeyValue >;
type InputTransformJson = Record<string, StaticString| JsonPath | KeyValue | StaticJsonFlat>;

type PipeInputTransformationValue = StaticString | InputTransformJson

export interface IInputTransformationProps {
	inputTemplate: PipeInputTransformationValue;
} 

export class PipeInputTransformation {
	static fromJson(inputTemplate: Record<string, any>): PipeInputTransformation {
		return new PipeInputTransformation({ inputTemplate });
	} 

	readonly inputTemplate: string;

	constructor(props: IInputTransformationProps) {
		this.inputTemplate = JSON.stringify(props.inputTemplate);
	}
}
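
For example, a template mixing static values, JSON paths and reserved variables could look like this (a sketch based on the types above):

// Static strings, dynamic JSON paths and pipe-level reserved variables
// can be combined in one template.
const transformation = PipeInputTransformation.fromJson({
	pipeName: reservedVariables.PIPES_NAME, // '<aws.pipes.pipe-name>'
	orderId: '<$.body.orderId>',            // dynamic value from the previous step
	orderSystem: 'online',                  // static value
});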

Open Questions

  1. The EventBridge L2 construct has an InputTransformation as well (see CDK docs). Should this be reused/extended?
  2. Should there be input transformation helpers that are specific to a source, similar to the source filters?

I am an engineer on the Pipes service team. Thanks for contributing this! I just came back from vacation and will allocate some time in the next couple of weeks to review this.

@nikp is there something I can do in the meantime? Should I start with a draft RFC pull request? Extend my PoC implementation?

@RaphaelManke I'm truly sorry for the delay. I did not mean to become a bottleneck. I've been unable to allocate time to this due to some other emergent issues. I am not on the CDK team; I am a customer of theirs as well. Please follow the process they recommend - per the checklist, I think a bar raiser needs to be assigned. That said, a draft RFC seems like the right move too.

I don't want to overpromise and underdeliver again but I will get back to provide feedback as soon as I can.

It's been a month, is there any news about this? It seems stuck in stage 2 for a few months now.

Yes there is 😃 @mrgrain got assigned as bar raiser. We will meet in the next weeks and discuss next steps.

After a kickoff this week with @mrgrain I created a PR for the RFC. #488

@RaphaelManke thanks for this, I just implemented CfnPipe this weekend.

The source (DynamoDB stream) and target (EventBridge) were pretty easy to set up; I used the same setup with .grantRead and .grantPut on a custom role.

The filter was also quite easy. The thing that tripped me up for a bit was how to transform the target input for EventBridge. I could not find a way to dynamically set the source and event detail from the event itself.

Would be helpful to understand what kind of default transformation Pipes is setting for each target.

I can provide some code later to explain better. Looking forward to having this as a L2.

Hi @RaphaelManke! Thanks for your contribution - an L2 construct for Pipes would be a great asset. @nikp asked me to take a look so as not to make you wait longer.

Would it be fair to separate the construct into several key concepts (which unsurprisingly match the Pipes stages)?

  1. Sources
  2. Filters
  3. Enrichers (really invoking an external function that can do more than enrich, e.g. it can also filter out events)
  4. Targets (with Input Filters)

Sources

I'll have to double-check, but I believe all sources have batchSize and maximumBatchingWindowInSeconds parameters, as those define the Pipes polling behavior. The remaining parameters are specific to the source. Do you think it would be possible (and useful) to separate these concepts?

Filters

The fields and expressions changing with the source are indeed a bit cumbersome (it also affects writing enrichers), but I'd be worried that trying to wrap this into classes for each source could end up becoming a liability when Pipes supports more sources. EventBridge rules use a common EventPatterns class for this reason.

Enrichment

Would you plan on providing wrappers for each kind (I saw Lambda in your repo, so I assume yes)?

Targets

I'd have to take a closer look, but I would favor reusing the Input Transformers from EB if possible. I'd be cautious again about tailoring them for each target type, for the same reason as the sources.

I'd be happy to have a live chat also.

Sources

I'll have to double-check, but I believe all sources have batchSize and maximumBatchingWindowInSeconds parameters, as those define the Pipes polling behavior.

Nice 😃 I didn't notice that yet. I updated the RFC PR to add this info.

The remaining parameters are specific to the source. Do you think it would be possible (and useful) to separate these concepts?

My idea would be that the source class constructors provide these source-specific attributes as constructor params.

Filters

The fields and expressions changing with the source are indeed a bit cumbersome (it also affects writing enrichers), but I'd be worried that trying to wrap this into classes for each source could end up becoming a liability when Pipes supports more sources. EventBridge rules use a common EventPatterns class for this reason.

Reusing would be a good idea; I am not sure if it matches 100%.
I would at least build a generic class (or static class method) that can take the JSON input from the AWS console, so that developers can use the pattern simulation there and copy and paste the result into their codebase. This also allows new sources.

A source-specific filter class would be an addition to make creating these patterns easier.

Targets

Reusing existing Input Transformers would be very time efficient, because this part is the trickiest of all due to the <> syntax, which is not JSON.

I am open to a live chat 😃 You can reach me on the CDK Slack or any other social media.

I just found the pipes settings section:

[image: Pipes settings]

But I don't know how to configure these things in CloudFormation.
@mrgrain, @nikp
Do you know, or can you find out, whether this should be possible, or is this config via CFN not possible right now?

Thanks for your great work and initiative @RaphaelManke

Legitimately my favorite PR of all time 👀

Just wanted to give you an update:
I published my current progress on the pipes implementation on npmjs:
https://www.npmjs.com/package/@raphaelmanke/aws-cdk-pipes-rfc
and here you can see some examples of how to use the construct:

https://github.com/RaphaelManke/aws-cdk-pipes-rfc-473/tree/main/e2e

I would be happy if you would play around with it and give feedback or report bugs.
I am mainly interested in how you like the API and whether it can be improved.

I am also happy to take PRs 😄

Note: this is a PoC implementation and subject to change. I would not recommend using it in production.

@RaphaelManke thanks for putting this up, will try it out!

Is it possible to set the detail-type and source dynamically from the event? I could not find a way to do it using CfnPipe, so maybe that's a hard limitation in CloudFormation.

"detail-type": <$.property>,
"source": <$.property>

Thanks for this example. I checked and now think I understand your question.
Let me try to rephrase it:

You want to use a value from a source/enrichment step as a value for the target invocation.
Let's say you have an order system which puts orders on an SQS queue, and you want to produce an event on EventBridge for each order.

Source: SQS
Target: EventBridge eventbus

Source Event:

{ 
  "orderSystem" : "offline", 
  "orderId" : "Order-123"
 }

this will be an SQS message in the format:

{
  "messageId": "...",
  "receiptHandle": "...",
  "body": "{ \n  \"orderSystem\" : \"offline\", \n  \"orderId\" : \"Order-123\"\n }",
  "attributes": {
    "ApproximateReceiveCount": "...",
    "SentTimestamp": "....",
    "SenderId": "...",
    "ApproximateFirstReceiveTimestamp": "..."
  },
  "messageAttributes": {},
  "md5OfBody": "...",
  "eventSource": "...",
  "eventSourceARN": "...",
  "awsRegion": "..."
}

the target event should be in the format:

{
    "version": "...",
    "id": "...",
    "detail-type": "newOrder", // <-- static string
    "source": "offline", // <-- dynamic from the source event
    "account": "...",
    "time": "...",
    "region": "...",
    "resources": [],
    "detail": {
          "orderId" : "Order-123" // <-- dynamic from the source event
    }
}

AFAIK this is not possible with the tools provided in the AWS console, because it lacks the capability to set the targetParameters (it only shows them when they are set).

But it is possible via the AWS API or cloudformation.

The solution requires two parts.
To set the detail object in the target, you have to provide an inputTemplate.
For this example it would look like this:

{
  "orderId" : <$.body.orderId>
}

which needs to be stringified to:

"{ \"orderId\" : <$.body.orderId>}"

The second part is that the pipe target takes an object called targetParameters, which has a key eventBridgeEventBusParameters.
Here you can set the other fields of an EventBridge target invocation, like the detail-type field.

For this example the parameters look like this:

{
    "EventBridgeEventBusParameters": {
        "Source": "$.body.orderSystem",
        "DetailType": "newOrder"
    }
}

This is described in the docs.
And in the CloudFormation docs, for example, you can see what can be set.

Although these docs are not very clear and lack examples.

In the CDK construct, you can already use the targetParameters like this:

const target = new EventBridgeEventBusTarget(targetEventBus, {
  source: '$.body.orderSystem',
  detailType: 'newOrder',
});

The inputTemplate is currently missing on the construct but will follow shortly (I just forgot to add this property).

You got it right, interesting will test this out!

I just found the pipes settings section:

[image: Pipes settings]

But I don't know how to configure these things in CloudFormation.
@mrgrain, @nikp
Do you know, or can you find out, whether this should be possible, or is this config via CFN not possible right now?

Is this maybe just for the DynamoDB Stream Source?

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-pipes-pipe-pipesourcedynamodbstreamparameters.html

Would be great to have this construct!