strimzi / strimzi-kafka-bridge

An HTTP bridge for Apache Kafka®

Logged HTTP response status code could be different from the actual one returned to the client

ppatierno opened this issue

Most of the calls in the bridge are executed asynchronously through Vert.x worker threads.
The only ones that are more synchronous are the creation and deletion of a consumer, because they just involve creating a new instance of a Kafka consumer and closing that instance.
That said, the HttpOpenApiOperation class provides a central place for logging the HTTP request and response:

    public void handle(RoutingContext routingContext) {
        this.logRequest(routingContext);
        this.process(routingContext);
        this.logResponse(routingContext);
    }

Here the process method is implemented differently for each OpenAPI operation (e.g. send, poll, consumer creation, deletion, commit and so on).
The process method is where the asynchronous work happens, so the subsequent logResponse can be executed before the processing has finished, and it then logs a misleading HTTP status code and message.
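The ordering problem can be reproduced without Vert.x. The following is a minimal sketch in plain Java, not the bridge's code: FakeResponse, PrematureLogSketch and simulate are hypothetical names, and a manually completed CompletableFuture stands in for the asynchronous Kafka call. It assumes the response status defaults to 200 until the handler sets it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class PrematureLogSketch {
    // Hypothetical stand-in for the HTTP response; assumed to default to 200.
    static class FakeResponse {
        int statusCode = 200;
    }

    // Same shape as handle(): the response is logged as soon as process() returns.
    static void handle(FakeResponse response, CompletableFuture<Void> pendingWork, List<String> log) {
        log.add("Request");
        process(response, pendingWork, log);
        log.add("Response: statusCode = " + response.statusCode);
    }

    // process() only registers a callback; the real status is set when the work completes.
    static void process(FakeResponse response, CompletableFuture<Void> pendingWork, List<String> log) {
        pendingWork.thenRun(() -> {
            response.statusCode = 204;
            log.add("***** Response: statuscode " + response.statusCode);
        });
    }

    // Runs one simulated request and returns the log lines in the order they were written.
    static List<String> simulate() {
        List<String> log = new ArrayList<>();
        FakeResponse response = new FakeResponse();
        CompletableFuture<Void> subscribe = new CompletableFuture<>();
        handle(response, subscribe, log); // the response is logged while the work is still pending
        subscribe.complete(null);         // the asynchronous work ends only now
        return log;
    }

    public static void main(String[] args) {
        simulate().forEach(System.out::println);
    }
}
```

In this sketch the status code is logged before the callback that sets it has run, which is the same ordering visible in the subscribe logs further down.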
As an example ...

[2022-12-16 10:58:05,687] INFO  <eateConsumer:95> [oop-thread-1] [1477153321] CREATE_CONSUMER Request: from 0:0:0:0:0:0:0:1:60676, method = POST, path = /consumers/my-group
[2022-12-16 10:58:05,701] INFO  <idgeEndpoint:145> [oop-thread-1] Created consumer my-consumer in group my-group
[2022-12-16 10:58:05,702] INFO  <HttpUtils   :33> [oop-thread-1] ***** [1477153321] Response: statuscode 200 message OK
[2022-12-16 10:58:05,704] INFO  <eateConsumer:95> [oop-thread-1] [1477153321] CREATE_CONSUMER Response:  statusCode = 200, message = OK
[2022-12-16 10:58:50,594] INFO  <leteConsumer:95> [oop-thread-1] [258152251] DELETE_CONSUMER Request: from 0:0:0:0:0:0:0:1:60676, method = DELETE, path = /consumers/my-group/instances/my-consumer
[2022-12-16 10:58:50,594] INFO  <idgeEndpoint:516> [oop-thread-1] HttpSinkBridgeEndpoint handle thread Thread[vert.x-eventloop-thread-1,5,main]
[2022-12-16 10:58:50,595] INFO  <idgeEndpoint:251> [oop-thread-1] Deleted consumer my-consumer from group my-group
[2022-12-16 10:58:50,595] INFO  <HttpUtils   :33> [oop-thread-1] ***** [258152251] Response: statuscode 204 message No Content
[2022-12-16 10:58:50,595] INFO  <leteConsumer:95> [oop-thread-1] [258152251] DELETE_CONSUMER Response:  statusCode = 204, message = No Content

The log lines marked with asterisks were added by me for this issue, at the point where the actual response is sent back to the HTTP client. As you can see, for creation and deletion the response is sent before it is logged by the above method, so the logged status code matches.

But let's consider another call, for example the subscribe ...

[2022-12-16 10:59:33,924] INFO  <subscribe   :95> [oop-thread-1] [1723390097] SUBSCRIBE Request: from 0:0:0:0:0:0:0:1:60676, method = POST, path = /consumers/my-group/instances/my-consumer/subscription
[2022-12-16 10:59:33,925] INFO  <idgeEndpoint:516> [oop-thread-1] HttpSinkBridgeEndpoint handle thread Thread[vert.x-eventloop-thread-1,5,main]
[2022-12-16 10:59:33,928] INFO  <idgeEndpoint:157> [oop-thread-1] Subscribe to topics [SinkTopicSubscription(topic=my-topic,partition=null)]
[2022-12-16 10:59:33,930] INFO  <subscribe   :95> [oop-thread-1] [1723390097] SUBSCRIBE Response:  statusCode = 200, message = OK
[2022-12-16 10:59:33,931] INFO  <afkaConsumer:968> [mer-thread-1] [Consumer clientId=my-consumer, groupId=my-group] Subscribed to topic(s): my-topic
[2022-12-16 10:59:33,933] INFO  <idgeEndpoint:396> [oop-thread-1] Subscribe handler thread Thread[vert.x-eventloop-thread-1,5,main]
[2022-12-16 10:59:33,934] INFO  <HttpUtils   :33> [oop-thread-1] ***** [1723390097] Response: statuscode 204 message No Content

From the asterisked log you can see that the real status code sent back to the HTTP client is 204, while our official log prints 200 even before the Subscribe handler is executed (that is, before the async subscribe call has ended).

We should fix this by taking the asynchronous behaviour into account and logging the actual HTTP response at the right time, once it has been sent.
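One possible direction, sketched in plain Java rather than against the bridge's real classes: let process return a future and attach the response logging to its completion instead of calling it right after process returns. FakeResponse, DeferredLogSketch and simulate are hypothetical names, and a manually completed CompletableFuture again stands in for the asynchronous Kafka call. In the bridge this might correspond to hooking the logging onto the end of the Vert.x response, but that mapping is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class DeferredLogSketch {
    // Hypothetical stand-in for the HTTP response; assumed to default to 200.
    static class FakeResponse {
        int statusCode = 200;
    }

    // process() returns a future that completes after the real status has been set and sent.
    static CompletableFuture<Void> process(FakeResponse response, CompletableFuture<Void> pendingWork, List<String> log) {
        return pendingWork.thenRun(() -> {
            response.statusCode = 204;
            log.add("***** Response: statuscode " + response.statusCode);
        });
    }

    // The response is logged from the completion callback, not right after process() returns.
    static void handle(FakeResponse response, CompletableFuture<Void> pendingWork, List<String> log) {
        log.add("Request");
        process(response, pendingWork, log)
            .whenComplete((ignored, error) -> log.add("Response: statusCode = " + response.statusCode));
    }

    // Runs one simulated request and returns the log lines in the order they were written.
    static List<String> simulate() {
        List<String> log = new ArrayList<>();
        FakeResponse response = new FakeResponse();
        CompletableFuture<Void> subscribe = new CompletableFuture<>();
        handle(response, subscribe, log); // only the request has been logged at this point
        subscribe.complete(null);         // the work ends, then the response is logged
        return log;
    }

    public static void main(String[] args) {
        simulate().forEach(System.out::println);
    }
}
```

Because whenComplete also runs on failure, a variant of this sketch could log error responses from the same callback.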