kafka-ops / julie

A solution to help you build automation and gitops in your Apache Kafka deployments. The Kafka gitops!

Handling of pagination of service accounts in Confluent Cloud broken

maxschorn opened this issue

Hi, we saw an error when executing Julie in our environment and after some investigation we found a bug that causes it.

Describe the bug
We get the below error when executing Julie:

com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "errors" (class com.purbon.kafka.topology.api.ccloud.response.ListServiceAccountResponse), not marked as ignorable (4 known properties: "data", "api_version", "kind", "metadata"])
 at [Source: (String)"***
  "errors": [
    ***
      "id": "xxx",
      "status": "400",
      "detail": "failed to base64 decode page cursor",
      "source": ***
    ***
  ]
***"; line: 2, column: 14] (through reference chain: com.purbon.kafka.topology.api.ccloud.response.ListServiceAccountResponse["errors"])
  at com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:61)
  at com.fasterxml.jackson.databind.DeserializationContext.handleUnknownProperty(DeserializationContext.java:987)
  at com.fasterxml.jackson.databind.deser.std.StdDeserializer.handleUnknownProperty(StdDeserializer.java:1974)
  at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperty(BeanDeserializerBase.java:1701)
  at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownVanilla(BeanDeserializerBase.java:1679)
  at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:330)
  at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
  at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:322)
  at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4593)
  at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3548)
  at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3516)
  at com.purbon.kafka.topology.utils.JSON.toObject(JSON.java:53)
  at com.purbon.kafka.topology.api.ccloud.CCloudApi.getListServiceAccounts(CCloudApi.java:162)
  at com.purbon.kafka.topology.api.ccloud.CCloudApi.listServiceAccounts(CCloudApi.java:121)
  at com.purbon.kafka.topology.serviceAccounts.CCloudPrincipalProvider.listServiceAccounts(CCloudPrincipalProvider.java:27)
  at com.purbon.kafka.topology.AbstractPrincipalManager.printCurrentState(AbstractPrincipalManager.java:112)
  at com.purbon.kafka.topology.JulieOps.run(JulieOps.java:217)
  at com.purbon.kafka.topology.JulieOps.run(JulieOps.java:227)
  at com.purbon.kafka.topology.CommandLineInterface.processTopology(CommandLineInterface.java:212)
  at com.purbon.kafka.topology.CommandLineInterface.run(CommandLineInterface.java:161)
  at com.purbon.kafka.topology.CommandLineInterface.main(CommandLineInterface.java:147)

We turned on debug logs and saw the following calls being made by Julie:

...
[DEBUG] 2022-08-19 09:35:12.405 [main] JulieHttpClient - method: GET response: (GET https://api.confluent.cloud/iam/v2/service-accounts?page_size=100) 200
[DEBUG] 2022-08-19 09:35:12.421 [main] JulieHttpClient - method: GET request.uri: https://api.confluent.cloud/iam/v2/service-accounts?page_token=<page_token>?page_size=100
...

We found that we have more than 100 service accounts in our Confluent Cloud org. As a result, the response contains a 'next' field in its 'metadata' block, and Julie then tries to fetch that next page.

The URL constructed for that follow-up request is malformed, which leads to the error above.
It contains two '?' separators (page_token and page_size are each introduced by '?') instead of one '?' followed by '&'.

We cannot work around this by increasing the page size, as 100 is the maximum page size of the API.

I guess this part is the cause of it:

url = nextUrl.replace(ccloudApiBaseUrl, "");
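
The debug log suggests that page_size is appended with '?' even when the 'next' URL already carries a query string. Below is a minimal sketch of one way to avoid that. The class and method names (NextPageUrl, toRelative, withPageSize) are hypothetical and are not taken from Julie's code base; the actual fix may look different.

```java
public final class NextPageUrl {
    private NextPageUrl() {}

    /** Strips the API base URL so the remainder can be handed to a client that prepends it again. */
    public static String toRelative(String nextUrl, String baseUrl) {
        return nextUrl.startsWith(baseUrl) ? nextUrl.substring(baseUrl.length()) : nextUrl;
    }

    /** Appends page_size with '&' when a query string is already present, with '?' otherwise. */
    public static String withPageSize(String url, int pageSize) {
        if (url.contains("page_size=")) {
            return url; // assumption: a server-provided link that already has it is left untouched
        }
        String separator = url.contains("?") ? "&" : "?";
        return url + separator + "page_size=" + pageSize;
    }

    public static void main(String[] args) {
        String base = "https://api.confluent.cloud";
        String next = base + "/iam/v2/service-accounts?page_token=abc"; // placeholder token
        // prints /iam/v2/service-accounts?page_token=abc&page_size=100
        System.out.println(withPageSize(toRelative(next, base), 100));
    }
}
```

With this approach the first request still gets "?page_size=100", while follow-up requests get "&page_size=100" after the page token.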

To Reproduce
Steps to reproduce the behavior:

  1. Have enough service accounts in Confluent Cloud, or set ccloud.service_account.query.page.size to a value small enough, that listing service accounts spans multiple pages
  2. Provide a topology which creates or clears some bindings so that listServiceAccounts gets called
  3. Use the following additional config:
security.protocol=SASL_SSL
ssl.endpoint.identification.algorithm=https
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="<CLUSTER_API_KEY>" \
  password="<CLUSTER_API_SECRET>";
ccloud.environment=<CLOUD_ENVIRONMENT>
ccloud.cluster.api.key=<CLUSTER_API_KEY>
ccloud.cluster.api.secret=<CLUSTER_API_SECRET>
ccloud.cloud.api.key=<CLOUD_API_KEY>
ccloud.cloud.api.secret=<CLOUD_API_SECRET>
topology.builder.ccloud.kafka.cluster.id=<CLUSTER_ID>
ccloud.cluster.url=<CLUSTER_REST_URL>
topology.builder.access.control.class = com.purbon.kafka.topology.roles.CCloudAclsProvider
ccloud.service_account.translation.enabled=false
julie.verify.remote.state=true
julie.http.retry.times=20
julie.http.retry.backoff.time.ms=30000
  4. Run Julie
  5. Julie will fail when it tries to list service accounts

Expected behavior
Julie should correctly handle a paginated response from the Confluent Cloud REST API v2, following the 'next' link in 'metadata' until no further page is returned.
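
As an illustration of the expected behavior, here is a rough sketch of a pagination loop that keeps following the 'next' link until a page has none. PagedListing, Page and the fetch function are hypothetical stand-ins for the HTTP call and response parsing; this is not how Julie is actually structured.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public final class PagedListing {
    private PagedListing() {}

    /** One page of results plus the link to the next page (null when there is none). */
    public static final class Page {
        final List<String> items;
        final String next;

        public Page(List<String> items, String next) {
            this.items = items;
            this.next = next;
        }
    }

    /** Follows next links until a page has none, collecting the items of every page. */
    public static List<String> listAll(String firstUrl, Function<String, Page> fetch) {
        List<String> all = new ArrayList<>();
        String url = firstUrl;
        while (url != null) {
            Page page = fetch.apply(url); // stand-in for the HTTP GET plus JSON parsing
            all.addAll(page.items);
            url = page.next;
        }
        return all;
    }

    public static void main(String[] args) {
        // Two fake pages keyed by URL, standing in for the remote API.
        Map<String, Page> pages = new HashMap<>();
        pages.put("/page1", new Page(List.of("sa-1", "sa-2"), "/page2"));
        pages.put("/page2", new Page(List.of("sa-3"), null));
        System.out.println(listAll("/page1", pages::get)); // prints [sa-1, sa-2, sa-3]
    }
}
```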

Runtime (please complete the following information):

  • Julie jar v4.2.5
  • Java 11
  • Confluent Cloud cluster

Thanks a lot for reporting this, @maxschorn. I was able to reproduce the error, and it will be fixed in the upcoming release.