zalando-incubator / clin

Cli for Nakadi for event types and subscriptions management

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Repartitioning via nakadi SQL does not work. Key is in wrong place in payload.

barry-hennessy opened this issue · comments

Expected Behavior

Creating a nakadi SQL query with repartition_parameters is possible when repartitioning is set in the spec (within outputEventType).

Actual Behavior

The repartitioning dict is successfully parsed, but output in the wrong place in the body of the POST to create the SQL query.

According to the Nakadi SQL documentation repartition_parameters is a key within output_event_type, not a top level key itself.

I've hacked a patch in to make the change and can confirm that the following change in JSON output works:

 {
   "id": "event-name.multiple-partitions",
   "sql": "SELECT * FROM \"event-name\" AS a",
   "envelope": false,
   "output_event_type": {
     "name": "event-name.multiple-partitions",
     "owning_application": "app
     "category": "business",
     "audience": "component-internal",
     "cleanup_policy": "delete",
-    "retention_time": 86400000
-  },
-  "repartition_parameters": {
-    "number_of_partitions": 6,
-    "partition_strategy": "hash",
-    "partition_key_fields": [
-      "config_sku"
-    ]
+    "retention_time": 86400000,
+    "repartition_parameters": {
+      "number_of_partitions": 6,
+      "partition_strategy": "hash",
+      "partition_key_fields": [
+        "config_sku"
+      ]
+    }
   }
}

Steps to Reproduce the Problem

  1. Add a repartitioning key to an event like so:
diff --git a/docs/examples/single/sql-query.yaml b/docs/examples/single/sql-query.yaml
index 5c2d0f2..7f5932b 100644
--- a/docs/examples/single/sql-query.yaml
+++ b/docs/examples/single/sql-query.yaml
@@ -13,6 +13,10 @@ spec:
    cleanup:
      policy: delete
      retentionTimeDays: 2
+    repartitioning:
+      partitionCount: 6
+      strategy: hash
+      keys: ["important_key"]
  auth:
    users:
      admins:

Side note: This may be good to add to the documentation/examples. It wasn't immediately clear how to specify this.

  1. Apply the spec
clin apply -e ... -t ... docs/examples/single/sql-query.yaml -v -p -X
  1. Note the location of repartition_parameters

  2. The event will be created, but with the default number of parameters, not the ones you specified in repartitioning.

Specifications

  • Version: clin, version 1.3.0
  • Platform: mac osX 10.15.7
  • Subsystem: -

The following change likely fixes the problem. Python is not my day-to-day language, so there may be a better/more pythonic way.

diff --git a/clin/clients/nakadi_sql.py b/clin/clients/nakadi_sql.py
index 46f7b80..8714a0a 100644
--- a/clin/clients/nakadi_sql.py
+++ b/clin/clients/nakadi_sql.py
@@ -120,7 +120,7 @@ def sql_query_to_payload(sql_query: SqlQuery) -> dict:
     }
 
     if sql_query.output_event_type.repartitioning:
-        payload["repartition_parameters"] = {
+        payload["output_event_type"]["repartition_parameters"] = {
             "number_of_partitions": sql_query.output_event_type.repartitioning.partition_count,
             "partition_strategy": str(
                 sql_query.output_event_type.repartitioning.strategy

The bug should be fixed in version 1.4.1. In case of further problems, please feel free to reopen this issue again.