Axual / ksml

Kafka Streams for Low Code Environments

Stream-Stream join issues

AlekseyLoktev83 opened this issue

Hi!
I'm trying to implement a stream-stream join with Avro values and I'm getting the following error:

2024-01-19T10:18:45,977Z ERROR io.axual.ksml.runner.KSMLRunner      An exception occurred while running KSML
io.axual.ksml.exception.KSMLTopologyException: Topology generation error: Error in topology: Join stream keyType is expected to be of type String
        at io.axual.ksml.operation.BaseOperation.topologyError(BaseOperation.java:101)
        at io.axual.ksml.operation.BaseOperation.checkType(BaseOperation.java:220)
        at io.axual.ksml.operation.BaseOperation.checkType(BaseOperation.java:215)
        at io.axual.ksml.operation.BaseOperation.checkType(BaseOperation.java:211)
        at io.axual.ksml.operation.JoinOperation.apply(JoinOperation.java:82)
        at io.axual.ksml.stream.KStreamWrapper.apply(KStreamWrapper.java:38)
        at io.axual.ksml.generator.TopologyGeneratorImpl.lambda$generate$0(TopologyGeneratorImpl.java:199)
        at java.base/java.util.HashMap.forEach(HashMap.java:1429)
        at io.axual.ksml.generator.TopologyGeneratorImpl.generate(TopologyGeneratorImpl.java:192)
        at io.axual.ksml.generator.TopologyGeneratorImpl.create(TopologyGeneratorImpl.java:96)
        at io.axual.ksml.KSMLTopologyGenerator.create(KSMLTopologyGenerator.java:63)
        at io.axual.ksml.runner.backend.KafkaBackend.<init>(KafkaBackend.java:100)
        at io.axual.ksml.runner.KSMLRunner.main(KSMLRunner.java:84)

My KSML pipeline:

streams:
  left_source:
    topic: ksml_sensordata_avro
    keyType: string
    valueType: avro:SensorData
  right_source:
    topic: ksml_sensordata_avro
    keyType: string
    valueType: avro:SensorData 
  join_result:
    topic: join_result
    keyType: string
    valueType: json

functions:
  joiner_func:
    type: valueJoiner
    code: |
      print('JOINING\n\t value1=' + str(value1) + '\n\t value2=' + str(value2))

      new_value={"left": value1, "right": value2 }
      print('JOINED sensordata=' + str(new_value))
    expression: new_value
    resultType: json

pipelines:
  join:
    from: left_source
    via:
      - name: joiner_funcing
        type: join
        stream: right_source
        valueJoiner: joiner_func
        window: 10s
    to: join_result
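
The joiner logic itself can be sanity-checked outside KSML as an ordinary Python function. This is only a minimal sketch: it assumes value1 and value2 arrive as dict-like objects (how KSML hands Avro values to Python is not shown in this issue), `join_values` is a hypothetical name, and the sample records and their field names are invented for illustration.

```python
def join_values(value1, value2):
    """Plain-Python mirror of joiner_func: wrap both sides in one dict."""
    return {"left": value1, "right": value2}


# Hypothetical sample records; the field names are illustrative only.
left = {"name": "sensor1", "value": "21"}
right = {"name": "sensor1", "value": "22"}

joined = join_values(left, right)
print("JOINED sensordata=" + str(joined))
```

If this behaves as expected, the joiner is probably not the cause of the topology error, which is raised before any record is processed.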

When I changed the source topics to 'ksml_sensordata_json' and the value types to 'string', the pipeline started and ran without errors.

KSML was built from the main branch. I'm using the Kafka backend.

Is KSML able to join streams with Avro values?

P.S.
There seems to be a typo in the documentation for the join operation: https://axual.github.io/ksml/operations.html#join.
Instead of the parameter 'duration' I had to use the parameter 'window'. With 'duration' I got the following error:
2024-01-19T10:42:22,889Z ERROR io.axual.ksml.execution.FatalError Description: Invalid value for parameter "timeDifference" (value was: null). It shouldn't be null.

I think I've found the error. On line 82 of JoinOperation.java, the other stream's value type is checked against the key type:

public StreamWrapper apply(KStreamWrapper input) {
    checkNotNull(valueJoiner, VALUEJOINER_NAME.toLowerCase());
    final var k = input.keyType();
    final var v = input.valueType();
    if (joinStream instanceof KStreamWrapper otherStream) {
        final var vo = otherStream.valueType();
        final var vr = streamDataTypeOf(firstSpecificType(valueJoiner, vo, v), false);
        // Line 82: vo is the other stream's VALUE type, yet it is compared with this stream's KEY type k
        checkType("Join stream keyType", vo, equalTo(k));
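
A small Python model of this check shows why Avro values are rejected while string values pass. It is only a sketch: `check_type`, `buggy_join_check`, `intended_join_check`, and `passes` are hypothetical stand-ins, types are modeled as plain strings, and the "intended" check (comparing the two key types) is an assumption about the expected behavior, not the actual fix.

```python
class TopologyError(Exception):
    """Stand-in for the topology exception seen in the stack trace."""


def check_type(description, actual, expected):
    # Raise when the checked type does not match the expected one.
    if actual != expected:
        raise TopologyError(f"{description} is expected to be of type {expected}")


def buggy_join_check(key_type, other_key_type, other_value_type):
    # Mirrors the quoted line: the other stream's VALUE type vs this stream's KEY type.
    check_type("Join stream keyType", other_value_type, key_type)


def intended_join_check(key_type, other_key_type, other_value_type):
    # Assumed intent: both streams must share the same key type.
    check_type("Join stream keyType", other_key_type, key_type)


def passes(check, *types):
    # Return True when the check raises no TopologyError.
    try:
        check(*types)
        return True
    except TopologyError:
        return False


# String keys, Avro values: the buggy check fails, as in the report.
print(passes(buggy_join_check, "string", "string", "avro:SensorData"))     # False
# String keys, string values: the buggy check passes only by coincidence.
print(passes(buggy_join_check, "string", "string", "string"))              # True
# Comparing key types accepts the Avro pipeline.
print(passes(intended_join_check, "string", "string", "avro:SensorData"))  # True
```

This matches the behavior described above: the pipeline only started once the value type happened to equal the key type.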

Thanks for reporting! I can look into this tomorrow and will report back to you.

A fix is available in a new MR.

@jeroenvandisseldorp thank you for the quick fix