influxdata / nifi-influxdb-bundle

InfluxDB Processors For Apache NiFi


Not able to use JsonPathReader

genehynson opened this issue · comments

Hi there - I'm running into issues using the JsonPathReader controller service with the PutInfluxDatabaseRecord_2 processor. Can you help me identify if I'm doing something wrong or if this is a bug?

FlowFile content:
{"m":"m_val", "f1":"f1_val", "t1":"t1_val"}

Flow:
[screenshot]

PutInfluxDatabaseRecord_2 Settings:
[screenshot: Screen Shot 2022-01-04 at 3.26.12 PM]

JsonPathReader settings:
[screenshot]

Error:

nifi          | java.lang.IllegalStateException: Cannot write FlowFile to InfluxDB because the required field 'f' is not present in Record.
nifi          | 	at org.influxdata.nifi.processors.RecordToPointMapper.findRecordField(RecordToPointMapper.java:229)
nifi          | 	at org.influxdata.nifi.processors.RecordToPointMapper.lambda$mapFields$0(RecordToPointMapper.java:133)
nifi          | 	at java.util.ArrayList.forEach(ArrayList.java:1259)
nifi          | 	at org.influxdata.nifi.processors.RecordToPointMapper.mapFields(RecordToPointMapper.java:131)
nifi          | 	at org.influxdata.nifi.processors.RecordToPointMapper.mapRecord(RecordToPointMapper.java:108)
nifi          | 	at org.influxdata.nifi.processors.RecordToPointMapper.mapRecordV2(RecordToPointMapper.java:102)
nifi          | 	at org.influxdata.nifi.processors.internal.FlowFileToPointMapperV2.mapRecord(FlowFileToPointMapperV2.java:82)
nifi          | 	at org.influxdata.nifi.processors.internal.AbstractFlowFileToPointMapper.mapInputStream(AbstractFlowFileToPointMapper.java:128)
nifi          | 	at org.influxdata.nifi.processors.internal.AbstractFlowFileToPointMapper.mapFlowFile(AbstractFlowFileToPointMapper.java:85)
nifi          | 	at org.influxdata.nifi.processors.internal.FlowFileToPointMapperV2.addFlowFile(FlowFileToPointMapperV2.java:75)
nifi          | 	at org.influxdata.nifi.processors.PutInfluxDatabaseRecord_2.onTrigger(PutInfluxDatabaseRecord_2.java:169)
nifi          | 	at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
nifi          | 	at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1202)
nifi          | 	at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
nifi          | 	at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:103)
nifi          | 	at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
nifi          | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
nifi          | 	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
nifi          | 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
nifi          | 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
nifi          | 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
nifi          | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
nifi          | 	at java.lang.Thread.run(Thread.java:748)

The JsonTreeReader controller service works fine, but I want to use JsonPathReader so I can query properties in a complex JSON object.

And according to this comment I believe this should be possible.

This feels like a pretty trivial example so I'm hoping someone can point out what I'm doing wrong - thanks!

NiFi Version: 1.14.0

FWIW I've also tried creating an Avro schema and setting it in the Schema Text property of JsonPathReader, and got the same result. Here's the schema I used:

{
  "type": "record",
  "name": "mqtt_schema",
  "namespace": "mqtt.nifi",
  "doc:" : "ARVO schema for mqtt",
  "fields": [
    { "name": "m", "type": "string" },
    { "name": "f1", "type": "string" },
    { "name": "t1", "type": "string" }
  ]
}

Hi @genehynson,

thanks for using our bundle.

Your configuration looks right...

Can you check what the incoming message to PutInfluxDatabaseRecord_2 looks like?
Can you export and share your NiFi's flow?

Regards

Hi @genehynson, you are using the wrong field names in your schema. The schema field names must correspond to the fields you configured in PutInfluxDatabaseRecord_2.

The correct schema is:

{
  "type": "record",
  "name": "mqtt_schema",
  "namespace": "mqtt.nifi",
  "doc:" : "ARVO schema for mqtt",
  "fields": [
    { "name": "m", "type": "string" },
    { "name": "f", "type": "string" },
    { "name": "t", "type": "string" }
  ]
}

I assume that you want to map the field "f1" from the MQTT JSON message into a field named "f" in InfluxDB.
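To illustrate the mapping, here is a minimal Python sketch (not the processor's actual code) of how a record whose schema names match the processor's Measurement/Fields/Tags settings ("m", "f", "t" here) ends up as InfluxDB line protocol, and why a missing field aborts the write:

```python
# Simplified sketch of PutInfluxDatabaseRecord_2's record-to-point mapping.
# Helper name and structure are illustrative, not the bundle's real API.
def record_to_line_protocol(record, measurement_key, field_keys, tag_keys):
    measurement = record[measurement_key]
    tags = ",".join(f"{k}={record[k]}" for k in tag_keys if k in record)
    fields = ",".join(f'{k}="{record[k]}"' for k in field_keys if k in record)
    if not fields:
        # Mirrors the IllegalStateException from the stack trace above:
        # a configured field is not present in the record.
        raise ValueError("required field is not present in Record")
    prefix = f"{measurement},{tags}" if tags else measurement
    return f"{prefix} {fields}"

# With the corrected schema, JsonPathReader produces a record whose keys
# are "m", "f", "t", so the mapping succeeds:
record = {"m": "m_val", "f": "f1_val", "t": "t1_val"}
print(record_to_line_protocol(record, "m", ["f"], ["t"]))
# -> m_val,t=t1_val f="f1_val"
```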

Hi @rhajek, thanks for the tip! Yeah, that did indeed solve the issue.

Question: Does the InfluxDB bundle not support the "Infer Schema" setting?

@genehynson, I have tested the "Infer Schema" option with the JsonTreeReader and it works fine for simple JSON without additional configuration. You only need to specify in PutInfluxDatabaseRecord_2 which record fields will be used as the InfluxDB measurement, fields, and tags.

JsonPathReader requires a JsonPath expression for each field.

[screenshot]
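For reference, the JsonPathReader dynamic properties for the payload in this thread would look something like the sketch below: each property name becomes a record field, and its value is a JsonPath expression into the incoming JSON.

```
Property name : JsonPath expression
m             : $.m
f             : $.f1
t             : $.t1
```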

Hey @rhajek, thanks for checking. I'm familiar with JsonTreeReader but want to use JsonPathReader so I can support more complex JSON objects.

Whenever I specify "infer schema" with JsonPathReader, all my FlowFiles are dropped. I believe I have everything configured correctly but please correct me if you see something wrong.

This is the error I get:
[screenshot]

This is my PutInfluxDatabaseRecord_2 processor configuration:
[screenshot: Screen Shot 2022-01-06 at 8.48.43 AM]

This is my JsonPathReader configuration:
[screenshot]

This is the message payload:

{"m":"m_val", "f1":"f1_val", "t1":"t1_val"}

And here is my flow definition:
testJson.json.zip

Hi @genehynson,

I did some testing/debugging around "Infer Schema" and found that it works differently from what we expected. The schema is created automatically from the incoming JSON and contains only the fields present in that JSON (with the same names).

Adding a new property in JsonPathReader does not add that property to the schema. This explains why only fields with the same name are correctly mapped into record fields. In your example the inferred schema contains only the fields "m", "f1", and "t1"; the mappings f -> $.f1 and t -> $.t1 do nothing and are ignored. Renaming fields is not possible with an inferred schema.
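The behaviour described above can be sketched in a few lines of Python (an illustration of the logic, not NiFi's actual schema-inference code):

```python
# Schema inference derives field names from the incoming JSON keys only,
# so extra JsonPathReader properties never make it into the schema.
incoming = {"m": "m_val", "f1": "f1_val", "t1": "t1_val"}

inferred_schema = set(incoming.keys())          # {"m", "f1", "t1"}
jsonpath_mappings = {"f": "$.f1", "t": "$.t1"}  # ignored by the inferred schema

# Field that PutInfluxDatabaseRecord_2 was configured to read:
required = ["f"]
missing = [name for name in required if name not in inferred_schema]
print(missing)  # -> ['f']: the record has no "f", so the write fails
```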

There are several ways to work around this:

  1. Use a custom schema such as:
{
  "type": "record",
  "name": "mqtt_schema",
  "namespace": "mqtt.nifi",
  "doc:" : "ARVO schema for mqtt",
  "fields": [
    { "name": "m", "type": "string" },
    { "name": "f", "type": "string" },
    { "name": "t", "type": "string" }
  ]
}

Then the mapping f -> $.f1 will work.

  2. Use the JoltTransformJSON processor to convert the JSON into flat JSON with the final field names.

  3. Depending on your use case, a custom-written JSON transformation processor may be more flexible.
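For the JoltTransformJSON approach, a shift spec that renames "f1" to "f" and "t1" to "t" might look like the following sketch (verify it against your actual payload):

```json
[
  {
    "operation": "shift",
    "spec": {
      "m": "m",
      "f1": "f",
      "t1": "t"
    }
  }
]
```

Applied to `{"m":"m_val", "f1":"f1_val", "t1":"t1_val"}`, this produces `{"m":"m_val", "f":"f1_val", "t":"t1_val"}`, which an inferred schema can then map directly.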

Ah, very interesting. Thanks for investigating further! I think we'll go with option 1, creating an Avro schema. Thanks again for the quick responses.