Not able to use JsonPathReader
genehynson opened this issue
Hi there - I'm running into issues using the JsonPathReader controller service with the PutInfluxDatabaseRecord_2 processor. Can you help me identify if I'm doing something wrong or if this is a bug?
FlowFile content:
{"m":"m_val", "f1":"f1_val", "t1":"t1_val"}
PutInfluxDatabaseRecord_2 Settings:
Error:
nifi | java.lang.IllegalStateException: Cannot write FlowFile to InfluxDB because the required field 'f' is not present in Record.
nifi | at org.influxdata.nifi.processors.RecordToPointMapper.findRecordField(RecordToPointMapper.java:229)
nifi | at org.influxdata.nifi.processors.RecordToPointMapper.lambda$mapFields$0(RecordToPointMapper.java:133)
nifi | at java.util.ArrayList.forEach(ArrayList.java:1259)
nifi | at org.influxdata.nifi.processors.RecordToPointMapper.mapFields(RecordToPointMapper.java:131)
nifi | at org.influxdata.nifi.processors.RecordToPointMapper.mapRecord(RecordToPointMapper.java:108)
nifi | at org.influxdata.nifi.processors.RecordToPointMapper.mapRecordV2(RecordToPointMapper.java:102)
nifi | at org.influxdata.nifi.processors.internal.FlowFileToPointMapperV2.mapRecord(FlowFileToPointMapperV2.java:82)
nifi | at org.influxdata.nifi.processors.internal.AbstractFlowFileToPointMapper.mapInputStream(AbstractFlowFileToPointMapper.java:128)
nifi | at org.influxdata.nifi.processors.internal.AbstractFlowFileToPointMapper.mapFlowFile(AbstractFlowFileToPointMapper.java:85)
nifi | at org.influxdata.nifi.processors.internal.FlowFileToPointMapperV2.addFlowFile(FlowFileToPointMapperV2.java:75)
nifi | at org.influxdata.nifi.processors.PutInfluxDatabaseRecord_2.onTrigger(PutInfluxDatabaseRecord_2.java:169)
nifi | at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
nifi | at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1202)
nifi | at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
nifi | at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:103)
nifi | at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
nifi | at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
nifi | at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
nifi | at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
nifi | at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
nifi | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
nifi | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
nifi | at java.lang.Thread.run(Thread.java:748)
The JsonTreeReader controller service works fine, but I want to use JsonPathReader so I can query properties of a complex JSON object. According to this comment, I believe this should be possible.
This feels like a pretty trivial example, so I'm hoping someone can point out what I'm doing wrong - thanks!
NiFi Version: 1.14.0
FWIW, I've also tried creating an Avro schema and setting it in the Schema Text property of JsonPathReader, and got the same result. Here's the schema I used:
{
  "type": "record",
  "name": "mqtt_schema",
  "namespace": "mqtt.nifi",
  "doc": "Avro schema for mqtt",
  "fields": [
    { "name": "m", "type": "string" },
    { "name": "f1", "type": "string" },
    { "name": "t1", "type": "string" }
  ]
}
Hi @genehynson,
thanks for using our bundle.
Your configuration looks right...
Can you check what the incoming message into PutInfluxDatabaseRecord_2 looks like?
Can you export and share your NiFi's flow?
Regards
Hi @genehynson, you are using the wrong field names in your schema. The schema field names must correspond to the fields you want to use in PutInfluxDatabaseRecord_2.
The correct schema is:
{
  "type": "record",
  "name": "mqtt_schema",
  "namespace": "mqtt.nifi",
  "doc": "Avro schema for mqtt",
  "fields": [
    { "name": "m", "type": "string" },
    { "name": "f", "type": "string" },
    { "name": "t", "type": "string" }
  ]
}
I assume that you want to map the field "f1" from the MQTT JSON message to a field named "f" in InfluxDB.
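For illustration, the JsonPathReader dynamic properties that pair with this schema would look roughly like this (an assumed configuration, not copied from your screenshots; each property name is a record field from the schema and its value is the JsonPath evaluated against the incoming JSON):
m = $.m
f = $.f1
t = $.t1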
Hi @rhajek, thanks for the tip! Yeah, that did indeed solve the issue.
Question: does the InfluxDB bundle not support the "Infer Schema" setting?
@genehynson, I have tested the "Infer Schema" option with the JsonTreeReader and it works fine for simple JSON without any additional configuration. You only need to specify in PutInfluxDatabaseRecord_2 which record fields will be used as the InfluxDB measurement, fields and tags.
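For example, assuming the processor is configured with m as the measurement, f1 as a field and t1 as a tag (an assumption based on the field names in this thread, not taken from your screenshots, and assuming the measurement is resolved from the value of the m record field), the sample message would roughly become this line-protocol point:
m_val,t1=t1_val f1="f1_val"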
JsonPathReader requires a JsonPath expression for each field.
Hey @rhajek, thanks for checking. I'm familiar with JsonTreeReader but want to use JsonPathReader so I can support more complex JSON objects.
Whenever I specify "Infer Schema" with JsonPathReader, all my FlowFiles are dropped. I believe I have everything configured correctly, but please correct me if you see something wrong.
This is my PutInfluxDatabaseRecord_2 processor configuration:
This is my JsonPathReader configuration:
This is the message payload:
{"m":"m_val", "f1":"f1_val", "t1":"t1_val"}
And here is my flow definition:
testJson.json.zip
Hi @genehynson,
I did some testing/debugging around "Infer Schema" and found that it works differently from what we expected. The schema is created automatically from the incoming JSON and contains only the fields that are present in the JSON (with the same names).
Adding a new property in JsonPathReader does not add that property to the schema. This explains why only fields with the same name are correctly mapped into record fields. In your example the inferred schema contains only the fields "m", "f1", "t1". The mappings f -> $.f1 and t -> $.t1 do nothing and are ignored. Renaming fields is not possible with an inferred schema.
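To make this concrete, the schema inferred from your sample message is roughly equivalent to the following (a sketch of the effective schema; the reader builds it internally rather than from Avro text, and the record name here is made up):
{
  "type": "record",
  "name": "inferred",
  "fields": [
    { "name": "m", "type": "string" },
    { "name": "f1", "type": "string" },
    { "name": "t1", "type": "string" }
  ]
}
Because no field named "f" exists in that schema, the processor's lookup for 'f' fails with the error shown above.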
There are several options to work around this:
- Use a custom schema like
{
  "type": "record",
  "name": "mqtt_schema",
  "namespace": "mqtt.nifi",
  "doc": "Avro schema for mqtt",
  "fields": [
    { "name": "m", "type": "string" },
    { "name": "f", "type": "string" },
    { "name": "t", "type": "string" }
  ]
}
Then the mapping f -> $.f1 will work.
- Use the JoltTransformJSON processor to convert the JSON into a flat JSON with the final field names (see the Jolt spec sketch after this list).
- Depending on your use case, a custom-written JSON transformation processor may be more flexible.
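For the JoltTransformJSON option, a shift spec along these lines (a sketch, not taken from this thread) would rename f1 and t1 to match the schema fields f and t:
[
  {
    "operation": "shift",
    "spec": {
      "m": "m",
      "f1": "f",
      "t1": "t"
    }
  }
]
Applied to the sample message this produces {"m":"m_val", "f":"f1_val", "t":"t1_val"}, which then maps cleanly without any field renaming in the reader.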
Ah, very interesting. Thanks for investigating further! I think we'll go with option 1 - creating an Avro schema. Thanks again for the quick responses.