synthetichealth / synthea

Synthetic Patient Population Simulator

Home Page: https://synthetichealth.github.io/synthea

NullPointerException seen in various executions

arvindshmicrosoft opened this issue · comments

What happened?

I'm seeing various instances of NullPointerException in executions for different cities (1152 out of the 43944 cities in demographics.csv), each with a stack trace like one of the examples below. The code base I built the JAR from corresponds to commit 6192ed7 (latest as of 2023-10-07).

Pattern 1

Here is one call stack pattern:

java.lang.NullPointerException: Cannot invoke "org.mitre.synthea.engine.State.clone()" because the return value of "java.util.Map.get(Object)" is null
	at org.mitre.synthea.engine.Module.process(Module.java:415)
	at org.mitre.synthea.engine.Module.process(Module.java:368)
	at org.mitre.synthea.engine.State$CallSubmodule.process(State.java:270)
	at org.mitre.synthea.engine.State.run(State.java:193)
	at org.mitre.synthea.engine.Module.process(Module.java:411)
	at org.mitre.synthea.engine.Module.process(Module.java:368)
	at org.mitre.synthea.engine.State$CallSubmodule.process(State.java:270)
	at org.mitre.synthea.engine.State.run(State.java:193)
	at org.mitre.synthea.engine.Module.process(Module.java:411)
	at org.mitre.synthea.engine.Module.process(Module.java:368)
	at org.mitre.synthea.engine.State$CallSubmodule.process(State.java:270)
	at org.mitre.synthea.engine.State.run(State.java:193)
	at org.mitre.synthea.engine.Module.process(Module.java:411)
	at org.mitre.synthea.engine.Module.process(Module.java:368)
	at org.mitre.synthea.engine.Generator.updatePerson(Generator.java:712)
	at org.mitre.synthea.engine.Generator.createPerson(Generator.java:668)
	at org.mitre.synthea.engine.Generator.generatePerson(Generator.java:478)
	at org.mitre.synthea.engine.Generator.lambda$run$3(Generator.java:383)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

Full (sample) log for a reasonably sized population: Headland, Alabama.

A local repro under debugger with the above parameters shows that the NPE is thrown when trying to get the next state (which is Valve Surgery) when the current state is Priority_4_Next_Encounter_2. Maybe this gives you some clue?
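To illustrate the failure mode described above, here is a minimal sketch, not Synthea's actual Module code. It assumes a module keeps its states in a map keyed by name and looks up the next state by the name the current state transitions to. The class StateLookup and the method requireState are hypothetical names introduced for illustration. Failing with a descriptive message, instead of dereferencing the null returned by Map.get, would point directly at the transition that targets an undefined state.

```java
import java.util.HashMap;
import java.util.Map;

public class StateLookup {

    /**
     * Hypothetical helper: returns the state registered under nextName,
     * or fails with a message naming both ends of the broken transition.
     * States are modeled as plain strings here to keep the sketch small.
     */
    static String requireState(Map<String, String> states, String currentName, String nextName) {
        String next = states.get(nextName); // Map.get returns null for an unknown key
        if (next == null) {
            throw new IllegalStateException("State '" + currentName
                + "' transitions to '" + nextName
                + "', which is not defined in this module");
        }
        return next;
    }

    public static void main(String[] args) {
        // Illustrative module contents: the target state is deliberately missing.
        Map<String, String> states = new HashMap<>();
        states.put("Priority_4_Next_Encounter_2", "encounter state");

        try {
            requireState(states, "Priority_4_Next_Encounter_2", "Valve Surgery");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this, the error names the missing state rather than reporting only that Map.get returned null.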

Pattern 2

There is also a second, much more common pattern. For Anchorage, Alaska, a run produced 112,495 of these exceptions, with a slightly different call stack:

java.lang.NullPointerException: Cannot invoke "org.mitre.synthea.world.concepts.Employment.checkEmployment(org.mitre.synthea.world.agents.Person, long)" because the return value of "java.util.Map.get(Object)" is null
	at org.mitre.synthea.modules.LifecycleModule.process(LifecycleModule.java:126)
	at org.mitre.synthea.engine.Generator.updatePerson(Generator.java:712)
	at org.mitre.synthea.engine.Generator.createPerson(Generator.java:668)
	at org.mitre.synthea.engine.Generator.generatePerson(Generator.java:478)
	at org.mitre.synthea.engine.Generator.lambda$run$3(Generator.java:383)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

Full log for Anchorage, Alaska. Because of the large number of exceptions, only 38,738 of the requested 149,347 persons were generated.

A quick reproduction of this 2nd pattern seems to be possible with the following example:

java -Xmx10G -jar ~/synthea/build/libs/synthea-with-dependencies.jar --exporter.fhir.export false --exporter.csv.export true --exporter.baseDirectory '/var/data/tmp/Indiana Michigan City 6594' --generate.log_patients.detail none --exporter.clinical_note.export false -p 12217 Indiana "Michigan City"

From what I could gather from basic debugging, the call to person.attributes.get(Person.EMPLOYMENT_MODEL) (the frame at LifecycleModule.java:126 in the stack above) is returning null. Hopefully that is a useful clue.

Environment

OS:

Linux version 4.18.0-425.10.1.el8_7.x86_64 (mockbuild@dal1-prod-builder001.bld.equ.rockylinux.org) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-15) (GCC)) #1 SMP Thu Jan 12 16:32:13 UTC 2023

Java:

openjdk 11.0.20 2023-07-18 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.20.0.8-1) (build 11.0.20+8-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.20.0.8-1) (build 11.0.20+8-LTS, mixed mode, sharing)

Output of unzip -p synthea-with-dependencies.jar META-INF/MANIFEST.MF:

Manifest-Version: 1.0
Main-Class: App
Build-Timestamp: 2023-10-06T21:17:14.165+0000
Build-Version:
Created-By: Gradle 8.2.1
Build-Jdk: 11.0.20 (Red Hat, Inc. 11.0.20+8-LTS)
Build-OS: Linux amd64 4.18.0-477.27.1.el8_8.x86_64

I can see what is causing the issue for Pattern 2.

Michigan City, Indiana is in LaPorte County, at least according to our demographics.csv file. We don't have that county in our sdoh.csv file. The code I wrote to hedge against this case doesn't handle it properly.

A decent fix would be to check if the SDoH information is available. If it is not available for the county, set the probability of unemployment to a configurable value. I might be able to code it up later this week, but throwing it out there if anyone else wants to take a pass at it.
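The fix proposed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the change that was eventually merged. The class UnemploymentFallback, the map SDOH_UNEMPLOYMENT, and the constant DEFAULT_UNEMPLOYMENT are hypothetical names. The map stands in for the county rows loaded from sdoh.csv, and the numeric values are placeholders rather than real data.

```java
import java.util.HashMap;
import java.util.Map;

public class UnemploymentFallback {

    // Hypothetical default; a real fix would read this from configuration.
    static final double DEFAULT_UNEMPLOYMENT = 0.05;

    // Hypothetical stand-in for per-county unemployment rates from sdoh.csv.
    static final Map<String, Double> SDOH_UNEMPLOYMENT = new HashMap<>();

    /**
     * Returns the county's unemployment probability when SDoH data exists,
     * otherwise the supplied fallback, so callers never receive null.
     */
    static double unemploymentProbability(String county, double fallback) {
        Double rate = SDOH_UNEMPLOYMENT.get(county);
        return rate != null ? rate : fallback;
    }

    public static void main(String[] args) {
        // Placeholder row; "Example County" and 0.08 are illustrative only.
        SDOH_UNEMPLOYMENT.put("Example County", 0.08);

        // County with SDoH data: uses the stored rate.
        System.out.println(unemploymentProbability("Example County", DEFAULT_UNEMPLOYMENT));
        // County missing from the SDoH data: uses the configurable default.
        System.out.println(unemploymentProbability("LaPorte County", DEFAULT_UNEMPLOYMENT));
    }
}
```

The point of the design is that a county absent from the SDoH data still yields a usable probability, so downstream code has something to build the employment model from.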

Thank you so much, @eedrummer!

Closing as fixes were merged to address both patterns. Please reopen if you see it happening again. Thank you for the clear and super helpful reproduction scenarios.