jpmml / jpmml-lightgbm

Java library and command-line application for converting LightGBM models to PMML

Fail to convert lightgbm to pmml (NumberFormatException)

azaaza0319 opened this issue

I tried to run java -jar jpmml-lightgbm-executable-1.2-SNAPSHOT.jar --lgbm-input lightgbm.txt --pmml-output output.pmml to convert a LightGBM model to PMML format, but encountered the following exception:

Exception in thread "main" java.lang.NumberFormatException: null
	at java.lang.Integer.parseInt(Integer.java:542)
	at java.lang.Integer.parseInt(Integer.java:615)
	at org.jpmml.lightgbm.Section.getInt(Section.java:51)
	at org.jpmml.lightgbm.Tree.load(Tree.java:75)
	at org.jpmml.lightgbm.GBDT.load(GBDT.java:111)
	at org.jpmml.lightgbm.LightGBMUtil.loadGBDT(LightGBMUtil.java:59)
	at org.jpmml.lightgbm.LightGBMUtil.loadGBDT(LightGBMUtil.java:51)
	at org.jpmml.lightgbm.Main.run(Main.java:124)
	at org.jpmml.lightgbm.Main.main(Main.java:117)

Attached is the lightgbm model file.
lightgbm.txt

lightgbm version is 2.0.2.

Can someone please kindly help? Thanks much!
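
The trace above ends in Integer.parseInt being handed a null by Section.getInt while Tree.load is running, which is consistent with a tree block in the model file lacking a key that the converter expects. A minimal sketch for spotting such gaps follows. It assumes that each tree block starts with a "Tree=<index>" line and ends at a blank line; the function names and the list of required keys are hypothetical and should be adapted to the keys the converter actually reads (num_cat is the one tried later in this thread).

```python
from typing import Dict, List


def parse_tree_sections(model_text: str) -> List[Dict[str, str]]:
    """Split a LightGBM text model into per-tree key=value dictionaries.

    Assumption: a tree block starts with "Tree=<index>" and ends at a blank line.
    """
    trees: List[Dict[str, str]] = []
    current = None
    for raw in model_text.splitlines():
        line = raw.strip()
        if line.startswith("Tree="):
            current = {"Tree": line.split("=", 1)[1]}
            trees.append(current)
        elif not line:
            current = None  # a blank line closes the current tree block
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return trees


def missing_keys(model_text: str, required: List[str]) -> Dict[str, List[str]]:
    """Map tree index -> required keys that the tree block does not define."""
    report: Dict[str, List[str]] = {}
    for tree in parse_tree_sections(model_text):
        absent = [key for key in required if key not in tree]
        if absent:
            report[tree["Tree"]] = absent
    return report


if __name__ == "__main__":
    # Tiny made-up sample, not taken from the attached model file.
    sample = "Tree=0\nnum_leaves=2\n\nTree=1\nnum_leaves=2\nnum_cat=0\n"
    print(missing_keys(sample, ["num_leaves", "num_cat"]))
```

Running missing_keys over the contents of lightgbm.txt would show which tree blocks, if any, omit the listed keys. It only reports absences and does not say what the correct values would be.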

> lightgbm version is 2.0.2.

LightGBM 2.0.2 is a rather outdated version (nearly two years old). Have you tried any newer LightGBM version, such as 2.1.2 or 2.2.2?

Also, how was the model trained? Using the standalone LightGBM API/command-line application, or using the Scikit-Learn wrapper? There might be differences between frontends.

Anyway, JPMML-LightGBM includes integration tests, and they all pass cleanly:
https://github.com/jpmml/jpmml-lightgbm/blob/master/src/test/resources/main.py

What are you doing differently?

Thanks @vruusmann. The model is trained and maintained by others, so it is unlikely to be re-trained under a newer LightGBM version.
The model was trained using the Python Training API (not the Scikit-Learn wrapper).

Additionally, I tried adding num_cat=0 to each tree and re-ran the conversion, but got another error:

Exception in thread "main" java.lang.NullPointerException
	at org.jpmml.lightgbm.Tree.selectValues(Tree.java:225)
	at org.jpmml.lightgbm.Tree.encodeNode(Tree.java:151)
	at org.jpmml.lightgbm.Tree.encodeNode(Tree.java:186)
	at org.jpmml.lightgbm.Tree.encodeNode(Tree.java:187)
	at org.jpmml.lightgbm.Tree.encodeNode(Tree.java:186)
	at org.jpmml.lightgbm.Tree.encodeTreeModel(Tree.java:94)
	at org.jpmml.lightgbm.ObjectiveFunction.createMiningModel(ObjectiveFunction.java:66)
	at org.jpmml.lightgbm.BinomialLogisticRegression.encodeMiningModel(BinomialLogisticRegression.java:49)
	at org.jpmml.lightgbm.GBDT.encodeMiningModel(GBDT.java:287)
	at org.jpmml.lightgbm.GBDT.encodePMML(GBDT.java:276)
	at org.jpmml.lightgbm.Main.run(Main.java:131)
	at org.jpmml.lightgbm.Main.main(Main.java:117)

Do you have any suggestions? Thanks!

> I tried to add num_cat=0 for each tree and re-converted it,

The expression num_cat = 0 declares that the model does not specify any categorical splits, but as the stack trace shows, that declaration is wrong.
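
One way to check this before editing the file is to look for categorical splits in the tree blocks directly. The sketch below is hedged and rests on assumptions not confirmed in this thread: that each tree block starts with "Tree=<index>" and ends at a blank line, that it carries a "decision_type=" line with one integer per internal node, and that a categorical node is marked by an odd value (the low bit set). Verify these against the LightGBM source for the version that produced the model. The function name is hypothetical.

```python
from typing import Dict


def trees_with_categorical_splits(model_text: str) -> Dict[str, int]:
    """Map tree index -> number of nodes whose decision_type has the low bit set.

    Assumption: an odd decision_type value marks a categorical split.
    """
    counts: Dict[str, int] = {}
    tree_id = None
    for raw in model_text.splitlines():
        line = raw.strip()
        if line.startswith("Tree="):
            tree_id = line.split("=", 1)[1]
        elif not line:
            tree_id = None  # a blank line closes the current tree block
        elif tree_id is not None and line.startswith("decision_type="):
            values = line.split("=", 1)[1].split()
            flagged = sum(1 for value in values if int(value) & 1)
            if flagged:
                counts[tree_id] = flagged
    return counts


if __name__ == "__main__":
    # Tiny made-up sample, not taken from the attached model file.
    sample = "Tree=0\ndecision_type=0 0 2\n\nTree=1\ndecision_type=1 0 9\n"
    print(trees_with_categorical_splits(sample))
```

If the result is non-empty under these assumptions, setting num_cat=0 for those trees contradicts their contents, which would match the NullPointerException seen during encoding.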

I would advise checking out an older JPMML-LightGBM version (something from April-May 2017, such as tags 1.0.7, 1.0.8 or 1.0.9), and trying to "hack" it. The idea is that the current codebase follows the conventions of LightGBM 2.2.X, and is too complicated for early models.

Got it. Thanks! Will take a look at the previous versions.