smallrye / jandex

Java Annotation Indexer

Large classes result in IndexOutOfBoundsException

micheljung opened this issue · comments

I have a generated Kotlin data class that looks like this:

public data class Skills(
  public val foo: String,
  ...
  [262 properties in total]
  ...
  public val bar: Int
)

This leads to an IndexOutOfBoundsException here:

https://github.com/wildfly/jandex/blob/1f83eb3a9d235ad72b69bbb144f62113750b1667/src/main/java/org/jboss/jandex/Indexer.java#L1667-L1678

Caused by: java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 1675
	at org.jboss.jandex.Indexer.decodeUtf8EntryAsBytes(Indexer.java:1661)
	at org.jboss.jandex.Indexer.processMethodInfo(Indexer.java:335)
	at org.jboss.jandex.Indexer.index(Indexer.java:1965)
	at io.quarkus.deployment.steps.ApplicationIndexBuildStep$1.visitFile(ApplicationIndexBuildStep.java:38)
	at io.quarkus.deployment.steps.ApplicationIndexBuildStep$1.visitFile(ApplicationIndexBuildStep.java:27)
	at java.base/java.nio.file.Files.walkFileTree(Files.java:2804)
	at java.base/java.nio.file.Files.walkFileTree(Files.java:2876)
	at io.quarkus.deployment.steps.ApplicationIndexBuildStep.build(ApplicationIndexBuildStep.java:27)
	... 11 more

OK, got a reproducer. A data class with 255 properties (all of type String) is fine -- adding 1 more property (so that total is 256 properties) is enough to trigger the error.

So the problem is that a constructor has 256 parameters (or more) and each parameter is annotated (@NotNull). The bytecode format of the RuntimeInvisibleParameterAnnotations method attribute assumes that a method may have at most 255 annotated parameters, because it stores the number of annotated method parameters as a single byte (https://docs.oracle.com/javase/specs/jvms/se17/html/jvms-4.html#jvms-4.7.19). If you convert 256 to byte, you get the value 0. When Jandex encounters that 0, it believes that there are no annotated parameters, so it moves on to parse another method, even though the input stream still points to the method parameter annotation array, which has 256 entries. Everything else that follows is wrong.

(With 262 parameters, the byte value would be 6 rather than 0, but that makes no difference. Jandex would read 6 parameter annotation entries, believe that it's done with the method, and expect another method to follow.)
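
To make the truncation concrete, here is a minimal Java sketch. The class and method names are made up for illustration and are not part of Jandex; the only assumption is that a writer stores the count by keeping its low 8 bits, as the single-byte num_parameters item forces it to.

```java
// Sketch: what a reader sees when a parameter count is squeezed into one unsigned byte.
final class NumParametersTruncation {
    /** Returns the value left after the count is written as a single unsigned byte (u1). */
    static int storedCount(int actualCount) {
        return actualCount & 0xFF; // only the low 8 bits survive
    }

    public static void main(String[] args) {
        for (int actual : new int[] {255, 256, 262}) {
            System.out.println(actual + " parameters -> num_parameters = " + storedCount(actual));
        }
    }
}
```

Running it shows 255 staying 255, 256 becoming 0, and 262 becoming 6, which matches the two cases described above.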

Now, we actually know how many parameters the method has, because we've read its signature, so maybe we don't have to trust that single-byte number. Unfortunately, the JVM specification is very specific about this in the description of the num_parameters item:

There is no assurance that this number is the same as the number of parameter descriptors in the method descriptor.

The description of the parameter_annotations[] table also says:

The i'th entry in the parameter_annotations table may, but is not required to, correspond to the i'th parameter descriptor in the method descriptor (§4.3.3).

Finally, the specification of method descriptors says:

A method descriptor is valid only if it represents method parameters with a total length of 255 or less, where that length includes the contribution for this in the case of instance or interface method invocations. The total length is calculated by summing the contributions of the individual parameters, where a parameter of type long or double contributes two units to the length and a parameter of any other type contributes one unit.

That is, the JVM specification prohibits instance methods with more than 254 parameters and static methods with more than 255 parameters (assuming no long or double parameters; with those present, the limit is even lower). javac refuses to compile such methods, while Kotlin will happily generate invalid bytecode.
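
The length rule quoted above can be sketched in Java as follows. This is an illustration only: DescriptorLength, parameterUnits and isValid are hypothetical names, and the parser assumes it is handed a well-formed method descriptor.

```java
// Sketch: compute the "total length" of a method descriptor's parameters.
final class DescriptorLength {
    /** Sums parameter units for a descriptor such as "(ILjava/lang/String;J)V". */
    static int parameterUnits(String descriptor, boolean isStatic) {
        int units = isStatic ? 0 : 1; // instance methods also count `this`
        int i = 1;                    // skip the opening '('
        while (descriptor.charAt(i) != ')') {
            boolean array = false;
            while (descriptor.charAt(i) == '[') { // array dimensions
                array = true;
                i++;
            }
            char c = descriptor.charAt(i);
            if (c == 'L') {
                i = descriptor.indexOf(';', i); // jump to the end of the class name
            }
            // long and double take two units, unless they are array element types
            units += (!array && (c == 'J' || c == 'D')) ? 2 : 1;
            i++;
        }
        return units;
    }

    /** A descriptor is valid only if its total length is 255 or less. */
    static boolean isValid(String descriptor, boolean isStatic) {
        return parameterUnits(descriptor, isStatic) <= 255;
    }
}
```

For a constructor taking 256 String parameters this yields 257 units (256 plus one for this), so the descriptor is invalid under the rule above.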

On the other hand, Jandex has always been operating under the assumption that each item in the parameter_annotations table corresponds to a method parameter with the same index. (Which is a lot stricter than the previously quoted provision allows: "The i'th entry in the parameter_annotations table may, but is not required to, correspond to the i'th parameter descriptor in the method descriptor".) It seems most real-world .class generators have been providing this much stricter guarantee.

So I guess that if the number of parameters in the method descriptor is higher than 255, there's no harm in ignoring the single-byte parameter count and using the count from the method descriptor instead.
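
Roughly, the idea looks like the Java sketch below. It is not the code from the actual fix: the class, the method and its parameters are hypothetical, and it assumes the descriptor's parameter count has already been computed elsewhere.

```java
// Sketch: pick how many parameter_annotations entries to read for a method.
final class ParameterAnnotationCount {
    /**
     * @param numParametersByte        raw single-byte count read from the attribute
     * @param descriptorParameterCount number of parameters found in the method descriptor
     */
    static int effectiveCount(int numParametersByte, int descriptorParameterCount) {
        int stored = numParametersByte & 0xFF; // treat the byte as unsigned
        // A count above 255 cannot fit in one byte, so the stored value must have
        // been truncated; fall back to the descriptor in that case only.
        return descriptorParameterCount > 255 ? descriptorParameterCount : stored;
    }
}
```

With this, a stored 0 next to a 256-parameter descriptor gives 256, while ordinary methods keep using the stored value, so existing behavior for valid class files is unchanged.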

Submitted a fix here: #157

I also created an issue for Kotlin: https://youtrack.jetbrains.com/issue/KT-49426

Fixed in #157.

@Ladicek wow, good find. Regarding Jandex's assumption: there has been some improvement there (some done, some in the works), see #45 (comment)

I see there's also #48, but both #45 and #48 seem to be only superficially related. IIUC, #45 and #48 are about finding (and filtering out) synthetic parameters, while the issue here is about the mapping between parameter index and parameter annotation index. Perhaps they are more closely related than I can see now :-) I'll need to take a look at when synthetic parameters are generated and what that does to annotated parameters.