sksamuel / avro4s

Avro schema generation and serialization / deserialization for Scala

Use of GenericSerde results in "Method too large" error during scalac compilation

sabujkumarjena opened this issue · comments

```scala
// UnitFcyVisit, InvUnit, ArgoCarrierVisit and Key are domain case classes defined elsewhere.
import com.sksamuel.avro4s.kafka.GenericSerde
import org.apache.kafka.streams.kstream.{Materialized, ValueJoiner}
import org.apache.kafka.streams.scala.kstream.KTable

object Test {

  def valueJoiner2
      : ValueJoiner[(UnitFcyVisit, InvUnit), ArgoCarrierVisit, (UnitFcyVisit, InvUnit, ArgoCarrierVisit)] =
    (value1: (UnitFcyVisit, InvUnit), value2: ArgoCarrierVisit) => (value1._1, value1._2, value2)

  def joinUfvWithIuwithAcv(ufvWithIu: KTable[Key, (UnitFcyVisit, InvUnit)], acv: KTable[Key, ArgoCarrierVisit]) = {
    ufvWithIu
      .join(
        acv,
        (ufv: (UnitFcyVisit, InvUnit)) => Key(gkey = ufv._1.intend_ob_cv),
        valueJoiner2,
        Materialized.`with`(
          new GenericSerde[Key],
          new GenericSerde[(UnitFcyVisit, InvUnit, ArgoCarrierVisit)]
        )
      )
  }
}
```
Getting the following error during compilation:

```
scalac: Error while emitting Test$
Method too large: Test$.joinUfvWithIuwithAcv (Lorg/apache/kafka/streams/scala/kstream/KTable;Lorg/apache/kafka/streams/scala/kstream/KTable;)Lorg/apache/kafka/streams/scala/kstream/KTable;
```

```scala
import com.sksamuel.avro4s.kafka.GenericSerde
import com.sksamuel.avro4s.BinaryFormat

case class UnitTerminalVisit(
  unitFcyVisitIfo: UnitFcyVisit,
  invUnitInfo: InvUnit,
  intendObCvInfo: ArgoCarrierVisit = null,
  actualIbCvInfo: ArgoCarrierVisit = null,
  actualObCvInfo: ArgoCarrierVisit = null,
  lastPosBinInfo: SpatialBins = null
)

object Test {
  val h = new GenericSerde[UnitTerminalVisit](BinaryFormat)
}
```

Here is a simpler code snippet that also throws the following error:

```
scalac: Error while emitting Test$
Method too large: Test$.<init> ()V
```

I am using:

```
"com.sksamuel.avro4s" %% "avro4s-kafka" % "4.1.0"
scalaVersion := "2.13.10"
sbt.version=1.8.2
```

Here only UnitTerminalVisit is a nested case class; UnitFcyVisit, InvUnit, and ArgoCarrierVisit are plain case classes whose fields are primitive types or Options of primitive types.

Even when I tried giving explicit schemas for the intermediate case classes, I got the same error.

```scala
import com.sksamuel.avro4s.kafka.GenericSerde
import com.sksamuel.avro4s.{BinaryFormat, SchemaFor}

case class UnitTerminalVisit(
  unitFcyVisitIfo: UnitFcyVisit,
  invUnitInfo: InvUnit,
  intendObCvInfo: ArgoCarrierVisit = null,
  actualIbCvInfo: ArgoCarrierVisit = null,
  actualObCvInfo: ArgoCarrierVisit = null,
  lastPosBinInfo: SpatialBins = null
)

object Test {

  implicit val ufvSchemaF = SchemaFor[UnitFcyVisit]
  implicit val iuSchemaF = SchemaFor[InvUnit]
  implicit val acvSchemaF = SchemaFor[ArgoCarrierVisit]
  implicit val sbSchemaF = SchemaFor[SpatialBins]
  implicit val utvSchemaF = SchemaFor[UnitTerminalVisit]

  val h = new GenericSerde[UnitTerminalVisit](BinaryFormat)
}
```

Output:

```
scalac: Error while emitting Test$
Method too large: Test$.delayedEndpoint$Test$1 ()V
```

Same for me. Bump it.

@sabujkumarjena
Simply speaking, "method/class too large" means that a JVM size limit for a single scope has been reached: a method's bytecode is capped at 65535 bytes, and a class at 65535 fields/methods.

To overcome the issue you have two choices:

  1. Just take some of your implicit val definitions and put them into another object. This way you increase the number of objects, each of which has its own size limit;

Rewrite this:

```scala
object A {
  implicit val schemaFor: SchemaFor[A] = ???
  implicit val decoder: Decoder[A] = ???
  implicit val encoder: Encoder[A] = ???
}
```

into

```scala
object A {
  implicit val schemaFor: SchemaFor[A] = ???
}

object ACodecs {
  implicit val decoder: Decoder[A] = ???
  implicit val encoder: Encoder[A] = ???
}
```
  2. Define the same implicit instances for the child classes. This way all the functions for the child fields are put into the children's companion objects; the parent object will simply import and reuse them;

Rewrite this:

```scala
case class Child1(/* fields */)
case class Child2(/* fields */)
case class A(child1: Child1, child2: Child2)

object A {
  implicit val schemaFor: SchemaFor[A] = ???
  implicit val decoder: Decoder[A] = ???
  implicit val encoder: Encoder[A] = ???
}
```

into

```scala
case class Child1(/* fields */)
case class Child2(/* fields */)
case class A(child1: Child1, child2: Child2)

object A {
  implicit val schemaFor: SchemaFor[A] = ???
  implicit val decoder: Decoder[A] = ???
  implicit val encoder: Encoder[A] = ???
}

object Child1 {
  implicit val schemaFor: SchemaFor[Child1] = ???
  implicit val decoder: Decoder[Child1] = ???
  implicit val encoder: Encoder[Child1] = ???
}

object Child2 {
  implicit val schemaFor: SchemaFor[Child2] = ???
  implicit val decoder: Decoder[Child2] = ???
  implicit val encoder: Encoder[Child2] = ???
}
```

These approaches also work great together; you do not have to choose only one of them. Feel free to use both at the same time, as sketched below.
In our org we have huuuge models and that combination was the solution.
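For illustration, a minimal sketch of the combination (names are hypothetical; assuming avro4s 4.x automatic derivation, which picks up instances already in implicit scope):

```scala
import com.sksamuel.avro4s.{Decoder, Encoder, SchemaFor}

case class Child(a: Int, b: String)
case class Parent(c1: Child, c2: Child)

// Choice 2: the child gets its own derived instances in its companion.
// No type ascription (matching the snippets above), so the right-hand
// side triggers derivation rather than referring to the val itself.
object Child {
  implicit val schemaFor = SchemaFor[Child]
  implicit val decoder = Decoder[Child]
  implicit val encoder = Encoder[Child]
}

// Choice 1: the parent's instances live in a separate object, so no
// single class body accumulates all the generated methods.
object ParentInstances {
  implicit val schemaFor = SchemaFor[Parent]
  implicit val decoder = Decoder[Parent]
  implicit val encoder = Encoder[Parent]
}
```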

@vkorchik I tried out your suggestion and was able to fix the issue. It works, but the models have become much more of a hassle.

I wrapped my various case classes into separate objects.

```scala
object smbreak1 {
  case class Smbreak1(
    activateDeadLetterAfter: Option[ActivateDeadLetter],
    activateDeadLetterHeaderAfter: Option[ActivateDeadLetterHeader]
  )
}
```

Is there any better idea, or any newer implementation that I am not aware of?

cc: @sabujkumarjena

@anmol-hpe, what do you mean by "much more of a hassle"?

@vkorchik Consider that I have over 80 tables in my database 'X'. Each table consists of 30-40 fields on average.

From what I understood, I will have to wrap each table separately in a different object. Kindly correct me if my understanding is incorrect.

I tried this out with a smaller number of tables and it worked fine. Basically I am trying to create Avro schemas from such case classes.

Are there any constraints in this scenario which I am not aware of?

> From what I understood, I will have to wrap each table separately in a different object.

@anmol-hpe, it depends on what you mean by 'table'.

The idea is pretty simple: if you see this error (class/method too large), you have too many things defined in one "scope". By scope I mostly mean some "object" (but it could also be a class, method, etc.).

Take a look at my example above: I have not wrapped my case classes in any object. The only thing I have done is define their own decoder/encoder/schemaFor instances for both the parent and the child classes. So there should be no hassle at all, except that instead of one companion object with just the parent instances you will have a couple of objects.
How it works under the hood (afaik) is that the parent derives (read: "generates") instances for all the underlying types in its companion object, but reuses already defined instances from the underlying types' companion objects (or any other place where you have defined them and provided an import). This way, instead of defining thousands of methods for all fields of all types in one object, one splits the definitions across a couple of objects and reuses them as much as possible; see the sketch below.
The JVM limit is 65535 (bytes of bytecode per method, and fields/methods per class), so by defining those instances in several objects one decreases the chance of hitting the maximum size of a Java class.
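As a rough illustration of that reuse (hypothetical names, assuming avro4s 4.x):

```scala
import com.sksamuel.avro4s.{AvroSchema, SchemaFor}

case class Leaf(x: Int)
case class Branch(l: Leaf)

object Leaf {
  // Derived once here, in Leaf's companion (i.e. in implicit scope).
  implicit val schemaFor = SchemaFor[Leaf]
}

object Demo extends App {
  // Deriving the Branch schema picks up Leaf.schemaFor from the
  // companion instead of generating Leaf's schema code again here.
  println(AvroSchema[Branch])
}
```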

Analyze your types and find out the relations between the classes. For the big and frequently reused ones, define their own decoder/encoder/schemaFor.
If your types are chained as BigClass1 -> BigClass2 -> BigClass3 -> ... -> BigClassN and none of them has its own instances defined except BigClass1, then the instances for all the dependent classes will also be generated in the scope of the BigClass1 object, which drastically increases the amount of generated code. So define instances for the dependent classes as well, especially the big ones (this matters especially for Scala 3); a sketch follows below.
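A minimal sketch of that chained case (fields are hypothetical):

```scala
import com.sksamuel.avro4s.SchemaFor

case class BigClass3(f1: Int, f2: String)
case class BigClass2(f1: Long, next: BigClass3)
case class BigClass1(f1: String, next: BigClass2)

// Each link in the chain gets its own instance, so deriving
// SchemaFor[BigClass1] reuses these instead of generating the code
// for BigClass2 and BigClass3 inside a single enclosing scope.
object BigClass3 { implicit val schemaFor = SchemaFor[BigClass3] }
object BigClass2 { implicit val schemaFor = SchemaFor[BigClass2] }
object BigClass1 { implicit val schemaFor = SchemaFor[BigClass1] }
```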

@vkorchik

> The only thing I have done is define their own decoder/encoder/schemaFor instances for both the parent and the child classes.

Yes, I agree.

Okay, got it. Thanks, I will work in this direction.