py4j / py4j

Py4J enables Python programs to dynamically access arbitrary Java objects

Home Page: https://www.py4j.org

How to call a constructor with parameters via reflection

a49a opened this issue

I have tried three ways to invoke a constructor of a Java class through reflection, and all of them failed.

First attempt, in Python:

class_name = 'org.apache.flink.connector.pulsar.sink.writer.router.KeyHashTopicRouter'
context_classloader = gateway.jvm.Thread.currentThread().getContextClassLoader()
clazz = context_classloader.loadClass(class_name)
clazz.getConstructors()[0].newInstance()

This throws Py4JException: Method newInstance([]) does not exist

Second attempt:

JSinkConfiguration = get_gateway().jvm.org.apache.flink.connector.pulsar.sink.config.SinkConfiguration
j_sink_conf = JSinkConfiguration()
clazz.getConstructors()[1].newInstance(j_sink_conf)

This throws Py4JException: Method newInstance([class org.apache.flink.connector.pulsar.sink.config.SinkConfiguration]) does not exist

Third attempt:

def load_java_class(class_name):
    gateway = get_gateway()
    context_classloader = gateway.jvm.Thread.currentThread().getContextClassLoader()
    return context_classloader.loadClass(class_name)

j_topic_router = load_java_class(class_name).getConstructor(
    load_java_class('org.apache.flink.connector.pulsar.sink.config.SinkConfiguration')
).newInstance(sink_configuration._j_sink_configuration)

This throws Py4JException: Method getConstructor([class java.lang.Class]) does not exist

The Java class:

public class KeyHashTopicRouter<IN> implements TopicRouter<IN> {
    private static final long serialVersionUID = 2475614648095079804L;

    private final MessageKeyHash messageKeyHash;

    public KeyHashTopicRouter() {
        this.messageKeyHash = null;
    }

    public KeyHashTopicRouter(SinkConfiguration sinkConfiguration) {
        this.messageKeyHash = sinkConfiguration.getMessageKeyHash();
    }
}

Calling newInstance directly on the class works, but I want to pass parameters to the constructor.

clazz.newInstance()

I think the problem is that the Constructor.newInstance and Class.getConstructor Java methods use variable arguments / varargs, but Py4J doesn't have a convenient syntax for calling them. This issue was raised previously in #252, an old PR which proposed to add support for this.

Workaround:

The underlying JVM type signature of the vararg method expects a Java array containing the varargs, so you can construct a Java array and pass it to that method.

Here's an example of this:

import py4j

# Replace with however you normally obtain the JVM / gateway in your application:
jvm = sc._jvm
gateway = sc._gateway

# Code below is effectively equivalent to calling `new java.lang.StringBuffer("myString")` via Java reflection:

context_classloader = jvm.Thread.currentThread().getContextClassLoader()
sb_class = context_classloader.loadClass("java.lang.StringBuffer")
str_class = context_classloader.loadClass("java.lang.String")

get_constructor_args = gateway.new_array(jvm.java.lang.Class, 1)
get_constructor_args[0] = str_class

constructor = sb_class.getConstructor(get_constructor_args)

new_instance_args = gateway.new_array(jvm.java.lang.Object, 1)
new_instance_args[0] = "myString"

obj = constructor.newInstance(new_instance_args)
assert obj.getClass().getName() == 'java.lang.StringBuffer'
assert obj.toString() == 'myString'

The example could be simplified by using a helper function for converting Python lists to Java arrays, such as the toJArray helper used in PySpark internals.
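
A minimal sketch of such a helper, together with a wrapper that uses it for both reflection calls. The names to_java_array and new_instance are hypothetical and not part of Py4J; gateway is assumed to be an already connected JavaGateway, and the only Py4J features used are gateway.new_array and item assignment on the resulting array, as in the example above.

```python
def to_java_array(gateway, jtype, items):
    """Copy a Python sequence into a new Java array with element type jtype.

    Hypothetical helper, not part of Py4J.
    """
    jarr = gateway.new_array(jtype, len(items))
    for i, item in enumerate(items):
        jarr[i] = item
    return jarr


def new_instance(gateway, clazz, param_classes, args):
    """Reflectively call the public constructor of clazz that takes param_classes.

    Hypothetical wrapper: both varargs parameters (Class<?>... and Object...)
    are passed as explicit Java arrays, which is what the JVM signatures expect.
    """
    jvm = gateway.jvm
    constructor = clazz.getConstructor(
        to_java_array(gateway, jvm.java.lang.Class, param_classes)
    )
    return constructor.newInstance(
        to_java_array(gateway, jvm.java.lang.Object, args)
    )
```

With the names from the question, the call would presumably look like new_instance(gateway, clazz, [load_java_class('org.apache.flink.connector.pulsar.sink.config.SinkConfiguration')], [j_sink_conf]). For the no-argument constructor, passing two empty lists should produce zero-length arrays, although that case has not been verified against the Flink class here.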

Thanks a lot, it works.