gpu / JOCL

Java bindings for OpenCL

Home Page:http://www.jocl.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Sizeof broken after android-related merges

blueberry opened this issue · comments

I have just built and installed JOCL on Arch Linux and discovered regressions with the newest android-related changes in Sizeof, when used in linux.

JOCL builds and tests fine, but when I tried it with a well testet ClojureCL, the built failed, with a messages related to the newly introduced Sizeof native method:

java.lang.UnsatisfiedLinkError: org.jocl.Sizeof.computePointerSizeNative()I

This happens with the latest commit 18d9a04

I did a checkout of last week's commit 862f2ab: JOCL gets built, AND the existing projects work.

Thanks for this hint. This was a bit subtle: It only happened when the Sizeof class was used before the CL class was loaded, because only the CL class loaded the native library.

This should now be fixed, and covered with a new (fairly trivial...) test case. For me, the test threw the same error before, and now it passes, so hopefully, it resolved the issue for you as well.

Unfortunately, after cleaning everything and rebuilding, the error is still there (yes, I deleted JOCL and JOCL.build and did everything from the scratch):

Caused by: java.lang.UnsatisfiedLinkError: org.jocl.Sizeof.computePointerSizeNative()I
at org.jocl.Sizeof.computePointerSizeNative(Native Method)
at org.jocl.Sizeof.computePointerSize(Sizeof.java:362)
at org.jocl.Sizeof.(Sizeof.java:299)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at clojure.lang.RT.classForName(RT.java:2168)

To clarify: your new test in JOCL passes, but the tests in my library that uses JOCL still have the Sizeof load problem.

OK, that's odd - and it's distressing that I can't reproduce this right now. Is there any test case that might expose the same behavior (on windows?).

I wonder whether this is related to some classloader magic in Clojure, but this would not explain why loading the library works for the main CL class, but not for the Sizeof class.

In fact, without the possibility to analyze this further, I don't see what I could to right now, except for undoing the change. If until tomorrow evening nobody (also @phlip9, referring to the discussion in #4 (comment) ) has another idea, I'd rather go back to use the sun.arch.data.model approach (without the native method), so that it at least works reliably on desktop systems.

Thanks for looking at this. Unfortenately, I can't provide a java test case, since I am using it from Clojure library that has many methods, so I can't identify why Sizeof fails.

What I discovered is not much: CL calls work, Sizeof importing (interactive, from repl) works, but if i try to use Sizeof directly, even AFTER CL has been loaded and called, I get that exception.

@gpu I actually just ran into this issue while testing on an Android device. Without calling something in CL first, the native library doesn't load and Sizeof.calculatePointerSizeNative() fails with an UnsatisfiedLinkError. While I haven't tested the most recent commit, I don't immediately see why it's still failing. If anything, just calling CL.loadNativeLibrary() should force the class loader to load CL and run its static block. Strange...

It doesn't help (at least in my case) but maybe a solution would be to move the calculatePointerSizeNative to the CL class? May be the problem is that loadNativeLibrary is called twice?

It seems to work with the latest changes... I added a few print statements to see the control flow:

I/System.out(30313): Begin Sizeof static block
I/System.out(30313): Begin CL static block
I/System.out(30313): Begin CL.loadNativeLibrary()
I/System.out(30313): nativeLibraryLoaded = false
I/System.out(30313): End CL.loadNativeLibrary()
I/System.out(30313): End CL static block
I/System.out(30313): Begin CL.loadNativeLibrary()
I/System.out(30313): nativeLibraryLoaded = true
I/System.out(30313): End CL.loadNativeLibrary()
I/System.out(30313): End Sizeof static block
I/System.out(30313): Sizeof.computePointerSize()
I/System.out(30313): Sizeof.POINTER = 4

As I suspected, calling CL.loadNativeLibrary() from Sizeof triggers the class loader to run CL's static block first, then finishes executing Sizeof's static block. The code still works, however.

This is also running through Scala on Android, which may or may not affect the result.

If we move Sizeof.calculatePointerSizeNative() I expect this issue to resolve, as we will then isolate loading the native library to a single class.

I tried various changes with how CL and Sizeof are called, as well as tried changing CL and Sizeof classes themselves and the problem is still there. It seems to me, as gpu suggested, that the problem is in exotic class loading.

@gpu @phlip9 This works for me:

private static int computePointerSize()
        {
            if (LibUtils.calculateOS() == LibUtils.OSType.ANDROID) {
                return computePointerSizeNative();
            } else {
                String bits = System.getProperty("sun.arch.data.model");
                if (bits.equals("32"))
                {
                    return 4;
                }
                else if (bits.equals("64"))
                {
                    return 8;
                }
                else
                {
                    System.err.println(
                        "Unknown value for sun.arch.data.model - assuming 32 bits");
                    return 4;
                }
            }

    }

Yes, I also considered making the native method a package-private method in the CL class. But I usually try to avoid tackling a problem when I do not know what causes the problem.

On the one hand, it seemed clear (in hindsight) : When using Sizeof without having used CL, then the native library was not loaded. That definitely was a problem, and I could reproduce this, and fix it with the commit linked above. And, as Dragan said, this "works", in the sense that the unit test passes.

But it's entirely unclear to me why it does not work in the Clojure setup. Introducing some of the System.out logs, as Philip suggested, in the static initializer blocks (of both classes) and in the CL.loadNativeLibrary method might bring some insights. The stack trace until now shows

at org.jocl.Sizeof.(Sizeof.java:299)
at java.lang.Class.forName0(Native Method)

which means that during class loading, it seems to do the initialization of the POINTER size by calling the native method, and this fails. But according to http://docs.oracle.com/javase/specs/jls/se8/html/jls-12.html#jls-12.4.2 , I think that the static initializers should have run before this initialization takes place.

So @phlip9 if the solution proposed by Dragan works on Android as well, then this may be a feasible workaround, and I'd add it later today to at least have a working state for both worlds.

But I'd really like to know what went wrong there under the hood. @blueberry It would be great if you could add some basic System.outs, as described above, just to get a hint about whether or not (and when) the static initializer is running.

I've already tried that, but did not find any hint:

Compiling uncomplicate.clojurecl.info
======= loadNativeLibrary ENTER
======= loadNativeLibrary !nativeLibraryLoaded ENTER
======= loadNativeLibrary !nativeLibraryLoaded LEAVE
======= loadNativeLibrary LEAVE
======= loadNativeLibrary ENTER
======= loadNativeLibrary LEAVE
======= computerPointerSize ENTER
java.lang.UnsatisfiedLinkError: org.jocl.Sizeof.computePointerSizeNative()I, compiling:(uncomplicate/clojurecl/info.clj:296:3)
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.jocl.Sizeof.computePointerSizeNative()I, compiling:(uncomplicate/clojurecl/info.clj:296:3)

OK, this trace looks like it indeed does load the library (without errors). I could only guess now that this may be related to the use of different ClassLoaders in Clojure, as each class loader has its own set of native libraries ( https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html#library_version ) and thus, the library might in fact be loaded, but from a different loader than the one that loads the Sizeof class. (Still, this doesn't explain why it works for the CL class alone...). Further debugging could involve starting the JVM with -XX:+TraceClassLoading and hope for further insights, but ... it's difficult.

I have added the solution that Dragan suggested, and which hopefully works in both setups.

However, I'm still feeling uncomfortable about this, and will definitely leave this issue open until it is clear what's going on there....

Thank you for the quick response. Please, when you are ready to release a RC to the maven central, ping me to do the linux 64-bit build (and also 32 bit if it could be done on the same system from the cmake you supply).

Thanks. As there seems to be no response from the original reporter: Did you encounter any issues during linking, related to the fPIC flag that was mentioned in #3 ?

No, JOCL works well with a fairly non-trivial Clojure code.