When a BLE connection fails on Android, the application aborts at runtime due to a pending ClassNotFoundException on "io.github.gedgygedgy.rust.future.FutureException"
ulankoti opened this issue
Describe the bug
When a BLE connection fails on Android, the application aborts at runtime because of a pending class-not-found exception.
The exception is java.lang.ClassNotFoundException, and the class it cannot find is "io.github.gedgygedgy.rust.future.FutureException".
I wrote an Android application that loads a Rust native library and prints the services and characteristics of nearby BLE peripheral devices. In the Rust library, I used btleplug/examples/discover_adapters_peripherals.rs and call its function from a JNI method.
Expected behavior
When a BLE connection times out or is rejected, the error should be printed and the library should move on to preparing a connection request for the next peripheral in the scanned list. The Android application should not be stopped.
Actual behavior
Instead, the application aborts at runtime due to a pending ClassNotFoundException on "io.github.gedgygedgy.rust.future.FutureException". That class is part of a Java library that droidplug-debug.aar depends on, and droidplug-debug.aar is added as a build-time dependency of the Android application.
No issue is observed when the BLE connection succeeds.
Additional context
A logcat log and the APK are attached for reference.
A snippet of the crash signature is pasted below.
2023-06-11 10:27:46.853 21775-21798 BluetoothGatt com.example.btleplugex D onClientConnectionState() - status=133 clientIf=9 device=DA:7C:35:35:F2:DF
2023-06-11 10:27:46.857 21775-21807 btleplugex-jni com.example.btleplugex E btleplugex::discover_adapters_peripherals: Error connecting to peripheral, skipping: Other(JavaException)
2023-06-11 10:27:46.858 21775-21807 mple.btleplugex com.example.btleplugex A java_vm_ext.cc:594] JNI DETECTED ERROR IN APPLICATION: JNI GetMethodID called with pending exception java.lang.ClassNotFoundException: Didn't find class "io.github.gedgygedgy.rust.future.FutureException" on path: DexPathList[[directory "."],nativeLibraryDirectories=[/system/lib64, /system_ext/lib64, /system/lib64, /system_ext/lib64]]
java_vm_ext.cc:594] at java.lang.Class dalvik.system.BaseDexClassLoader.findClass(java.lang.String) (BaseDexClassLoader.java:259)
java_vm_ext.cc:594] at java.lang.Class java.lang.ClassLoader.loadClass(java.lang.String, boolean) (ClassLoader.java:379)
java_vm_ext.cc:594] at java.lang.Class java.lang.ClassLoader.loadClass(java.lang.String) (ClassLoader.java:312)
java_vm_ext.cc:594]
java_vm_ext.cc:594] in call to GetMethodID
2023-06-11 10:27:46.985 21775-21807 mple.btleplugex com.example.btleplugex A runtime.cc:682] Runtime aborting...
runtime.cc:682] Dumping all threads without mutator lock held
.....
Source code is hosted at https://github.com/ulankoti/btleplugex
Hi @qdot,
Could you help check the reported issue?
@ulankoti
As far as I can see in the log, your main problem is probably in Java:
java_vm_ext.cc:594] JNI DETECTED ERROR IN APPLICATION: JNI GetMethodID called with pending exception java.lang.ClassNotFoundException: Didn't find class "io.github.gedgygedgy.rust.future.FutureException"
I couldn't find that class inside the APK.
Hi @blandger
classes.dex inside the APK has the FutureException class. The ClassNotFoundException is seen only when the connection fails.
Attached a parsed APK image for your reference.
From my long Java experience, that looks like a JVM issue. Yes, it probably happens on an error during 'discover_adapters_peripherals'. As far as I can see it is a JVM class loader issue; that is the first thing that comes to mind. DexPathList[[directory "."],nativeLibraryDirectories=[/system/lib64, /system_ext/lib64, /system/lib64, /system_ext/lib64]] looks strange.
The rest of the log seems to be just thread dumps.
Hi @blandger
Those classes are not registered with the Java VM. Instead, jni-utils-rs has a classcache which substitutes the needed calls.
It looks like commenting out the portion below in jni-utils-rs resolved the crash.
Could you please help check why the caught and cleared exception needs to be re-thrown?
repo: https://github.com/deviceplug/jni-utils-rs.git
branch: master
jni-utils-rs$ git diff rust/exceptions.rs
diff --git a/rust/exceptions.rs b/rust/exceptions.rs
index 468365b..54aeae0 100644
--- a/rust/exceptions.rs
+++ b/rust/exceptions.rs
@@ -86,10 +86,10 @@ impl<'a: 'b, 'b, T> TryCatchResult<'a, 'b, T> {
let ex = env.exception_occurred()?;
let _auto_local = env.auto_local(ex.clone());
env.exception_clear()?;
- if env.is_instance_of(ex, class)? {
- return block(ex).map(|o| Some(o));
- }
- env.throw(ex)?;
+// if env.is_instance_of(ex, class)? {
+// return block(ex).map(|o| Some(o));
+// }
+// env.throw(ex)?;
}
Ok(None)
})()
It's quite hard for me to explain that unfamiliar Rust code, sorry.
I just opened a pull request that fixes this bug, but we still have some problems when the exception is thrown, because it's never cleared on the JNI env and we get another exception like this one: https://gist.github.com/trik/1f219645e6dee2440dd2f7417b817388
As a workaround I tried to clear it manually when the peripheral object is built from the env:
trik@732716a#diff-1689a2ca69ba3382818af564e18f1d645e271736ce3b365e8414a3d266723157R117
But it's just a dirty workaround, because if the exception is thrown and not cleared by the with_obj method and we call another method on the peripheral (e.g. stop_scan), we still get the same error.
I think that for some reason this part is never executed:
https://github.com/deviceplug/jni-utils-rs/blob/2a585382865e151c7aba0e0e8685fbb0d0fdff50/rust/exceptions.rs#L85
Debugging Rust on Android is a real pain; I will try to investigate further.
Hi @trik
I think your change will clear all exceptions, and the application will lose the ability to catch specific kinds of exceptions and take the respective actions.
I have resolved the problem for the moment, but stability issues are still observed with other exceptions from Java and the Android BT framework. After thorough validation, I will create a pull request for peer review.
It looks like the btleplug Android changes were validated for positive cases, but negative flows that raise exceptions make the application crash at runtime.
I quite agree with you. That was one of my first thoughts when looking at the Rust code that handles Java exceptions.
Hi @ulankoti,
I totally agree; that's why I said it was a dirty workaround. I just needed to run some tests on a real device for the other PR I opened for the descriptors.
The point is that the Java Future exception should be caught by jni-utils, which should clear the exception on the JNI env and rethrow the Rust exception, but for some reason this does not happen. The env clear happens on the JavaException match: https://github.com/deviceplug/jni-utils-rs/blob/2a585382865e151c7aba0e0e8685fbb0d0fdff50/rust/exceptions.rs#L82
So we are probably not getting there.
@ulankoti @blandger
I found the culprit:
btleplug/src/droidplug/peripheral.rs
Line 66 in b478997
Here the exception is rethrown on the JNI environment.
The solution should be to handle all the other exceptions that are children of the generic BluetoothException and return a custom error for each of them.
IMHO the exception should not be rethrown on the JNI environment, even when the generic JavaError is returned; otherwise any subsequent call on the same env will fail.
@trik
The original intention might have been that, after all the known FutureExceptions are handled at the Rust/JNI layers, any remaining or unknown exceptions are re-thrown to be handled at the Java layer.
Also, I observed a NoSuchElementException while testing BLE connections back to back.
jni_utils::task: JPollResult::get
jni::wrapper::jnienv: exception found, returning error
java_vm_ext.cc:579] JNI DETECTED ERROR IN APPLICATION: JNI GetMethodID called with pending exception java.util.NoSuchElementException:
java_vm_ext.cc:579] at java.lang.Object java.util.LinkedList.removeFirst() (LinkedList.java:270)
java_vm_ext.cc:579] at java.lang.Object java.util.LinkedList.remove() (LinkedList.java:685)
java_vm_ext.cc:579] at java.lang.Object io.github.gedgygedgy.rust.stream.QueueStream.lambda$pollNext$0$io-github-gedgygedgy-rust-stream-QueueStream() (QueueStream.java:31)
java_vm_ext.cc:579] at java.lang.Object io.github.gedgygedgy.rust.stream.QueueStream$$ExternalSyntheticLambda4.get() (D8$$SyntheticClass:-1)
java_vm_ext.cc:579]
java_vm_ext.cc:579] in call to GetMethodID
Check whether the list is empty before calling remove.
This one is thrown by jni-utils. Apparently the list is checked before the remove:
https://github.com/deviceplug/jni-utils-rs/blob/2a585382865e151c7aba0e0e8685fbb0d0fdff50/java/src/main/java/io/github/gedgygedgy/rust/stream/QueueStream.java#L30
Did you get this exception during the connection or during service discovery?
While iterating over the notifications after subscribing to a notify characteristic.
synchronized (this.lock) {
if (!this.result.isEmpty()) {
result = () -> () -> this.result.remove();
This looks like a lambda inside another lambda that removes the item. How should this be understood?
In add(), no synchronized block is used. Could that be the reason for such a crash?
Probably the item should be removed inside the synchronized block and then returned by the lambda.
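To make that suggestion concrete, here is a minimal Java sketch. It is not the jni-utils-rs code: SketchQueue, its fields, and the use of Supplier are hypothetical stand-ins for QueueStream and its poll types. It assumes the crash comes from the emptiness check running under the lock while remove() runs later, when the returned lambda is invoked, so a second poll (or an unsynchronized add) can interleave in between.

```java
import java.util.LinkedList;
import java.util.Queue;
import java.util.function.Supplier;

// Hypothetical stand-in for QueueStream; names are illustrative only.
final class SketchQueue<T> {
    private final Object lock = new Object();
    private final Queue<T> items = new LinkedList<>();

    // Producer side: mutate the queue only while holding the lock.
    void add(T item) {
        synchronized (lock) {
            items.add(item);
        }
    }

    // Consumer side: returns null when nothing is ready, otherwise a supplier
    // that hands back an item which was already removed under the lock.
    Supplier<T> pollNext() {
        synchronized (lock) {
            if (items.isEmpty()) {
                return null;
            }
            final T item = items.remove(); // eager removal, still inside the lock
            return () -> item;             // the lambda only returns the captured value
        }
    }
}
```

With this shape, two polls against a one-element queue cannot both reach remove(): the second poll sees an empty queue and returns null instead of throwing NoSuchElementException.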
Just a heads up on development background for this:
jni-utils-rs and most of the android core was handed to me by an anonymous contributor who I've not been able to contact in about 2 years now. While they work, and I kind of understand what's going on in them, very little if any of it was developed by the lead devs here. So it's a bit difficult to reason on why things are the way they are in order to reply to these bugs.
That said, I'm definitely open to taking patches.
@qdot I'll run some tests and open a PR on jni-utils-rs if I can find a solution.
Found another crash signature.
I got this when the peripheral device was powered down while subscribing to a notify characteristic.
2023-06-19 23:39:02.928 25045-25706 btleplugex-jni com.example.btleplugex I btleplugex::discover_adapters_peripherals: Subscribing to characteristic Characteristic { uuid: 00002a37-0000-1000-8000-00805f9b34fb, service_uuid: 0000180d-0000-1000-8000-00805f9b34fb, properties: NOTIFY }
2023-06-19 23:39:07.930 25045-25706 BluetoothGatt com.example.btleplugex D setCharacteristicNotification() - uuid: 00002a37-0000-1000-8000-00805f9b34fb enable: true
2023-06-19 23:39:07.940 25045-25706 System.out com.example.btleplugex I Uma->setCommandCallback() callback: com.nonpolynomial.btleplug.android.impl.Peripheral$6@96b95c1
2023-06-19 23:39:09.839 25045-25062 System.out com.example.btleplugex I Uma->wakeWithThrowable(), result: java.lang.RuntimeException: Unable to write descriptor
2023-06-19 23:39:09.841 25045-25065 BluetoothGatt com.example.btleplugex D onClientConnectionState() - status=0 clientIf=15 device=FB:3C:36:23:FA:AE
2023-06-19 23:39:09.843 25045-25706 btleplugex-jni com.example.btleplugex D jni::wrapper::jnienv: exception found, returning error
2023-06-19 23:39:09.843 25045-25706 btleplugex-jni com.example.btleplugex D jni_utils::exceptions: Uma-> TryCatchResult catch
2023-06-19 23:39:09.843 25045-25706 btleplugex-jni com.example.btleplugex D jni_utils::exceptions: Uma-> TryCatchResult step 4, received Ok(Err(Error::JavaException)) type
2023-06-19 23:39:09.843 25045-25706 btleplugex-jni com.example.btleplugex D btleplug::droidplug::peripheral: Uma->unknown exception, re-throwing
2023-06-19 23:39:09.844 25045-25706 btleplugex-jni com.example.btleplugex E btleplugex: discover() returned error: Other(JavaException)
--------- beginning of crash
2023-06-19 23:39:09.844 25045-25706 btleplugex-jni com.example.btleplugex D btleplugex: exiting discover thread: ThreadId(30)
2023-06-19 23:39:09.846 25045-25706 AndroidRuntime com.example.btleplugex E FATAL EXCEPTION: Thread-22
Process: com.example.btleplugex, PID: 25045
io.github.gedgygedgy.rust.future.FutureException: java.lang.RuntimeException: Unable to write descriptor
at io.github.gedgygedgy.rust.future.SimpleFuture.lambda$wakeWithThrowable$1(SimpleFuture.java:76)
at io.github.gedgygedgy.rust.future.SimpleFuture$$ExternalSyntheticLambda1.get(Unknown Source:2)
Caused by: java.lang.RuntimeException: Unable to write descriptor
at com.nonpolynomial.btleplug.android.impl.Peripheral$6.lambda$onDescriptorWrite$0$com-nonpolynomial-btleplug-android-impl-Peripheral$6(Peripheral.java:246)
at com.nonpolynomial.btleplug.android.impl.Peripheral$6$$ExternalSyntheticLambda0.run(Unknown Source:10)
at com.nonpolynomial.btleplug.android.impl.Peripheral.asyncWithFuture(Peripheral.java:324)
at com.nonpolynomial.btleplug.android.impl.Peripheral.access$700(Peripheral.java:24)
at com.nonpolynomial.btleplug.android.impl.Peripheral$6.onDescriptorWrite(Peripheral.java:244)
at com.nonpolynomial.btleplug.android.impl.Peripheral$Callback.onDescriptorWrite(Peripheral.java:408)
at android.bluetooth.BluetoothGatt$1$10.run(BluetoothGatt.java:636)
at android.bluetooth.BluetoothGatt.runOrQueueCallback(BluetoothGatt.java:864)
at android.bluetooth.BluetoothGatt.-$$Nest$mrunOrQueueCallback(Unknown Source:0)
at android.bluetooth.BluetoothGatt$1.onDescriptorWrite(BluetoothGatt.java:631)
at android.bluetooth.IBluetoothGattCallback$Stub.onTransact(IBluetoothGattCallback.java:234)
at android.os.Binder.execTransactInternal(Binder.java:1285)
at android.os.Binder.execTransact(Binder.java:1244)
Another crash occurred in the BluetoothGatt client, but the application remained running. However, Bluetooth scanning and further operations stopped working after this crash.
2023-06-19 11:58:19.416 30688-30708 BluetoothGatt com.example.btleplugex D onClientConnectionState() - status=8 clientIf=12 device=FB:3C:36:23:FA:AE
2023-06-19 11:58:19.448 30688-30708 BluetoothGatt com.example.btleplugex W Unhandled exception in callback
com.nonpolynomial.btleplug.android.impl.UnexpectedCallbackException
at com.nonpolynomial.btleplug.android.impl.Peripheral$CommandCallback.onConnectionStateChange(Peripheral.java:402)
at com.nonpolynomial.btleplug.android.impl.Peripheral$Callback.onConnectionStateChange(Peripheral.java:333)
at android.bluetooth.BluetoothGatt$1$4.run(BluetoothGatt.java:272)
at android.bluetooth.BluetoothGatt.runOrQueueCallback(BluetoothGatt.java:780)
at android.bluetooth.BluetoothGatt.access$200(BluetoothGatt.java:41)
at android.bluetooth.BluetoothGatt$1.onClientConnectionState(BluetoothGatt.java:267)
at android.bluetooth.IBluetoothGattCallback$Stub.onTransact(IBluetoothGattCallback.java:192)
at android.os.Binder.execTransactInternal(Binder.java:1021)
at android.os.Binder.execTransact(Binder.java:994)
This could be due to the GATT client not being closed after the disconnect, and the possibility of hitting https://github.com/deviceplug/btleplug/blob/0.10.5/src/droidplug/java/src/main/java/com/nonpolynomial/btleplug/android/impl/Peripheral.java#L65 without a command callback being set.
After back-to-back connect/disconnect cycles, run sequentially in the same thread or in different threads, the connect/disconnect sequence broke after 23 iterations on a Pixel 4a. I resolved this by calling close() in the disconnected callback.
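A rough Java sketch of the close-on-disconnect idea follows. It is only an illustration under stated assumptions, not the btleplug change itself: GattHandle and DisconnectCloser are hypothetical stand-ins so the example compiles without the Android SDK. On a device the handle would be an android.bluetooth.BluetoothGatt and the method would be the BluetoothGattCallback.onConnectionStateChange override.

```java
// Hypothetical stand-in for the single BluetoothGatt method used here.
interface GattHandle {
    void close();
}

// Hypothetical helper; on Android this logic would live in the GATT callback.
final class DisconnectCloser {
    // Same value as android.bluetooth.BluetoothProfile.STATE_DISCONNECTED.
    static final int STATE_DISCONNECTED = 0;

    private GattHandle gatt;

    DisconnectCloser(GattHandle gatt) {
        this.gatt = gatt;
    }

    // Mirrors the (status, newState) part of onConnectionStateChange.
    synchronized void onConnectionStateChange(int status, int newState) {
        if (newState == STATE_DISCONNECTED && gatt != null) {
            gatt.close(); // release the GATT client so later connects start clean
            gatt = null;  // drop the reference so close() runs at most once
        }
    }
}
```

Clearing the reference after close() keeps a repeated disconnect callback from touching a handle that has already been released.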
@qdot @blandger
Added a synchronized block around adding an item to QueueStream.
Please review the PR in the jni-utils-rs repo:
deviceplug/jni-utils-rs#3
@ulankoti it does make sense; it's more or less the same solution I was working on. I will test and get back with some comments.
In the meantime, I have also been working on a refactoring of the whole droidplug module. Relying on an external library that is no longer maintained may not be a good thing. I'm trying to use j4rs, and the first results look promising. It's less tricky to integrate with Flutter or similar (I'm using Tauri for my tests), since there's no need to set the Java thread context class loader when spawning tokio threads. It also has a simpler API, supports Rust-to-Java async calls using CompletableFutures with significant performance benefits compared to the polling strategy, and, last but not least, has an active community.
Oh that's fantastic to hear! Definitely excited to see how this turns out. :D
@qdot if you have the time, this is the branch I'm currently working on: https://github.com/trik/btleplug/tree/j4rs
src/droidplug contains the current code, which I'm keeping just for reference;
src/droidplugnext contains the new implementation.
So far, the only feature available is the scan (start/stop with basic properties discovery, no characteristics).
You can build the Android support library with cargo build --features android-support-library,
which generates both the debug and release AARs in the manifest directory.
On the Java side, we just need to add these dependencies to the Gradle script:
implementation("io.github.astonbitecode:j4rs:0.16.1")
implementation(files("PATH_TO_SUPPORT_AAR"))
Inside the android section we need to add this:
packaging {
resources.excludes.add("META-INF/INDEX.LIST")
}
because the j4rs library and its transitive dependencies are badly packaged and contain multiple meta files.
On the Rust side, we only need to use the droidplug_init macro exported by the new module, which generates the JNI_OnLoad function with all the necessary j4rs/btleplug initialization.
Ok, so we've got competing PR's happening here, between #315 and #318. I brought in #315 to our dev branch having not realized there was another on top of that. It looks like #315 was a small subset of the fixes in #318, so I've got two options here: have #318 rebased by @ulankoti to deal with #315 already being in dev, or back out #315 and just bring in #318. Since everyone involved is on this thread and I am mostly playing dumb repo manager right now, I'll let y'all hash this out and tell me what to do.
Mine was just a partial fix; @ulankoti went further and fixed other problems. I had no time to test his fixes, but I reviewed the code and it looks good.
Gonna try to kick out a new version in the next few days with this in it.
btleplug v0.11.0 and jni-utils 0.1.1 are now live. I ran quick tests on windows and android (though didn't try anything involving the new descriptor stuff, this was just smoke testing what I currently use things for), everything seems happy.