JuulLabs / kable

Kotlin Asynchronous Bluetooth Low-Energy

Home Page: https://juullabs.github.io/kable

Crash logs from Android on cancellation

rocketraman opened this issue

I am getting crash logs from Android due to cancellation.

This is the stack I see:

 va.e0
	at com.juul.kable.BluetoothDeviceAndroidPeripheral$read$$inlined$execute$1.invokeSuspend(Connection.kt:21)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:8)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:107)
	at android.os.Handler.handleCallback(Handler.java:883)
	at android.os.Handler.dispatchMessage(Handler.java:100)
	at android.os.Looper.loop(Looper.java:214)
	at android.os.HandlerThread.run(HandlerThread.java:67)
	Suppressed: kotlinx.coroutines.m0: [v1{Cancelling}@5439012, Dispatchers.Main]

The va.e0 appears to be obfuscated?
Would it be possible to get that part de-obfuscated?

Based on the stacktrace, I assume this is occurring on cancellation of a read operation?

Would it be possible to provide the snippet of calling code (or a small reproducer)?

What version of Kable is this on?

The va.e0 appears to be obfuscated?

Retrace doesn't appear to deobfuscate it. Not sure why.

Based on the stacktrace, I assume this is occurring on cancellation of a read operation?

It seems that way, yes.

Would it be possible to provide the snippet of calling code (or a small reproducer)?

      val observation = p.observe(smaWeightMeasurementCharacteristicNotify)
      observation
        .onEach { bytes ->
          logger.debug { "Peripheral $nameOrIdentifier read  sma: data:\n${HexDumpUtils.dump(bytes, 0, bytes.size, true)}" }
        }
        .collect { bytes ->
          if (bytes.isNotEmpty()) {
            receivedDataChannel.trySend(bytes).also {
              it.onClosed { logger.trace { "=> read: channel closed: ${it?.message}" } }
              it.onFailure { logger.trace { "=> read: channel send failure: ${it?.message}" } }
              it.onSuccess { logger.trace { "=> read: channel send success" } }
            }
          }
        }

I'm going to add a catch operator in there to see if cancellation exceptions are being propagated up.

What version of Kable is this on?

Version 0.24.0.

I'm not sure if the stacktrace is misleading, but the first line of it points to Connection.kt:21:

public class OutOfOrderGattCallbackException internal constructor(
    message: String,
) : IllegalStateException(message)

I normally wouldn't expect this exception to be thrown on cancellation:

// `lock` should always enforce a 1:1 matching of request to response, but if an Android `BluetoothGattCallback`
// method gets called out of order then we'll cast to the wrong response type.
response as? T
    ?: throw OutOfOrderGattCallbackException(
        "Unexpected response type ${response.javaClass.simpleName} received",
    )

Instead, I might expect it to be thrown on a subsequent I/O operation after a previous I/O cancellation (if there is a bug in Kable's handling of the cancellation).
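
To illustrate the scenario I have in mind, here is a minimal, self-contained sketch. It is not Kable's actual implementation: the response types, the exception class, the queue standing in for the GATT callback channel, and the function names are all hypothetical. It only shows how a callback left over from a cancelled read could be picked up by the next operation and fail the safe cast:

```kotlin
// Hypothetical stand-ins for GATT callback responses (not Kable's real classes).
sealed interface Response
data class OnCharacteristicRead(val value: List<Int>) : Response
data class OnCharacteristicWrite(val status: Int) : Response

// Hypothetical exception mirroring the role of OutOfOrderGattCallbackException.
class OutOfOrderException(message: String) : IllegalStateException(message)

// Takes the next queued response and safe-casts it to the type the caller expects.
// The queue is an assumed stand-in for however callbacks are actually delivered.
inline fun <reified T : Response> ArrayDeque<Response>.nextResponse(): T {
    val response = removeFirst()
    return response as? T
        ?: throw OutOfOrderException(
            "Unexpected response type ${response.javaClass.simpleName} received",
        )
}

// Simulates: a read is cancelled before its callback arrives, then a write is issued.
// Returns the exception message if the write picks up the stale read callback, else null.
fun staleReadThenWrite(): String? {
    val callbacks = ArrayDeque<Response>()
    callbacks.addLast(OnCharacteristicRead(listOf(0x01))) // late callback for the cancelled read
    callbacks.addLast(OnCharacteristicWrite(status = 0))  // callback for the new write
    return runCatching { callbacks.nextResponse<OnCharacteristicWrite>() }
        .exceptionOrNull()
        ?.message
}

fun main() {
    // The write consumes the stale read response first, so the cast fails.
    println(staleReadThenWrite())
}
```

In this sketch the exception surfaces on the write, not on the cancelled read, which is why I'd expect it on a later I/O operation and not at the point of cancellation.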

Are there specific steps you're doing to reproduce it (or, more specifically, how is the cancellation occurring)? Is there surrounding code around the code snippet you shared that performs the cancellation?

Are there specific steps you're doing to reproduce it (or, more specifically, how is the cancellation occurring)? Is there surrounding code around the code snippet you shared that performs the cancellation?

I'm actually not sure how to reproduce it. It's an error I see periodically pop up in Bugfender, which we use to track errors on production devices.

I don't do any cancellation of the read/write scopes myself. I do cancellation of the peripheral scanning scope in certain cases, and I call the peripheral.disconnect() method if requested by the user.

Sorry, from #544 (comment) it isn't clear to me which cancellation is causing the crash.

Are you thinking that a parent coroutine was cancelled and is propagating the cancellation to the mentioned call site?

If possible, it would be great to get more info, as I'm not sure how to debug this further, unfortunately.

Well, I'm not sure either. I think I have some wonky code related to peripheral state tracking and connection which may be the cause of the issue. I'll try to refactor a bit to see if the problem goes away, or if the problem continues to happen, I can get a clearer idea of where the cancellation is coming from.

There is a large internal refactor going into 0.28.0. I don't think it will resolve this issue, but I'm curious whether the stacktrace changes as a result. If it does, it might provide additional hints as to what is going wrong here.

@rocketraman when you have a chance, can you test 0.28.0-rc and let me know if the stacktrace has changed for this issue?

I actually haven't seen this happen in a while, so I suspect the fix of the wonky code I mentioned before resolved it. I'll go ahead and close. Thanks!

Great to hear. Thanks for following up. 👍