SDK 5.1.0's new CTWorkManager breaks custom WorkerFactory implementations and can't rollback
henryglendening-hh opened this issue · comments
Describe the bug
Our app uses a custom WorkerFactory for creating Workers, and this custom WorkerFactory is assigned to the WorkManager. I believe the new code added in SDK version 5.1.0 breaks this (specifically line 37 of CTWorkerManager.kt), as our custom WorkerFactory does not know how to handle the com.clevertap.android.sdk.pushnotification.work.CTFlushPushImpressionsWork class and fails to create the corresponding worker, resulting in a crash.
Since this is a breaking change, it would be helpful for the SDK version number to reflect that and include notes indicating that the implementing app should use a DelegatingWorkerFactory to mitigate any friction with CleverTap's use of the WorkManager.
Additionally, due to the crashes that our application has experienced in production relating to com.clevertap.android.sdk.pushnotification.work.CTFlushPushImpressionsWork, we tried to roll back our CleverTap SDK version to 5.0.0 (where neither this class nor the CTWorkerManager.kt class existed yet). Surprisingly, the app continues to crash with the same exception, where it is unable to handle com.clevertap.android.sdk.pushnotification.work.CTFlushPushImpressionsWork.
Do you know why this crash would continue to happen for users who are running SDK version 5.0.0? Is it possible that the com.clevertap.android.sdk.pushnotification.work.CTFlushPushImpressionsWork worker instruction is being dispatched dynamically/remotely, affecting users who were previously running 5.1.0 but are now on 5.0.0?
To Reproduce
Steps to reproduce the behavior:
- Set up a custom WorkerFactory for use with the WorkerManager in an app
- Use CleverTap SDK 5.1.0
- ...
- Experience crash from the custom WorkerFactory not knowing how to handle com.clevertap.android.sdk.pushnotification.work.CTFlushPushImpressionsWork Worker creation
- Downgrade CleverTap SDK to 5.0.0
- ...
- Continue seeing the same crash occur, despite com.clevertap.android.sdk.pushnotification.work.CTFlushPushImpressionsWork being absent from the app binary
Expected behavior
When the SDK version is rolled back from 5.1.0 to 5.0.0, the WorkManager should no longer receive requests to create a Worker for com.clevertap.android.sdk.pushnotification.work.CTFlushPushImpressionsWork.
Environment (please complete the following information):
- Device: any
- OS: Android 10, 11, 12, 13
- CleverTap SDK Version 5.1.0 -> 5.0.0
- Android Studio Giraffe | 2022.3.1
@henryglendening-hh What is the crash? Can you attach the full stack trace?
@piyush-kukadiya The crash is due to a custom exception thrown from our custom WorkerFactory implementation. Our WorkerFactory's override of createWorker(appContext: Context, workerClassName: String, workerParameters: WorkerParameters) throws an exception because it unexpectedly receives com.clevertap.android.sdk.pushnotification.work.CTFlushPushImpressionsWork as the value for the workerClassName parameter.
Here's the stacktrace:
Fatal Exception: java.lang.IllegalArgumentException: unknown worker class name: com.clevertap.android.sdk.pushnotification.work.CTFlushPushImpressionsWork
    at com.hayhouse.hayhouseaudio.workers.factory.MainWorkerFactory.createWorker(MainWorkerFactory.kt:24)
    at androidx.work.WorkerFactory.createWorkerWithDefaultFallback(WorkerFactory.java:83)
    at androidx.work.impl.WorkerWrapper.runWorker(WorkerWrapper.java:243)
    at androidx.work.impl.WorkerWrapper.run(WorkerWrapper.java:145)
    at androidx.work.impl.utils.SerialExecutorImpl$Task.run(SerialExecutorImpl.java:96)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
    at java.lang.Thread.run(Thread.java:923)
@henryglendening-hh Thanks for pointing this out. We will add this to our changelog.
> it would be helpful for the SDK version number to reflect that and include notes indicating that the implementing app should use a DelegatingWorkerFactory to mitigate any friction with CleverTap's use of the WorkManager.
I would like to know why, instead of handling CTFlushPushImpressionsWork in your custom factory, you decided to roll back the CleverTap SDK version to 5.0.0.
@piyush-kukadiya We downgraded the SDK in order to lessen the impact of the crash on our users while we developed & validated a patch. Why would it be a problem to downgrade an SDK?
@henryglendening-hh While I am not entirely certain, it appears that the WorkManager library saves scheduled workers in its own database. This database includes the name of the worker, which must be provided to the WorkerFactory to execute the task. I assume that a CleverTap worker was scheduled and recorded in the database while the application was running CleverTap SDK version 5.1.0. After the SDK was downgraded, the entry in the database persisted even though the SDK version had changed, ultimately leading to a crash.
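If the persisted-work explanation above holds, leftover CTFlushPushImpressionsWork entries could in principle be inspected and cancelled from the downgraded build. The following is only a minimal sketch under two assumptions that should be verified against your WorkManager version: that WorkManager tags every request with its worker class name by default, and that the SDK did not change that tag. The helper name cancelStaleCleverTapWork is hypothetical.

```kotlin
import android.content.Context
import android.util.Log
import androidx.work.WorkManager

// Fully qualified worker name taken from the crash report in this thread.
private const val CT_FLUSH_WORKER =
    "com.clevertap.android.sdk.pushnotification.work.CTFlushPushImpressionsWork"

// Hypothetical helper. Call it from a background thread, because get() blocks.
// Assumption: the persisted request carries its worker class name as a tag.
fun cancelStaleCleverTapWork(context: Context) {
    val workManager = WorkManager.getInstance(context)

    // Log whatever WorkManager still has persisted under that tag.
    val infos = workManager.getWorkInfosByTag(CT_FLUSH_WORKER).get()
    infos.forEach { Log.d("WorkCleanup", "id=${it.id} state=${it.state}") }

    // Cancel it so the factory is no longer asked to create the missing class.
    workManager.cancelAllWorkByTag(CT_FLUSH_WORKER)
}
```

This would not help if WorkManager starts the persisted work before the cancellation runs, so it is a mitigation to experiment with rather than a guaranteed fix.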
@piyush-kukadiya Thank you for the explanation. From what I can tell, I'm not sure that this could be happening this way:
> the Work Manager library functions by saving scheduled workers in its own database
I don't believe this step would be occurring, as the overridden createWorker function of the custom WorkerFactory fails to create the requested worker, which should mean that the worker couldn't be scheduled since it failed to get created.
@henryglendening-hh I can't find any other cause apart from this. Where else could this worker name be coming from? To verify this, you can check for an entry in the WorkName table in androidx.work.workdb if you are able to reproduce the crash. Attaching a screenshot for reference.
For a permanent solution, I would recommend handling CTFlushPushImpressionsWork in your factory.
@piyush-kukadiya Thank you, and yes, that is the permanent solution that we are targeting 👍 As far as reproducing goes, I'm not sure how to do that. Can you give us any insight into how to achieve this (specifically, how the work for CTFlushPushImpressionsWork gets triggered)?
@piyush-kukadiya And as far as handling CTFlushPushImpressionsWork in our factory goes, what do you recommend? I'm currently implementing the recommendation from an Android engineer at Google, where a DelegatingWorkerFactory is used and my custom WorkerFactory returns null for CTFlushPushImpressionsWork (thus allowing the default WorkerFactory to handle creation of that worker implicitly).
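The approach described in the comment above could look roughly like the sketch below. It is not the reporter's actual code: AppWorkerFactory and buildWorkManagerConfiguration are hypothetical names, and the app-specific worker branches are left as a comment because they are not shown in this thread.

```kotlin
import android.content.Context
import androidx.work.Configuration
import androidx.work.DelegatingWorkerFactory
import androidx.work.ListenableWorker
import androidx.work.WorkerFactory
import androidx.work.WorkerParameters

// Hypothetical stand-in for the app's own factory.
class AppWorkerFactory : WorkerFactory() {
    override fun createWorker(
        appContext: Context,
        workerClassName: String,
        workerParameters: WorkerParameters
    ): ListenableWorker? = when (workerClassName) {
        // App-specific workers would be constructed here, for example:
        // MyWorker::class.java.name -> MyWorker(appContext, workerParameters)
        else -> null // unknown (e.g. third-party) worker: return null instead of throwing
    }
}

// Builds the Configuration handed to WorkManager during custom initialisation.
fun buildWorkManagerConfiguration(): Configuration {
    // DelegatingWorkerFactory asks each added factory in turn and itself
    // returns null when none of them produces a worker.
    val delegatingFactory = DelegatingWorkerFactory().apply {
        addFactory(AppWorkerFactory())
    }
    return Configuration.Builder()
        .setWorkerFactory(delegatingFactory)
        .build()
}
```

The resulting Configuration would typically be supplied through the Application's Configuration.Provider; the exact shape of that interface differs between WorkManager versions, so it is omitted here.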
@henryglendening-hh We have updated our docs here
- CTFlushPushImpressionsWork gets scheduled only when,
- If scheduling succeeds, the work is stored in the Work DB and runs once the constraints below are met:
  - the device is charging, and
  - the device has a network connection (check here).
- The Android docs for WorkerFactory say "The factory is invoked every time a work runs", so we can assume that you received the name of CTFlushPushImpressionsWork in your factory because the work was triggered (check here).
- When a scheduled worker runs based on its constraints, the library initialises the Worker class here, which calls createWorker() of your custom WorkerFactory here. If your createWorker() returns null, it initialises the worker class using reflection, which is the default factory behaviour here.
- What is happening is that, instead of returning null, you are throwing an exception. If you return null, the work library will gracefully handle worker instantiation through reflection, which is not happening here.
- Your current implementation will crash for every third-party dependency that ships Workers unknown to you.
- The ideal solution is to return null for unknown workers. Please check the updated docs, which provide the recommended solution.
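The fallback described in the points above can be paraphrased as the following sketch. It is a simplified illustration written for this thread, not WorkManager's actual source; the function name createWithFallback is hypothetical, and the real library performs additional logging and checks.

```kotlin
import android.content.Context
import androidx.work.ListenableWorker
import androidx.work.WorkerFactory
import androidx.work.WorkerParameters

// Simplified paraphrase of the "factory first, reflection second" behaviour.
fun createWithFallback(
    factory: WorkerFactory,
    appContext: Context,
    workerClassName: String,
    workerParameters: WorkerParameters
): ListenableWorker? {
    // 1. Ask the custom factory first. An exception thrown here is not caught,
    //    which matches the crash in the stack trace earlier in this thread.
    factory.createWorker(appContext, workerClassName, workerParameters)?.let { return it }

    // 2. The factory returned null: try the (Context, WorkerParameters)
    //    constructor of the named class via reflection.
    return try {
        Class.forName(workerClassName)
            .asSubclass(ListenableWorker::class.java)
            .getDeclaredConstructor(Context::class.java, WorkerParameters::class.java)
            .newInstance(appContext, workerParameters)
    } catch (t: Throwable) {
        // Class missing (e.g. after an SDK downgrade) or no matching constructor.
        null
    }
}
```

Under this model, returning null works on 5.1.0 because the class is present and reflection succeeds. On a 5.0.0 build, where the class is absent, the reflective lookup would fail and the result would be null; WorkManager is then expected to mark that work as failed instead of crashing the app, though this should be confirmed against the WorkManager version in use.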
@piyush-kukadiya Thank you! I haven't been able to reproduce triggering the CTFlushPushImpressionsWork worker; however, my patch seems to be successful and has eliminated the crash. The solution we have in place using a DelegatingWorkerFactory also does what you mention in point 7 and the docs, so we've got our bases covered. Thank you for your support triaging and working through this 🙂