warmsound / crystal-face

Garmin Connect IQ watch face

Home Page: https://apps.garmin.com/en-GB/apps/9fd04d09-8c80-4c81-9257-17cfa0f0081b

Crashes on Fenix5 after 15.10 upgrade

ctremlACR opened this issue

Since my Fenix 5x upgraded to 15.10 today the Crystal face crashes seemingly every minute after it is loaded (causing the default watch face to swap back in). I've tested using other developer watch faces and set them up with similar data metrics and I don't see a crash there. Not sure what is happening, but Crystal is an awesome watch face and I'd love to be able to use it again.

I just want to add that the Kudos watch face, which was originally based on the same code, does not have the same problem. I have not worked on Garmin apps before, but if no one starts working on this I may have a look this weekend.

Thanks for the reports, and sorry it's taken a while to reply. If you could attach any CIQ_LOG.yml files containing the crashes, this would help me build up a picture of what is going on.

In the meantime, Brandon from the CIQ team has encountered the problem himself, and kindly provided me the following error log:

Error: Unexpected Type Error
Details: 'Failed invoking <symbol>'
Time: 2019-12-20T22:50:35Z
Part-Number: 006-B2604-00
Firmware-Version: '15.10'
Language-Code: eng
ConnectIQ-Version: 3.1.5
Store-Id: 9fd04d09-8c80-4c81-9257-17cfa0f0081b
Store-Version: 59
Filename: E3A7D938
Appname: Crystal
Stack:
  - pc: 0x100027e2

With the relevant part of the symbol map:

<entry filename="/Users/vince/Documents/Projects/crystal-face/source/CrystalView.mc" id="53" lineNum="397" pc="268445648" symbol="onPartialUpdate"/>
<entry filename="/Users/vince/Documents/Projects/crystal-face/source/CrystalView.mc" id="53" lineNum="400" pc="268445652" symbol="onPartialUpdate"/>
<entry filename="/Users/vince/Documents/Projects/crystal-face/source/CrystalView.mc" id="53" lineNum="401" pc="268445674" symbol="onPartialUpdate"/>

0x100027e2 is 268445666 in decimal, so it appears the problem is indeed on line 400 of CrystalView.mc, which is simply the following within onPartialUpdate():

mDataFields.update(dc, /* isPartialUpdate */ true);
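As a worked check of that lookup, assuming each symbol map entry marks the first pc of its source line (so a line covers every pc up to the next entry):

```math
\begin{aligned}
\mathtt{0x100027e2} &= \mathtt{0x10000000} + \mathtt{0x27e2} = 268435456 + 10210 = 268445666 \\
268445652 \ (\text{line } 400) &\le 268445666 < 268445674 \ (\text{line } 401)
\end{aligned}
```

The crash pc falls between the entries for lines 400 and 401, which is why it is attributed to line 400.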

Brandon has also stated:

I'm not entirely convinced the problem is Crystal because it's only occurring on the 5X, not the 5 or 5S, and it started happening with 15.10 firmware. The 5X has a separate display processor that introduces some complication drawing watch faces.

The original stack looks "trustworthy": I think you'd expect a single entry only, given that onPartialUpdate() is called directly by CIQ.

The likely cause of the error is that mDataFields is null. The variable is a cached reference to the DataFields drawable that is set up in cacheDrawables(), which itself is called from onLayout(), immediately after the setLayout() call.

So, by inspection, there are a couple of reasons I can see for mDataFields being null at this point:

  1. cacheDrawables() has not been called before the first onPartialUpdate() call. This means that either onLayout() has not been called by this point, or that onPartialUpdate() is called somewhere between the start of onLayout() and cacheDrawables(), possibly within the setLayout() call.
  2. cacheDrawables() was called when expected, but View.findDrawableById("DataFields") has returned null when trying to cache the mDataFields reference. This would be the case if, when setLayout() returns, the view was not yet "ready" to return drawables from findDrawableById(), so that the drawable caching would need to be done at a slightly later time.

It would be good to know which (if any) of these two scenarios is actually occurring, and whether the 15.10 firmware is responsible for a change in behaviour that triggers it. With any luck, further crash dumps might give us a clue 🤞 😄

Also just taking a look through the 56(!) e-mails I've received in the last 48 hours to see if there's any additional info there. Many are unsurprisingly from Fenix 5X users on 15.10...

Many users are reporting that the crash happens 10 seconds after the watch face starts up.

I'd like to consider an urgent fix to get affected users up and running ASAP, but we'll have to tread carefully.

If scenario 1 is the case (onPartialUpdate() called before cacheDrawables()), then a simple null check for mDataFields, and mTime, within onPartialUpdate() should suffice: early call(s) to onPartialUpdate() will do nothing, while later calls (after cacheDrawables() is called) will start to work automatically. This check will increase the partial update execution time slightly.
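A minimal sketch of what such a guard could look like, not the code that was actually released. The member names mDataFields and mTime and the DataFields update() call come from this thread; the class name GuardedView, the "Time" drawable id, and the drawSeconds() method are assumptions for illustration. It also assumes a layout resource named WatchFace containing both drawables.

```monkeyc
using Toybox.WatchUi as Ui;

// Hypothetical view illustrating the scenario 1 guard.
class GuardedView extends Ui.WatchFace {

	// Cached drawable references; both stay null until cacheDrawables() has run.
	hidden var mDataFields;
	hidden var mTime;

	function initialize() {
		WatchFace.initialize();
	}

	function onLayout(dc) {
		setLayout(Rez.Layouts.WatchFace(dc));
		cacheDrawables();
	}

	function cacheDrawables() {
		mDataFields = View.findDrawableById("DataFields");
		mTime = View.findDrawableById("Time"); // assumed drawable id
	}

	function onPartialUpdate(dc) {
		// Early calls (before caching has succeeded) do nothing instead of crashing.
		if (mDataFields == null || mTime == null) {
			return;
		}

		mDataFields.update(dc, /* isPartialUpdate */ true);
		mTime.drawSeconds(dc, /* isPartialUpdate */ true); // assumed method name
	}
}
```

Once cacheDrawables() has populated both references, later partial updates run as before, at the cost of two comparisons per call.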

But if scenario 2 is true, then the above fix alone will just result in non-functioning partial updates. If we can assume that drawables will be available via findDrawableById() at the very latest by the start of onUpdate(), then we could check at the start of onUpdate() whether drawables have been successfully cached yet, and if not, do so immediately, before any attempts to use the cached references during the full update cycle. Such a check should not cause a significant performance loss during the full update cycle.
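For scenario 2, the deferred caching could be sketched roughly as follows. This is an illustration under the stated assumption that findDrawableById() works by the time onUpdate() starts; the class name LazyCacheView is hypothetical, and a layout resource named WatchFace with a DataFields drawable is assumed.

```monkeyc
using Toybox.WatchUi as Ui;

// Hypothetical view illustrating a retry of drawable caching in onUpdate().
class LazyCacheView extends Ui.WatchFace {

	hidden var mDataFields;

	function initialize() {
		WatchFace.initialize();
	}

	function onLayout(dc) {
		setLayout(Rez.Layouts.WatchFace(dc));
		cacheDrawables(); // may leave mDataFields null if the view is not "ready"
	}

	function cacheDrawables() {
		mDataFields = View.findDrawableById("DataFields");
	}

	function onUpdate(dc) {
		// If the first caching attempt came back empty, try again before
		// anything in the full update cycle uses the cached reference.
		if (mDataFields == null) {
			cacheDrawables();
		}

		// Let the layout draw its drawables.
		View.onUpdate(dc);
	}
}
```

The retry is a single null comparison on the full update path, which runs far less often than onPartialUpdate().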

The fix for scenario 1 increases partial update execution time by 31ms (simulator, Fenix 5X, SDK 3.1.5), or a fraction of 1% of the total time, so negligible.

I've released 2.4.1 with the above fix, and will observe whether this fixes the issue. I've also added logging (enabled in the release version) to cacheDrawables() that will log if mDataFields is set to null.

I notice that function calls setHideSeconds(), which uses the mTime and mDrawables[:MoveBar] cached references, but the crash is not in that function. So it's possible that not all drawable references are similarly affected - is there something special about the DataFields drawable?

The 2.4.1 fix attempt has not worked: at least one confirmed 5X user is still experiencing a crash after updating. Will ask users for CIQ_LOG.yml.

If you're a Fenix 5X owner experiencing the crash, please attach your CIQ_LOG.yml file here, and I'll investigate ASAP.

@warmsound My crash from yesterday looks identical to what you had posted earlier (see below). I just built your latest code and sideloaded it onto the device. The behavior definitely changed. It used to swap to a different face. Now the screen goes fully black and I didn't get a new entry in CIQ_LOG.yml. I'll add some prints and see what I can find.

Error: Unexpected Type Error
Details: 'Failed invoking <symbol>'
Time: 2019-12-20T10:11:45Z
Part-Number: 006-B2604-00
Firmware-Version: '15.10'
Language-Code: eng
ConnectIQ-Version: 3.1.5
Store-Id: 9fd04d09-8c80-4c81-9257-17cfa0f0081b
Store-Version: 59
Filename: 9CJB5148
Appname: Crystal
Stack:
  - pc: 0x100027e2

@warmsound FYI, your print statement is printing "cacheDrawables(): mDataFields is null" so you are definitely getting a null out of View.findDrawableById("DataFields");

Note that on the latest version, switching to the next widget and back makes the face reappear, but then you get null again after a few seconds and the face goes back to black.

Edit 2: The initialize routine of DataFields is getting called with the parameters specified in layout.xml but then View.findDrawableById returns null anyway.

@jeriveraf, that's really useful information - thanks for this!

So it sounds like 2.4.1 is at least not crashing, and that mDataFields really is null.

Short of a serious CIQ bug, I was struggling to see how the DataFields drawable was missing from the view, given that it's part of the layout!

One slightly far-fetched reason could be that the AlwaysOn code that I added exclusively for the Venu watch is somehow being activated for 5X. If that's the case, Crystal might be attempting to switch to the AlwaysOn layout, which is empty for all watches except for Venu. When cacheDrawables() is called, it will then nullify all drawable references. This would be consistent with the watch face going black after a few seconds: if it were only the partial updates that were failing, I'd expect to see most of the usual watch face visible, but perhaps with the data fields (e.g. live HR) and time (seconds) failing to update; the black might be Crystal switching to an empty layout. If the Kudos watch face was forked before I introduced Venu support to Crystal, this would explain why Kudos is unaffected by the latest 5X firmware.

I've got it working but I'm trying to understand why. The reason for the screen going black has something to do with your burn in protection code (which is not in Kudos). I commented out the line that turns on burn in protection on sleep and now the face is updating as normal.

@jeriveraf, if you're able, could you add trace to determine whether the following if clause is entered in onEnterSleep():

// If watch requires burn-in protection, set flag to true when entering sleep.
var settings = Sys.getDeviceSettings();
if (settings has :requiresBurnInProtection && settings.requiresBurnInProtection) {
	mIsBurnInProtection = true;
	mBurnInProtectionChangedSinceLastDraw = true;
}

and likewise, whether this if clause is entered within onUpdate():

// If burn-in protection has changed, set layout appropriate to new burn-in protection state.
// If turning on burn-in protection, free memory for regular watch face drawables by clearing references. This means that
// any use of mDrawables cache must only occur when burn in protection is NOT active.
// If turning off burn-in protection, recache regular watch face drawables.
if (mBurnInProtectionChangedSinceLastDraw) {
	mBurnInProtectionChangedSinceLastDraw = false;
	setLayout(mIsBurnInProtection ? Rez.Layouts.AlwaysOn(dc) : Rez.Layouts.WatchFace(dc));
	cacheDrawables();
}

The answer should be "no" in both cases for 5X, but if the new firmware for some reason sets DeviceSettings.requiresBurnInProtection to true, then this would explain a lot! Incidentally, I notice Venu owners are also reporting crashes with latest firmware, so it's possible that this area has changed recently.
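One way the requested trace in onEnterSleep() might be added, as a sketch only: the class name TraceView and the log wording are made up for illustration, while Sys.getDeviceSettings() and the requiresBurnInProtection check mirror the snippet above.

```monkeyc
using Toybox.System as Sys;
using Toybox.WatchUi as Ui;

// Hypothetical view that only reports what the firmware says about burn-in protection.
class TraceView extends Ui.WatchFace {

	function initialize() {
		WatchFace.initialize();
	}

	function onEnterSleep() {
		var settings = Sys.getDeviceSettings();

		// Same condition as the real if clause, with a print on each branch.
		if (settings has :requiresBurnInProtection && settings.requiresBurnInProtection) {
			Sys.println("onEnterSleep(): burn-in protection clause entered");
		} else {
			Sys.println("onEnterSleep(): burn-in protection clause NOT entered");
		}
	}
}
```

A similar print inside the mBurnInProtectionChangedSinceLastDraw clause in onUpdate() would answer the second question.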

Sorry, crossed comments. From your previous comment, it sounds likely that requiresBurnInProtection has accidentally been set to true in the latest 5X firmware - would be great if you could confirm this. (I presume the new firmware doesn't upgrade the screen from transflective to OLED 😆)

lol. Yeah, Garmin definitely screwed up and is telling you that the watch needs burn in protection. Your code tries changing to the AlwaysOn layout which is not defined for this watch which leads to the null pointer on the DataField. I've verified all this with print statements.

You have to be kidding me 😄 OK, thanks so much for confirming this - it would have been very difficult for me to do this without the real hardware. 5X owners owe you one, and so do I!

So I'd best get a workaround in, as Brandon's out for Christmas, and I'm not sure when the CIQ team will be able to get another firmware out (I'll try and contact the team regardless). In the meantime, I could add an additional test for screen size (only Venu is 390*390px AFAIK) before dropping into always-on mode. I should also be able to revert the "fix" I made earlier today.
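A rough sketch of how that screen size test might be combined with the existing flag check. It is not the released workaround: the class name WorkaroundView is hypothetical, and the 390x390 figure is the "AFAIK" assumption from this comment. The two member flags are the ones from the onEnterSleep() snippet earlier in the thread.

```monkeyc
using Toybox.System as Sys;
using Toybox.WatchUi as Ui;

// Hypothetical view illustrating the screen size workaround.
class WorkaroundView extends Ui.WatchFace {

	hidden var mIsBurnInProtection = false;
	hidden var mBurnInProtectionChangedSinceLastDraw = false;

	function initialize() {
		WatchFace.initialize();
	}

	function onEnterSleep() {
		var settings = Sys.getDeviceSettings();

		// Only trust requiresBurnInProtection on a 390x390 screen (assumed to be
		// Venu only), so a wrongly set flag on other devices is ignored.
		if (settings has :requiresBurnInProtection
			&& settings.requiresBurnInProtection
			&& (settings.screenWidth == 390)
			&& (settings.screenHeight == 390)) {
			mIsBurnInProtection = true;
			mBurnInProtectionChangedSinceLastDraw = true;
		}
	}
}
```

On a 5X that wrongly reports the flag, the size test fails, so the view never switches to the empty AlwaysOn layout.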

Thanks again!

Yes, I reverted your fix from earlier today and as long as I have the burn in protection disabled everything works fine. Thank you for responding so quickly.

Brilliant - thanks for confirming that 👍

Fenix 5X's 15.10 firmware was released just a day before Venu's 3.80 firmware, and Crystal is crashing on both. I'll try and determine if requiresBurnInProtection is set correctly for Venu. Could the flags have been "swapped" for the two firmwares?

Looks like Garmin had already raised WERETECH-8149 for the requiresBurnInProtection issue, two months ago. Issue was originally reported against 5X firmware 14.72 Beta. See here (including some strangely familiar code):
https://forums.garmin.com/developer/connect-iq/i/bug-reports/requiresburninprotection-returns-true-on-fenix-5x-14-72-beta

I just pulled your change from the store, it's working perfectly so far. Thanks for the credit on the change log.

Thanks for confirming 👍 And no worries - credit where credit's due.

Thanks for taking the time to do a thorough investigation. Looks like my initial hypothesis was incorrect, but I'm glad you were able to get to the bottom of it. :) I'll work with the device team to make sure a fix is put in place in a future firmware release.

Confirmed as fixed in 5X firmware 15.40 10 months ago, so the code workaround (screen size test) can now be removed, just in time for adding much overdue Venu Sq support!