s00500 / ESPUI

A simple web user interface library for ESP32 and ESP8266

Home Page: https://valencia.lbsfilm.at/midterm-presentation/

Button causes Heap-Fragmentation

JD-GG opened this issue

commented

The bug
My ESP8266 crashes because of a memory issue after pressing a button on the UI.
I rewrote the functions behind the buttons about 10 times with no luck.
After trying to identify what was causing the crash, I found that the ESPUI functions are the problem (or at least part of it).
After realizing that I have Heap-Fragmentation, I removed every Arduino String out of my code. (I magically still have Heap-Fragmentation)

To Reproduce

  1. Download .ino files in this repository ESPUI-Issue193
  2. Run prepareEEPROM.ino
  3. Run async_word_clock.ino
  4. Connect to UI over WLAN at 192.168.4.1
  5. Press multiple "Aktivieren"/"Überschreiben" buttons a bunch of times (my ESP8266 crashes after about 3 activations)
  6. ESP crashes because of EXCCAUSE 3 or 28 (sometimes the watchdog just resets the ESP8266)

Expected behavior
It shouldn't crash. I don't know why it still crashes. I spent the last ~3 months trying to find out why, but got nowhere. I don't want to give up because I already built a damn working RGB clock.

Desktop

  • OS: [Windows 10]
  • Browser: [Opera]

Additional context
The code controls the LEDs of an RGB clock. It can also save colors to EEPROM.
The button "Aktivieren" executes the presetAktivate function.
The button "Überschreiben" executes the presetOverwrite function.

Very interestingly, if you strip the presetAktivate function down to the bare minimum, so that it only declares integers and char arrays and then calls ESPUI functions, the ESP8266 still crashes.
Like this:

void presetAktivate(Control* sender, int value) // Gets the saved color from the label
{
    if(value == B_DOWN)
    {
      int i = static_cast<int>(sender->id - preset[0]);//Account for already existing IDs
      i /= 3;//Divide by number of IDs => Array Element

      char hue[4] = {'1'};
      char saturation[4] = {'1'};
      char brightness[4] = {'1'};
      
      ESPUI.updateControlValue(hueSliderId, hue);
      ESPUI.updateControlValue(saturationSliderId, saturation);
      ESPUI.updateControlValue(brightnessSliderId, brightness);
      
      Control* temp = ESPUI.getControl(hueSliderId);
      //hueSlider(temp, 0);//Updates SaturationSlider to the correct Color
    }
}

But if you remove any ESPUI functions, the button does not crash the ESP8266.
Like this:

void presetAktivate(Control* sender, int value) // Gets the saved color from the label
{
    if(value == B_DOWN)
    {
      int i = static_cast<int>(sender->id - preset[0]);//Account for already existing IDs
      i /= 3;//Divide by number of IDs => Array Element

      char hue[4];
      char saturation[4];
      char brightness[4];
      getPresetEEPROM(hue,saturation,brightness,i);//Gets EEPROM values and cuts them into 3 readable strings
    }
}

Also, if you have a sketch with only the buttons the ESP8266 does not crash. It only happens in the big sketch.

I don't own an ESP32 with more memory.
It's a work in progress, so please be nice :)

Thanks so much in advance guys! Any idea how to prevent crashes in the future would mean so much to me.

I briefly looked at your code. I saw that you are calling some css functions before ESPUI.begin. I don't think this is supported.

What I mean is that ESPUI.addControl() functions are called before ESPUI.begin(). This creates the GUI control objects. Now that the controls exist you can safely call your css functions (ESPUI.setPanelStyle, ESPUI.setElementStyle, etc.).

So go back and re-arrange the functions. This might not solve the crash problem, but should help avoid future frustration.

  • Thomas

Hi all,

No Thomas, calling setPanelStyle, setElementStyle etc. before begin is fine. See the completeExample for examples of this.

My immediate thought is the implicit String generation. updateControlValue takes a reference to a String, so this:

char brightness[4] = {'1'};
ESPUI.updateControlValue(brightnessSliderId, brightness);

is implicitly creating a String object each time which can lead to fragmentation. Is this a problem? I don't know. But it is the sort of thing I avoid. I'd make them Strings, and make them static.

However, exceptions 3 and 28 are both quite fruity errors to get just because of running out of RAM, so I suspect that what you have narrowed this down to is a red herring. You need to decode the stack trace. If you haven't already, decode the stack trace and determine exactly what it is doing when the crash occurs.

I'll continue to have a skim to see if there is anything obvious, but the stack trace would be helpful.

OK I am also getting super itchy about your preset array. It might all be correct, but if there is a subtle error somewhere then all hell will break loose. You have wisely guarded your EEPROM functions by checking that i is in the right range, but not this:

ESPUI.setElementStyle(preset[i], hexArray);

You are using the preset array to get either a 0, 1, or 2 value from three buttons all using the same callback. However if the IDs from ESPUI are not contiguous then this all explodes. I presume they are, but this is not a guarantee and might not be in the future, and will not be now if you ever use ESPUI.removeControl.

So @MartinMueller2003 recently added a feature that can do this properly by associating custom values with controls. So you could pull the latest and look over that if you want.

Alternatively, at a minimum I'd drop some more guards in there to check that i always has a valid value. (In the future, if you want to do this sort of thing, you can use a std::map. Perhaps overkill here though ;))
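
For anyone reading later, the std::map idea could look roughly like the sketch below. This is only an illustration: registerPresetButton and presetIndexFor are made-up helper names, and the control IDs are assumed to be the uint16_t values handed back when each button is created.

```cpp
#include <cstdint>
#include <map>

// Hypothetical registry: control ID -> preset slot. Filled once during setup,
// so the callback never has to assume that IDs are contiguous.
static std::map<uint16_t, int> presetIndexById;

// Call once per button, right after the button has been created.
void registerPresetButton(uint16_t controlId, int presetIndex) {
    presetIndexById[controlId] = presetIndex;
}

// Returns the preset slot for this control, or -1 if the ID is unknown.
int presetIndexFor(uint16_t controlId) {
    auto it = presetIndexById.find(controlId);
    return it == presetIndexById.end() ? -1 : it->second;
}
```

Inside the callback you would then look up the slot with presetIndexFor(sender->id) and return early on -1, instead of subtracting preset[0] and dividing by 3.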

Again, I don't know if this is your problem, but it is at least a problem. ;)

I can tell you for sure that the Control IDs are NOT contiguous, and if you see them come to you as a contiguous number array then it is an accident and should not be counted on as part of your implementation. I can also tell you that once the UI is set up, it is not allocating anything on the heap as part of a callback. That means a change in button settings will not cause the UI to use more memory or to grab and release bits of memory. However, it is possible that your callback function is doing odd things with memory, and putting the memory used to hold control elements in a static variable is a good suggestion.

commented

I'd make them Strings, and make them static.

Because I am using them as buffers to get values from EEPROM, I can't make them static.
Also, I switched to char arrays to avoid using Strings. (Because these are strings and not Strings, right?)

You need to decode the stack trace.

I can attach some if you like (the stack traces are in the repo). But I am not able to make anything of them. Either the ESP is doing some kind of WiFi thing or it's an ESPUI function that is reallocating a String object. I am just not good enough to understand it fully.

Alternatively, at a minimum I'd drop some more guards in there to check that i always has a valid value.

I did that, thank you. But after testing, this does not seem to be the issue.

I can tell you for sure that the Control IDs are NOT contiguous

That's why I am not going to implement custom values with controls now. This approach has never failed for me, and if it stops working in the future I will change the code then. I am looking to complete this project in 3 days, so it's just not worth the effort right now.

it is possible that your callback function is doing odd things with memory, and putting the memory used to hold control elements in a static variable is a good suggestion

I also did that, thanks. But the problem did not go away.
However, I am going to try to directly change the values without the use of an ESPUI function

control->value = value;
updateControl(control, clientId);

to work around said "creation of String objects".

Because I am using them as buffers to get values from EEPROM, I can't make them static.
Also, I switched to char arrays to avoid using Strings. (Because these are strings and not Strings, right?)

Please don't confuse static with const. Just because a String is static does not mean it is immutable. It just happens to never go away, which means it can be used for the label of a control. And ESPUI uses "String", not "std::string". If you are worried about going back to the heap for different sizes of strings, you can always use the String's reserve() feature, and then you would not reallocate the string buffer unless your string got bigger than the reserved buffer.

I can attach some if you like (the stack traces are in the repo). But I am not able to make anything of them. Either the ESP is doing some kind of WiFi thing or it's an ESPUI function that is reallocating a String object. I am just not good enough to understand it fully.

Those are ESP8266 stack traces, they ARE pretty useless. The most likely path is a WiFi Receive operation triggers a series of network operations that end up in a netsocket callback in ESPUI that results in a callback to your code. There are a ton of places where things can go wrong. Most often the issue is a heap underrun problem (fragmentation contributes, but the heap management is pretty good). If a 2nd request arrives while the first is still processing then the ESP8266 will die. It just does not have enough ram to process more than one message at a time.

If you really want to see if you have a heap fragmentation problem, print the current heap size (ESP.getFreeHeap()) and then output the results of ESP.getHeapFragmentation() and ESP.getMaxFreeBlockSize(). If these are dropping then you have heap issues. Post the results from before and after your callback.

Ah Martin has already replied as I was typing. For posterity here is mine!

Because I am using them as buffers to get values from EEPROM, I can't make them static.

The static storage class doesn't mean const, it means that it is not on the stack. In this example:

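void myExample() {
  static String buffer;  // constructed once, lives for the whole program
  buffer.reserve(10);    // note: String(10) would create the text "10", not a 10-byte buffer
}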

This program has one buffer somewhere in memory. If myExample() is called twice, buffer will retain its contents because it is the same memory. If two threads called myExample() simultaneously then they will point to the same memory. So by making them static you can't call myExample() from two places at the same time (not an issue for you) but the same String object will be reused, it doesn't have to be constantly allocated and deallocated.
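
A small desktop sketch of that reuse, for illustration only. It uses std::string as a stand-in for Arduino's String so it runs on a PC, and formatChannel is a made-up name, not something from ESPUI or the sketch in question.

```cpp
#include <cstdio>
#include <string>

// std::string stands in for Arduino's String so this compiles on a desktop.
// The function-local static is constructed on the first call and then reused.
const std::string& formatChannel(int value) {
    static std::string buffer;   // one object for the whole program
    char tmp[12];                // scratch space on the stack, big enough for any int
    std::snprintf(tmp, sizeof(tmp), "%d", value);
    buffer.assign(tmp);          // contents change on every call: static is not const
    return buffer;               // same object (same address) every time
}
```

Calling formatChannel(1) and then formatChannel(255) returns a reference to the same object both times; only its contents differ. That is the sense in which a static buffer can still be used to carry fresh EEPROM values on every button press.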

Also, I switched to char arrays to avoid using Strings

Yes but look at the signature of updateControlValue().

void updateControlValue(uint16_t id, const String& value, int clientId = -1);

It takes a reference to a String, but you're passing a char * so a String is being implicitly constructed and referenced.
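
To make that visible, here is a minimal desktop sketch. CountingString and updateValue are hypothetical stand-ins (they are not Arduino's String or ESPUI's updateControlValue); the only thing they share with the real code is the shape of the call, a const reference parameter being fed a char array. Whether such a temporary actually touches the heap depends on the String implementation, so this only counts constructions.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a string class that is implicitly
// constructible from a C string. It counts how often that happens.
struct CountingString {
    static int builtFromCString;
    std::string data;
    CountingString(const char* s) : data(s) { ++builtFromCString; }
};
int CountingString::builtFromCString = 0;

// Same parameter shape as a function taking `const String&`.
std::size_t updateValue(const CountingString& value) {
    return value.data.size();
}
```

Passing a char array such as `char hue[4] = "1";` to updateValue builds a new temporary CountingString on every call, so the counter goes up each time. Passing an existing CountingString object binds the reference directly and the counter stays where it is.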

I don't think ESP8266 stack traces are useless, because if it is your code crashing directly (or ESPUI!) then we'd see that. We don't, so therefore it does seem to be related to the internals of the network stack and so therefore related to memory overflowing or something along those lines.

commented

If you really want to see if you have a heap fragmentation problem, print the current heap size (ESP.getFreeHeap()) and then output the results of ESP.getHeapFragmentation() and ESP.getMaxFreeBlockSize(). If these are dropping then you have heap issues. Post the results from before and after your callback.

I am already doing that :) See the PNG (the ESP crashed shortly after).

The static storage class doesn't mean const, it means that it is not on the stack.

Yeah, OK, sorry, I confused that right there. But I don't really care if it's on the stack or not. As long as it's not on the heap, I don't think this is an issue for me. If my Max Free Block Size is 1768 bytes instead of ~45000, then I should rather focus on that, right?

It takes a reference to a String, but you're passing a char * so a String is being implicitly constructed and referenced.

Do you think my idea in the previous comment will circumvent this issue?

But I don't really care if it's on the stack or not. As long as it's not on the heap

So I am not 100% certain on all of C++'s allocation rules, but my assumption would be that the implicit String constructor that you're calling would indeed result in allocating on the heap.

Do you think my idea in the previous comment will circumvent this issue?

It would.

commented

Ok presetAktivate now looks like this:

void presetAktivate(Control* sender, int value) // DONE: gets the saved color from the label
{
    if(value == B_DOWN)
    {
      int i = static_cast<int>(sender->id - preset[0]);//Account for already existing IDs
      i /= 3;//Divide by number of IDs => Array Element
      if(i<0||i>=presetNum){
        Serial.println(F("prAktivate i not in range!"));
        return;
      }
      static Control* huePtr = ESPUI.getControl(hueSliderId);
      static Control* saturationPtr = ESPUI.getControl(saturationSliderId);
      static Control* brightnessPtr = ESPUI.getControl(brightnessSliderId);
      
      char hue[4];
      char saturation[4];
      char brightness[4];
      getPresetEEPROM(hue,saturation,brightness,i);//Gets EEPROM values and cuts them into 3 readable Strings

      huePtr->value = hue;
      ESPUI.updateControl(huePtr);
      saturationPtr->value = saturation;
      ESPUI.updateControl(saturationPtr);
      brightnessPtr->value = brightness;
      ESPUI.updateControl(brightnessPtr);
      
      Control* temp = ESPUI.getControl(hueSliderId);
      hueSlider(temp, 0);//Updates SaturationSlider to the correct Color
    }
}

This did run a lot better for a short time but then crashed (after 40 activations, but that sure as hell is better than 1).
[Screenshot 2022-07-28 150019]
Why is there so much String stuff going on? Maybe I need to skip the char arrays altogether and paste the EEPROM contents directly into the Strings.

What is also unexplainable is that in a previous version, loading a blank sketch on the ESP and then loading the main sketch afterwards completely got rid of any issues. Until it restarted...

Why is there so much String stuff going on?

Because ESPUI sends updates as JSON to the web client, JSON that is assembled by the ArduinoJson library, which makes use of String. There is nothing wrong with using String in itself, one just has to be careful not to allocate it in tight loops etc. like a lot of beginner Arduino programmers do.

That stack trace really is suggesting to me that "you're out of RAM", although I will admit I've not had too much experience with debugging that on the 8266. Skimming over the rest of your code you aren't doing anything particularly egregious.

If you remove half the UI do the problems go away?

commented

No, commenting out half of the UI doesn't work.

Do you think it would be better to remove every Serial.print()? I tried that once and I think it ran better, but it's hard to know without feedback.
I ask because sometimes it crashes while calling EspClass::getHeapFragmentation().

No, commenting out half of the UI doesn't work.

So I suggested this just because if you were running out of RAM this would be a quick hack to check it.

I've not run that DNS server alongside my stuff. Can you try taking that out temporarily?

The key is here: 1768. That means the web socket cannot allocate another buffer to send or receive, and it will begin to have all sorts of system problems and random crashes. I went to great lengths in the ESPixelStick code to remove ALL TEXT STRINGS from RAM. Every time you use "Some text here" you use RAM. Doing the same thing using F("Some text here") makes any RAM usage transient and is fine if a string is used in only one place. Moving the string to a const char foo[] PROGMEM = "Some text here" reduces the flash size significantly and reduces the RAM used to zero.

Next, the refactored version of ESPUI (still under review) significantly reduces the instantaneous RAM usage by ESPUI and may help reduce your crashing further. However, it still looks like you have a memory leak someplace, because you should never let your system get down to 1700 bytes of free RAM.

commented

I've not run that DNS server alongside my stuff. Can you try taking that out temporarily?

I was under the impression that it was necessary for ESPUI to work because of this example

you should never let your system get down to 1700 bytes of free ram

I am going to order an ESP32 and just hope that the code only crashes once in a blue moon.

Moving the string to a const char foo[] PROGMEM = "Some text here" reduces the flash size significantly and reduces the RAM used to zero.

I tried that but didn't get it to work using the Arduino example. Still going to try that again later.

commented

Could I change this setting to increase the heap?

[image]

You will notice that they all add up to 64k. Yes, you can make a custom setting that gives more memory to the heap.

I think you should give my new changes a try. They significantly reduce the level of traffic on the network.
https://github.com/MartinMueller2003/ESPUI

Yeah sorry I've not got around to looking at your stuff Martin. It's been very busy at work so I've been getting home and slobbing instead of looking at PRs :)

Yeah sorry I've not got around to looking at your stuff Martin. It's been very busy at work so I've been getting home and slobbing instead of looking at PRs :)

No worries. It's just been sitting for a while, and I have been using it without issue, so I am really comfortable that the changes are an improvement. And for things like this issue, it reduces the number of ws messages from a max of 8 large messages to one small trigger message.

@JD-GG Looking at the heap size output, there are a few markers needed to make it possible to interpret. You need something that indicates you are beginning your function, output the heap data, perform your operations, output the heap data again, output a closing marker. That way we can see if the issue happened in your function or outside your function. Something definitely turned your ram into swiss cheese, but it is difficult to tell what did it. If it was the button pushes, then there should have been a drift towards fragmentation with every activation. That looks like a step off a cliff. One iteration everything was fine, then next it was toast.

commented

Well this sure is interesting.

-Start- Heap Fragmentation (%): 1
Free Heap : 34456
Max Free Block Size : 34144
XX0255255
-afterEEPROM- Heap Fragmentation (%): 1
Free Heap : 34456
Max Free Block Size : 34144
-End- Heap Fragmentation (%): 8
Free Heap : 32520
Max Free Block Size : 30088

void presetAktivate(Control* sender, int value) // DONE: gets the saved color from the label
{
    if(value == B_DOWN)
    {
      Serial.print(F("-Start- Heap Fragmentation (%): "));
      Serial.println(ESP.getHeapFragmentation());
      Serial.print(F("Free Heap : "));
      Serial.println(ESP.getFreeHeap());
      Serial.print(F("Max Free Block Size : "));
      Serial.println(ESP.getMaxFreeBlockSize());
      
      int i = static_cast<int>(sender->id - preset[0]);//Account for already existing IDs
      i /= 3;//Divide by number of IDs => Array Element
      if(i<0||i>=presetNum){
        Serial.println(F("prAktivate i not in range!"));
        return;
      }
      
      char hue[4];
      char saturation[4];
      char brightness[4];
      getPresetEEPROM(hue,saturation,brightness,i);//Gets EEPROM values and cuts them into 3 readable strings
      
      Serial.print(F("-afterEEPROM- Heap Fragmentation (%): "));
      Serial.println(ESP.getHeapFragmentation());
      Serial.print(F("Free Heap : "));
      Serial.println(ESP.getFreeHeap());
      Serial.print(F("Max Free Block Size : "));
      Serial.println(ESP.getMaxFreeBlockSize());
      
      huePtr->value = hue;
      ESPUI.updateControl(huePtr);
      saturationPtr->value = saturation;
      ESPUI.updateControl(saturationPtr);
      brightnessPtr->value = brightness;
      ESPUI.updateControl(brightnessPtr);
      
      Control* temp = ESPUI.getControl(hueSliderId);
      hueSlider(temp, 0);//Updates SaturationSlider to the correct Color
      
      Serial.print(F("-End- Heap Fragmentation (%): "));
      Serial.println(ESP.getHeapFragmentation());
      Serial.print(F("Free Heap : "));
      Serial.println(ESP.getFreeHeap());
      Serial.print(F("Max Free Block Size : "));
      Serial.println(ESP.getMaxFreeBlockSize());
      Serial.println();
    }
}

It takes a reference to a String, but you're passing a char * so a String is being implicitly constructed and referenced.

I guess that is the String object generation @iangray001 was talking about.

commented

I rewrote the function again so that it writes the EEPROM contents directly into the values stored in the control elements.
After further testing, I conclude the only thing that is consistent is the ESP dies. Every time.
The results are mixed. Sometimes you go into the function with more heap fragmentation and come out with less.

-Start- Heap Fragmentation (%): 22
Free Heap : 36048
Max Free Block Size : 2200
XX0255255
-afterEEPROM- Heap Fragmentation (%): 22
Free Heap : 36048
Max Free Block Size : 2200
-End- Heap Fragmentation (%): 12
Free Heap : 34112
Max Free Block Size : 2200

And sometimes it's the other way around.

-Start- Heap Fragmentation (%): 6
Free Heap : 37368
Max Free Block Size : 35400
XX0255255
-afterEEPROM- Heap Fragmentation (%): 6
Free Heap : 37368
Max Free Block Size : 35400
-End- Heap Fragmentation (%): 8
Free Heap : 35344
Max Free Block Size : 32744

I'll update the Issue-Repository with my current code.

Often enough it crashes while updating the control elements.

-Start- Heap Fragmentation (%): 5
Free Heap : 37280
Max Free Block Size : 35656
404255255
-afterEEPROM- Heap Fragmentation (%): 5
Free Heap : 37280
Max Free Block Size : 35656
--------------- CUT HERE FOR EXCEPTION DECODER ---------------
Exception (28):

For a long time, I thought that the problem was updating multiple control elements in rapid succession. Could this be the root of the problem or should I just drop this hypothesis?

So you have run head first into one of the issues my branch fixes. Currently on the ESP8266 there is a max number of 8 updates you can perform before doing a yield. Each of those updates uses a significant chunk of memory. After that your updates are lost. The problem is the WebSocket layer is running in the same thread as your process and cannot actually send the messages until you exit the loop function or exit the callback. That is why you are seeing the heap go down and the fragmentation go up over the course of the execution of the function. My version limits the impact of this and makes the number of updates per callback unlimited. Alternatively you can do a yield after each update to allow the WebServer time to process your requests and release the memory. On the ESP you can do a yield by doing a usleep(10) but you run the risk of reentrancy issues (another invocation of the callback while you are in the callback).

Back to your results. While the use of heap in the callback is understood, the real interesting information is going to be the state of the heap between invocations of your callback.

That means the interesting data is: For 10 button pushes:

  • What was the exit heap status of the previous invocation.
  • What is the entry heap status on the current invocation.
  • Repeat until crash

If we do not see a recovery between invocations then something is killing your heap.
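
If it helps to check the logs mechanically, the comparison described above could be sketched like this. HeapLog and countMissedRecoveries are made-up names, and treating "recovery" as "free heap at entry is higher than free heap at the previous exit" is my simplifying assumption; it ignores fragmentation and max free block size.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One callback invocation: free heap (in bytes) logged at entry and at exit.
struct HeapLog {
    uint32_t entryFree;
    uint32_t exitFree;
};

// Counts invocations that started with no more free heap than the previous
// invocation ended with, i.e. nothing was released in between.
std::size_t countMissedRecoveries(const std::vector<HeapLog>& logs) {
    std::size_t missed = 0;
    for (std::size_t i = 1; i < logs.size(); ++i) {
        if (logs[i].entryFree <= logs[i - 1].exitFree) {
            ++missed;
        }
    }
    return missed;
}
```

As an illustrative reading: an exit value of 35664 followed by an entry value of 37600 counts as a recovery, while an exit value of 35000 followed by an entry value of 35000 counts as a missed one.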

commented

-Start- Heap Fragmentation (%): 5
Free Heap : 37880
Max Free Block Size : 35992
586255255
-afterEEPROM- Heap Fragmentation (%): 5
Free Heap : 37880
Max Free Block Size : 35992
-End- Heap Fragmentation (%): 8
Free Heap : 35664
Max Free Block Size : 33096

-Start- Heap Fragmentation (%): 12
Free Heap : 37600
Max Free Block Size : 33096
340255255
-afterEEPROM- Heap Fragmentation (%): 12
Free Heap : 37600
Max Free Block Size : 33096
-End- Heap Fragmentation (%): 9
Free Heap : 35664
Max Free Block Size : 32448

-Start- Heap Fragmentation (%): 11
Free Heap : 37600
Max Free Block Size : 33744
999255255
-afterEEPROM- Heap Fragmentation (%): 11
Free Heap : 37600
Max Free Block Size : 33744
-End- Heap Fragmentation (%): 8
Free Heap : 35664
Max Free Block Size : 33096

-Start- Heap Fragmentation (%): 5
Free Heap : 37600
Max Free Block Size : 35992
586255255
-afterEEPROM- Heap Fragmentation (%): 5
Free Heap : 37600
Max Free Block Size : 35992
-End- Heap Fragmentation (%): 8
Free Heap : 35664
Max Free Block Size : 33096

-Start- Heap Fragmentation (%): 12
Free Heap : 37608
Max Free Block Size : 33096
340255255
-afterEEPROM- Heap Fragmentation (%): 12
Free Heap : 37608
Max Free Block Size : 33096

 ets Jan  8 2013,rst cause:4, boot mode:(3,0)

wdt reset

Can't get it to do more than 5 :(

I found that most of the time it crashes on the 2nd ESPUI.updateControl.
I am going to try to download your branch and test it.
I am also going to test it with only updating one slider.

And what about that DNSServer?

commented

OKAY. So after doing this:

I am also going to test it with only updating one slider.

I was not able to crash my sketch with only activating colors. (Got about 100 activations before I stopped)

I also found that I have an issue with my EEPROM data getting corrupted, which probably led to some crashes.

After downloading your repo and reversing those changes I mentioned above, I can conclude that my issue is gone.

Holy frick. Didn't think that it would be so easy. @MartinMueller2003 thank you so so so so much. My girlfriend's birthday is in 2 days, so you can't even imagine how happy I am right now. I will further test my code, but the crashing is gone for now :DD

Glad to have been able to help.

Seems like this is resolved, or did I get it wrong? Otherwise I would close it if all is good.

This is resolved with my changes.