Use of yield

Adam5Wu opened this issue 6 years ago · comments

Thanks a lot for your continued effort in improving this port!

The recent commit introduced yield() to prevent WDT resets. However, this prevents use of the library in contexts other than g_cont, such as an lwIP callback (AsyncTCP).

I think there are two alternatives that are more compatible with diverse execution contexts. Could you please consider them?

system_soft_wdt_feed
This one feeds the dog so it won't bark.
optimistic_yield
This one only yields when running in g_cont, and won't cause a panic in other contexts.

Earle F. Philhower, III · Answer 1 · Sat Mar 17 2018 09:22:49 GMT+0800 (China Standard Time)

If I use optimistic_yield instead of yield, will that cause WDT trouble for AsyncTCP? The EC key exchange, even at 160MHz, can take over (WDT_timeout)ms in BearSSL library code.

Earle F. Philhower, III · Answer 2 · Sat Mar 17 2018 09:47:51 GMT+0800 (China Standard Time)

The latest push has optimistic_yield() in it. Seems fine in main Arduino code, let me know if you need some other logic to support AsyncTCP.

Zhenyu Wu · Answer 3 · Sun Mar 18 2018 04:05:14 GMT+0800 (China Standard Time)

Thanks a lot! While using optimistic_yield prevents the immediate panic in callback context, I found that, as you described, sufficiently heavy computation sometimes triggers a watchdog reset.

The amount of computation seems to depend on the server. tls.mbed.org seems to cause the heaviest computation. I haven't observed the reset problem with other sites (but I haven't tried many, either).

Adding system_soft_wdt_feed(); before the "Run formulas" loop seems to alleviate the reset problem. So it seems both optimistic_yield and system_soft_wdt_feed are needed. Technically, it is an either-or: if yielding is possible, do that; otherwise, feed the dog.

So maybe the following block can be used instead?

if (cont_can_yield(&g_cont)) {
  yield();
} else {
  system_soft_wdt_feed();
}

However, even after that, I still cannot establish a connection. tcpdump reveals that the server uses a very short TCP retry interval and timeout during the handshake: around 2 seconds between resends, and 3 resends before it closes the connection.

Due to the nature of AsyncTCP, all data is handled in the lwIP callback context, which means that while the heavy computation is running, the TCP layer cannot ACK fast enough...

I will try to implement a workaround: queue all data received during the handshake and process it in a separate timer callback context, and see if that helps.

Zhenyu Wu · Answer 4 · Sun Mar 18 2018 04:11:49 GMT+0800 (China Standard Time)

BTW, I think you can move the yield block outside of the "Run formulas" loop.
My tracing shows that it is not the loop itself that runs too long. Each execution of run_code() finishes fairly fast, but it runs ~100 times and the total is too long.
So you can safely reduce the check frequency and reduce some overhead. 😄
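For illustration only, here is a minimal off-target sketch of that idea: the yield-or-feed check runs once per run_code() call rather than on every loop iteration. The cont_t struct, g_cont, the counters, and the bodies of cont_can_yield(), yield(), system_soft_wdt_feed() and run_code() are hypothetical stand-ins so the sketch compiles on a PC; they are not the ESP8266 core or BearSSL implementations.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the ESP8266 core/SDK symbols named in this thread.
// On the device, these would be replaced by the real ones.
struct cont_t { bool in_cont; };
cont_t g_cont = { true };
int yield_calls = 0;
int feed_calls = 0;

bool cont_can_yield(cont_t *c) { return c->in_cont; }

// Mimics the panic: yielding is only legal inside the main continuation.
void yield() { assert(g_cont.in_cont); ++yield_calls; }

void system_soft_wdt_feed() { ++feed_calls; }

// Yield if the current context allows it, otherwise just feed the watchdog.
void yield_or_feed() {
  if (cont_can_yield(&g_cont)) {
    yield();
  } else {
    system_soft_wdt_feed();
  }
}

// Hypothetical stand-in for run_code(): each call is short, but it is called many times.
unsigned run_code(std::size_t steps) {
  yield_or_feed();  // checked once per call, outside the loop
  unsigned acc = 0;
  for (std::size_t i = 0; i < steps; ++i) {
    acc = acc * 31u + static_cast<unsigned>(i);  // placeholder for one "formula" step
  }
  return acc;
}
```

With this layout, 100 calls to run_code() from a callback context would feed the watchdog 100 times in total, regardless of how many steps each call runs.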

Zhenyu Wu · Answer 5 · Sun Mar 18 2018 10:55:50 GMT+0800 (China Standard Time)

LOL it turned out my problem is CPU power, after all.

I have successfully implemented an offloaded handshake in AsyncTCP, and I confirmed that packets during the handshake are acknowledged as fast as possible, but the connection was still cut (from the server side) before the handshake could complete.

Then I tried to run at 160MHz, and the connection went right through. 😄

So apparently some servers are simply impossible to connect to with the default 80MHz clock.

There are some hardware-accelerated solutions, such as the ATECC508A / ATECC608A.
The BearSSL code base seems well written and very modular, so I guess it won't be hard to make use of those.
I've put it on my backlog and will look into it when I get time, someday... :P

Earle F. Philhower, III · Answer 6 · Sun Mar 18 2018 12:30:07 GMT+0800 (China Standard Time)

You found the same thing I did while testing. At 80MHz tls.mbed.org times out from the server at around 5 seconds.

The workaround, if you really want to run at 80MHz, is to drop the EC key exchange algorithms from the enabled suites in the br_ssl_engine_set_suites call, or move the RSA ones above all the EC ones so they get called preferentially.
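As a rough sketch of that workaround, the helper below either drops the EC suites from a suite list or moves the RSA key exchange suites ahead of them while keeping the relative order within each group. The helper names (uses_ec_key_exchange, prefer_rsa) are hypothetical, and the predicate only knows the four EC suites listed here, so it is an assumption that your enabled list uses no others. The numeric values are the IANA TLS cipher suite code points.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// IANA TLS cipher suite code points.
constexpr uint16_t TLS_RSA_WITH_AES_128_GCM_SHA256         = 0x009C;
constexpr uint16_t TLS_RSA_WITH_AES_256_GCM_SHA384         = 0x009D;
constexpr uint16_t TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 = 0xC02B;
constexpr uint16_t TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 = 0xC02C;
constexpr uint16_t TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256   = 0xC02F;
constexpr uint16_t TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384   = 0xC030;

// Hypothetical predicate: true for the EC key exchange suites known to this sketch only.
bool uses_ec_key_exchange(uint16_t suite) {
  switch (suite) {
    case TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256:
    case TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384:
    case TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256:
    case TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384:
      return true;
    default:
      return false;
  }
}

// Hypothetical helper: drop EC suites, or move non-EC suites to the front.
std::vector<uint16_t> prefer_rsa(std::vector<uint16_t> suites, bool drop_ec) {
  const std::size_t before = suites.size();
  if (drop_ec) {
    suites.erase(std::remove_if(suites.begin(), suites.end(), uses_ec_key_exchange),
                 suites.end());
  } else {
    // stable_partition keeps the original order inside each group.
    std::stable_partition(suites.begin(), suites.end(),
                          [](uint16_t s) { return !uses_ec_key_exchange(s); });
    assert(suites.size() == before);  // reordering must not lose entries
  }
  return suites;
}
```

The resulting list would then be handed to BearSSL as a pointer and a count, for example br_ssl_engine_set_suites(&ctx.eng, list.data(), list.size()), where ctx is assumed to be an already initialised client context.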

Earle F. Philhower, III · Answer 7 · Tue May 15 2018 11:05:00 GMT+0800 (China Standard Time)

I think this is good now. I'll open a separate issue to track the memory allocation changes needed to support AsyncTCP.