python: Race condition with client.subscriptions_by_channel
AndrewNolte opened this issue
Description
It seems like there needs to be a lock around some of these functions. I think a channel was added or removed while looping through the channels that needed to be broadcast.
[ERROR] 2022-02-14 16:53:57.092 [:7]: Set changed size during iteration
[ERROR] 2022-02-14 16:53:57.096 [:7]: Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_gerdmrts/runfiles/x/pmx/rover/app/foxglove_bridge/foxglove_bridge.py", line 59, in broadcast
    packet.SerializeToString(),
  File "/tmp/Bazel.runfiles_gerdmrts/runfiles/x/pmx/rover/app/foxglove_bridge/foxglove_websocket/server.py", line 170, in send_message
    for sub_id in subs:
RuntimeError: Set changed size during iteration
Version: Latest (the line numbers are off because of our formatter).
Platform: Ubuntu 20.04, Python 3.7 (back-ported types)
Steps To Reproduce
It can probably be reproduced by having one thread broadcast data on a channel, while another repeatedly adds and deletes that channel. It's the first time I've seen this race condition after using foxglove for a couple weeks.
Expected Behavior
No runtime error.
Actual Behavior
A rare race condition that raises RuntimeError: Set changed size during iteration.
Hi, can you please share some of your threading code? The server is currently not written to be thread-safe, and you will need some kind of synchronization. One way I'd recommend is to use asyncio's call_soon_threadsafe method. Example usage of call_soon_threadsafe with a queue: https://gist.github.com/jtbandes/c00f01a6d156a223cfd0f409a52f87db
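For readers who can't open the gist, here is a minimal sketch of the general pattern, not a copy of the linked code. A worker thread hands items to the event loop through an asyncio.Queue using call_soon_threadsafe, so everything that would touch the server runs on the loop's own thread. The names producer and run_demo and the command strings are made up for illustration; in a real bridge the consumer would call the server's methods instead of appending to a list.

```python
import asyncio
import threading


def producer(loop, queue, items):
    # Runs in a worker thread. asyncio.Queue is not thread-safe, so every
    # put is scheduled onto the event loop thread instead of called directly.
    for item in items:
        loop.call_soon_threadsafe(queue.put_nowait, item)
    # None is a hypothetical sentinel that tells the consumer to stop.
    loop.call_soon_threadsafe(queue.put_nowait, None)


def run_demo():
    processed = []

    async def main():
        loop = asyncio.get_running_loop()
        queue = asyncio.Queue()
        worker = threading.Thread(
            target=producer, args=(loop, queue, ["add:/a", "remove:/a"])
        )
        worker.start()
        while True:
            item = await queue.get()
            if item is None:
                break
            # Placeholder: a real consumer would add/remove channels or
            # send messages here, safely on the event loop thread.
            processed.append(item)
        worker.join()

    asyncio.run(main())
    return processed
```

Because call_soon_threadsafe schedules callbacks in FIFO order, the consumer sees the items in the order the worker thread produced them.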
I do agree there's a possible bug, since the server has some awaits inside these for loops. But if you have sample code / steps to reproduce that would be helpful!
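To illustrate why an await inside a for loop over a set can be a problem even without threads, here is a small standalone sketch. It does not use the server's code: broadcast, unsubscribe, and reproduce are hypothetical stand-ins, and asyncio.sleep(0) stands in for an awaited websocket send.

```python
import asyncio


async def broadcast(subs):
    # Iterates the live set. Each await hands control back to the event
    # loop, so other coroutines can mutate the set mid-iteration.
    for sub_id in subs:
        await asyncio.sleep(0)  # stand-in for an awaited send


async def unsubscribe(subs, sub_id):
    subs.discard(sub_id)


async def reproduce():
    subs = {1, 2, 3}
    task = asyncio.ensure_future(broadcast(subs))
    await asyncio.sleep(0)      # let broadcast start and suspend in its loop
    await unsubscribe(subs, 2)  # mutate the set while broadcast is suspended
    try:
        await task
    except RuntimeError as exc:
        return str(exc)
    return None
```

Running asyncio.run(reproduce()) returns the same message as in the report, which suggests the error does not strictly require a second thread.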
Ok, thanks for the clarification! I just added locks on the calling side. My code is set up with callbacks that add/remove channels on foxglove when topics are added/deleted on our end, plus a loop that broadcasts cached messages at a fixed interval.
> I have call backs for when topics are added/deleted on our end to add/remove channels on foxglove
Yep, assuming these callbacks are happening in a separate thread from your async with FoxgloveServer block, you will need to do something like call_soon_threadsafe to add/remove the channels from the server thread. Maybe we can add some automatic validation that the methods are being called from the correct thread to help avoid these issues.
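One way such validation could look, purely as a sketch: remember which thread created the object and raise if a method is later called from a different one. ThreadAffinityGuard and call_from_other_thread are hypothetical names, and nothing here is part of the foxglove_websocket API.

```python
import threading


class ThreadAffinityGuard:
    """Hypothetical helper: records the creating thread, rejects calls from others."""

    def __init__(self):
        self._owner = threading.get_ident()

    def check(self):
        if threading.get_ident() != self._owner:
            raise RuntimeError(
                "called from a thread other than the one that owns the server"
            )


def call_from_other_thread(guard):
    # Calls guard.check() on a second thread and collects any error raised.
    errors = []

    def target():
        try:
            guard.check()
        except RuntimeError as exc:
            errors.append(str(exc))

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return errors
```

A server class could call check() at the top of each public method, so misuse from another thread fails immediately with a clear message rather than as a rare error deep inside a loop.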
I'm having similar issues. Do either of you have some concrete implementation of this in code anywhere?
Just created an example threaded server: #42
Let me know if this example is helpful!
This makes perfect sense. Very comprehensive example, thank you!
Using the example and a small client script I was also able to reproduce the Set changed size during iteration issue, so I'll put up a fix for that soon.
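The thread doesn't say what the fix will be. One common way to avoid this class of error, shown here only as a sketch under that assumption, is to iterate over a snapshot of the set and skip entries that have been removed since. send_to_subscribers and demo are hypothetical names, and asyncio.sleep(0) again stands in for an awaited send.

```python
import asyncio


async def send_to_subscribers(subs, sent):
    # Iterate over a snapshot so that changes made to the live set while
    # this coroutine is suspended cannot invalidate the iterator.
    for sub_id in list(subs):
        if sub_id not in subs:
            continue  # subscription was removed while we were suspended
        await asyncio.sleep(0)  # stand-in for an awaited send
        sent.append(sub_id)


async def demo():
    subs = {1, 2, 3}
    sent = []
    task = asyncio.ensure_future(send_to_subscribers(subs, sent))
    await asyncio.sleep(0)  # let the sender start and suspend
    subs.discard(2)         # remove a subscription mid-broadcast
    await task
    return sent
```

The snapshot costs one copy per broadcast. The membership check only narrows the window: a subscription removed during the awaited send itself would still receive that one message.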