django / asgi_ipc

IPC-based ASGI channel layer

Cannot receive on channel after restarting. Bug?

danielniccoli opened this issue

I have an issue, and I am not sure whether it is by design: when my program restarts, it no longer receives any messages on a channel.

I prepared an example, which you can find further down. At the fifth iteration (the 30th in my original, longer version) I simulate the restart of the program by simply doing this:

channel_layer_receive = None
channel_layer_receive = asgi.IPCChannelLayer(prefix="my_prefix")

After that the script keeps printing (None, None) although it is still sending on the other channel layer.

Is that by design or a bug?

Example

import asgi_ipc as asgi

channel_layer_receive = asgi.IPCChannelLayer(prefix="my_prefix")
channel_layer_send = asgi.IPCChannelLayer(prefix="my_prefix")

i = 0

while i < 10:
    i += 1
    msg = "Message %s" % i
    try:
        channel_layer_send.send("my_channel", {"text": msg})
        print("Sending %s" % msg)
    except asgi.BaseChannelLayer.ChannelFull:
        print("Dropped %s" % msg)

    # Poll for whatever is waiting on the channel.
    print(channel_layer_receive.receive(["my_channel"]))

    if i == 5:
        # Simulate the restart by replacing the receiving channel layer.
        channel_layer_receive = None
        channel_layer_receive = asgi.IPCChannelLayer(prefix="my_prefix")

print("Done!")

Output

Sending Message 1
('my_channel', {'text': 'Message 1'})
Sending Message 2
('my_channel', {'text': 'Message 2'})
Sending Message 3
('my_channel', {'text': 'Message 3'})
Sending Message 4
('my_channel', {'text': 'Message 4'})
Sending Message 5
('my_channel', {'text': 'Message 5'})
Sending Message 6
(None, None)
Sending Message 7
(None, None)
Sending Message 8
(None, None)
Sending Message 9
(None, None)
Sending Message 10
(None, None)
Done!
Exception ignored in: <bound method MemoryDict.__del__ of <asgi_ipc.MemoryDict object at 0x7f59a875c390>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/asgi_ipc.py", line 311, in __del__
posix_ipc.ExistentialError: No shared memory exists with the specified name
Exception ignored in: <bound method MemoryDict.__del__ of <asgi_ipc.MemoryDict object at 0x7f59a88a6748>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/asgi_ipc.py", line 311, in __del__
posix_ipc.ExistentialError: No shared memory exists with the specified name

This is not by design, so it's probably a bug; while channel layers are technically allowed to drop messages, dropping the 70 left in the queue is a bit much.

I would recommend using the Redis channel layer if you want something more reliable; it's much more proven.
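
For reference, switching to the Redis layer is mostly a one-line change, since it exposes the same send()/receive() API; a minimal sketch, assuming the asgi_redis package and a Redis server on the default local port:

import asgi_redis

# Assumes a Redis server reachable at localhost:6379.
channel_layer = asgi_redis.RedisChannelLayer(hosts=[("localhost", 6379)])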

I only have a small project on one machine, and Redis would be overhead I'd like to avoid. Thanks for the hint, though!

Oh, and I was not dropping 70 messages. I edited/condensed the code to make it more obvious what is happening. I also get an exception at the end; I'm not sure why. When I step through the code line by line I don't get one. Maybe an issue with execution speed or something?

The issue is caused by this call in asgi_ipc:

self.shm.unlink()

unlink() marks the shared memory for destruction once all processes have unmapped it.
Source: http://semanchuk.com/philip/posix_ipc/

I'm not exactly sure how this works here, because within a single process we have two SharedMemory objects that map the same shared memory. However, if I unlink() one of the objects, the shared memory gets destroyed even though the second object still has it mapped.

Now this happens, minus the part where it says "after the last shm_unlink()":

"Even if the object continues to exist after the last shm_unlink(), reuse of the name shall subsequently cause shm_open() to behave as if no shared memory object of this name exists (that is, shm_open() will fail if O_CREAT is not set, or will create a new shared memory object if O_CREAT is set)."
Source: http://www.opengroup.org/onlinepubs/009695399/functions/shm_unlink.html

The quick fix is to simply not call unlink(), but then the shared memory needs to be unlinked manually by calling posix_ipc.unlink_shared_memory(name).
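
A sketch of what that manual cleanup could look like, e.g. at shutdown; the segment name here is illustrative, since asgi_ipc derives its real names from the prefix:

import posix_ipc

def cleanup_segment(name):
    # Remove the named segment once all processes are done with it.
    # Harmless if the name was already unlinked by someone else.
    try:
        posix_ipc.unlink_shared_memory(name)
    except posix_ipc.ExistentialError:
        pass

cleanup_segment("/my_prefix_demo")  # illustrative name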

Initially I had the issue with send() and receive() being in two separate scripts/processes. I will have to test whether the issue is actually the same.
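
Something like these two scripts should exercise the cross-process case (timing and message count are arbitrary):

# sender.py -- run in one terminal
import time
import asgi_ipc as asgi

layer = asgi.IPCChannelLayer(prefix="my_prefix")
for i in range(1, 11):
    # May raise ChannelFull if the receiver stays down too long.
    layer.send("my_channel", {"text": "Message %s" % i})
    time.sleep(0.5)

# receiver.py -- run in a second terminal; restart it mid-stream
import time
import asgi_ipc as asgi

layer = asgi.IPCChannelLayer(prefix="my_prefix")
while True:
    channel, message = layer.receive(["my_channel"])
    if message is not None:
        print(channel, message["text"])
    time.sleep(0.1)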

Urgh, yes, that seems to be what this is; it behaves differently with two layers inside one process versus two in separate processes.

I'm still very much tempted to try out an SQLite-based backend as a replacement for this shared-memory approach, given that the performance testing we did showed it to be surprisingly slow.
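
The rough shape of such a backend might look like this (purely an illustrative sketch, not a committed design; the file path and schema are invented). Because the state lives in a file rather than in anonymous shared memory, it would survive process restarts by construction:

import json
import sqlite3
import time

# One SQLite file as the shared medium, one table as the queue.
conn = sqlite3.connect("/tmp/asgi_sqlite_demo.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS messages ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "channel TEXT, payload TEXT, expiry REAL)"
)

def send(channel, message, expiry=60):
    conn.execute(
        "INSERT INTO messages (channel, payload, expiry) VALUES (?, ?, ?)",
        (channel, json.dumps(message), time.time() + expiry),
    )
    conn.commit()

def receive(channels):
    # Pop the oldest unexpired message on any of the given channels.
    placeholders = ", ".join("?" * len(channels))
    row = conn.execute(
        "SELECT id, channel, payload FROM messages "
        "WHERE channel IN (%s) AND expiry > ? "
        "ORDER BY id LIMIT 1" % placeholders,
        tuple(channels) + (time.time(),),
    ).fetchone()
    if row is None:
        return None, None
    conn.execute("DELETE FROM messages WHERE id = ?", (row[0],))
    conn.commit()
    return row[1], json.loads(row[2])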