valeriansaliou / node-sonic-channel

🦉 Sonic Channel integration for Node. Used in pair with Sonic, the fast, lightweight and schema-less search backend.

Home Page: https://www.npmjs.com/package/sonic-channel

[Help] Errors when ingesting many objects (~2000) in bursts

sladkoff opened this issue

I have a relatively simple use case and not a lot of data:

  • There's a scheduled task that runs every x hours to index some objects in Sonic
  • It should remove any outdated data and re-insert the latest state of the objects
  • The procedure looks something like this:
    // flush the whole collection to get rid of potentially deleted objects
    await ingest.flushc(collection)
    
    // ...
    
    // for all our entities
    for (const entity of entities) {
      // remove the object (redundant with the flushc call above; could be skipped)
      await ingest.flusho(collection, bucket, entity.id)
      // re-insert the object with latest data
      await ingest.push(collection, bucket, entity.id, entity.text)
    }

This looks fine to me in theory, but in practice a lot of the data that my application code apparently pushes is missing from the Sonic index. My investigation so far has led me to some errors thrown by node-sonic-channel (see the questions below), so I'm opening this issue here.

Question(s) 1

I'm a little lost on how to use the Ingest connection over time.

In the above scenario, I'm iterating over n entries and doing work that might take some time. Should I open one connection for the whole procedure? As I understand it, the connection can be closed by a timeout or for other reasons. Is there a recipe for reconnecting and resuming such a batch procedure when the connection is closed? Or is it safer to open and close a connection per ingest.push call? For context, my current setup is sketched below.
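
Right now I open one long-lived connection for the whole run, roughly like this (a sketch using the library's documented connect() handlers; host, port and auth are placeholder values):

    const SonicChannelIngest = require("sonic-channel").Ingest;

    // One long-lived Ingest connection, shared by the whole batch run
    const ingest = new SonicChannelIngest({
      host : "::1",            // placeholder
      port : 1491,             // placeholder
      auth : "SecretPassword"  // placeholder
    }).connect({
      connected()    { /* ready: safe to start the batch */ },
      disconnected() { /* connection lost: further operations get queued offline */ },
      timeout()      { /* connection timed out */ },
      retrying()     { /* the library is attempting to reconnect */ },
      error(error)   { /* could not connect */ }
    });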

Question 2

What does this error mean and what should I do to prevent it?

Error: Offline stack is full, cannot stack more operations until Sonic Channel connection is restored (maximum size set to: 500 entries)

Probably comes from here.

Question 3

What does this error mean and what should I do to prevent it?

channel closed

Probably comes from here.


All of this seems like a very simple use case, so I'm assuming I'm doing something very wrong. I'd appreciate some help. Thanks!

Sonic has a backpressure safety mechanism, which is basically a kill-switch for when there are WAY too many operations pending on the server side: it aborts the flooding client connection. This happens on the Sonic side and cannot be changed, as changing it would imply increasing the network-related buffers.

Now, node-sonic-channel also has a backpressure management algorithm (look for this.__emitQueue), which internally queues pending tasks... until there are too many, in order to protect the running Node.js process (node-sonic-channel might be running in a process shared with e.g. an HTTP server). You can change this limit by increasing the emitQueueMaxSize option when constructing the Sonic Channel client.
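
For instance (a sketch; 4000 is an arbitrary value, size it to your bursts, and host, port and auth are placeholders):

    const SonicChannelIngest = require("sonic-channel").Ingest;

    const ingest = new SonicChannelIngest({
      host : "::1",            // placeholder
      port : 1491,             // placeholder
      auth : "SecretPassword", // placeholder

      // Raise the internal emit queue limit so a burst of ~2000 queued
      // operations does not overflow it while the connection is down
      emitQueueMaxSize : 4000
    }).connect({
      connected : function() {},
      error     : function() {}
    });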

Ok, thanks for the insight. We will continue using one connection per push for now, because otherwise we would need to implement some sort of throttling on our side so as not to hit the lib's limits. This makes our process a bit slower, but the implementation is simpler; roughly, it looks like the sketch below.
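
A minimal sketch of the per-push wrapper (pushOne is our own hypothetical helper, not part of the library; host, port and auth are placeholders):

    const SonicChannelIngest = require("sonic-channel").Ingest;

    // Hypothetical helper: opens a connection, pushes one object, then closes.
    // Slower than a shared connection, but the emit queue never builds up.
    function pushOne(collection, bucket, id, text) {
      return new Promise((resolve, reject) => {
        const ingest = new SonicChannelIngest({
          host : "::1", port : 1491, auth : "SecretPassword" // placeholders
        }).connect({
          connected() {
            ingest
              .push(collection, bucket, id, text)
              .then(resolve, reject)
              .finally(() => ingest.close());
          },
          error : reject
        });
      });
    }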