segmentio / analytics-node

The hassle-free way to integrate analytics into any node application.

Home Page: https://segment.com/libraries/node

Issue using analytics within Lambda functions

mikeblanton opened this issue

@pbassut can you confirm what specifically needs to be done to take advantage of this? We're working with one of your SAs, and he pointed us to this fix to help with some visibility problems we're having in Lambda (Node 12.x). We're triggering events from Lambdas that process DynamoDB streams. I've got flushAt set to 1, and I'm explicitly awaiting the call. I can see in Lumigo that the call to Segment goes out, and I can see in my logs that the flush call completes, but I don't see confirmation in the logs (or in Lumigo) that the calls actually finish.

For example, I've got these 2 helper functions:

module.exports.flush = async (logger) => {
  await analytics.flush(function (err, batch) {
    if (logger) {
      if (err) {
        logger.error(err);
        return;
      }
      logger.debug('Segment cache flushed', {batch});
    }
  });
}

module.exports.track = async (payload, logger) => {
  analytics.track(payload, function (err, batch) {
    if (logger) {
      if (err) {
        logger.error(err, payload);
        return;
      }
      logger.debug('Batch flushed', {batch, payload});
    }
  });
  if (logger) {
    logger.debug('Segment track call sent', {payload});
  }
}

In my logs, I see...

  • Segment track call sent
  • Segment cache flushed

But I never see Batch flushed.

Incidentally, I wound up wrapping the analytics call in a promise and setting flushAt to 1.

module.exports.flush = async (logger) => {
  return new Promise((resolve, reject) => {
    logger.debug('Flushing segment cache');
    analytics.flush(function(err, batch) {
      if (err) {
        logger.error(err);
        return reject(err);
      }

      logger.debug('Segment cache flushed', {batch});
      resolve();
    });
  });
}
commented

This is due to #309

flush does not guarantee that all inflight messages are sent before calling the given callback. Instead, flush simply sends a batch of queued messages and waits for only that batch's response before calling the callback. ... This is contrary to the common expectation that a "flush" function completely empties the buffer to the destination. Further, this means there is no way to know when all queued and all inflight messages are fully sent (fully flushed).

So neither your initial code nor your follow-up code will do what you want. Currently, the only way to be sure that all messages have been sent is to use the callbacks for every identify/track/etc. call.

Also, be careful of #308 and #310. I recommend using promises to de-bounce the callbacks.
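
As a rough illustration of the per-call callback approach, here is a minimal sketch. The helper names (callAsPromise, createTracker, drain) are hypothetical and not part of analytics-node; the only thing assumed about the client is that track accepts (payload, callback), as in the snippets above. It also assumes something actually triggers the send (for example flushAt set to 1), and it uses Promise.allSettled, which needs Node 12.9 or later.

```javascript
// Hypothetical helper: wrap one callback-style analytics call in a promise.
// A promise settles only once, so a callback that fires more than once
// is effectively de-bounced.
function callAsPromise(client, method, payload) {
  return new Promise((resolve, reject) => {
    client[method](payload, (err, batch) => {
      if (err) return reject(err);
      resolve(batch);
    });
  });
}

// Hypothetical tracker that remembers every call that has not settled yet.
function createTracker(client) {
  const pending = new Set();
  return {
    track(payload) {
      const p = callAsPromise(client, 'track', payload);
      pending.add(p);
      // Forget the call once it settles, whether it succeeded or failed.
      const done = () => pending.delete(p);
      p.then(done, done);
      return p;
    },
    // Resolves once every call made so far has settled.
    drain() {
      return Promise.allSettled([...pending]);
    },
  };
}
```

In a Lambda handler, the idea would be to call tracker.track(...) for each event and then await tracker.drain() before returning, so the function is not frozen while requests are still in flight.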

commented

Realized today there is another option: disabling all automatic flush triggers like so:

segmentClient.flushAt = Infinity;
segmentClient.flushInterval = false;
segmentClient.maxQueueSize = Infinity;
segmentClient.flushed = true;

Note that some of these cannot or should not be passed as options to the constructor, as they will be ignored or overridden with defaults for falsy values. They are not exposed by the type definitions either, so they are likely not meant to be part of the public interface and can break in future releases without notice.

Now only manually calling flush() will flush the queue. You can then either serialize flushes or track all flush callbacks to ensure completion.
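
One way to serialize flushes is sketched below. The names flushAsPromise and createSerialFlusher are hypothetical; the sketch only assumes that client.flush accepts a node-style (err, batch) callback, as in the snippets earlier in this thread. It relies on the settings above having disabled the automatic triggers, and it does not touch any other client internals.

```javascript
// Hypothetical helper: promise wrapper around the callback-style flush.
function flushAsPromise(client) {
  return new Promise((resolve, reject) => {
    client.flush((err, batch) => (err ? reject(err) : resolve(batch)));
  });
}

// Hypothetical serializer: each flush starts only after the previous one
// has finished, so awaiting the latest call means no flush is in flight.
function createSerialFlusher(client) {
  let chain = Promise.resolve();
  return function flush() {
    const next = chain.then(() => flushAsPromise(client));
    // Keep the chain usable even if this flush fails; the caller still
    // sees the failure through the returned promise.
    chain = next.catch(() => {});
    return next;
  };
}
```

A handler would create the flusher once next to the client and then await flush() at the end of each invocation.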

Please check out the solution proposed here: #309 (comment)