Getting large memory leaks

Question

hyperh opened this issue 6 years ago

I'm getting some large memory leaks on my Express server with this library when doing server side rendering (SSR). Is anyone else getting the same thing?

Following this, I ran some tests with autocannon. https://www.npmjs.com/package/autocannon

const autocannon = require('autocannon');

autocannon({
  url: 'http://localhost:3000',
  connections: 100, // concurrent connections
  pipelining: 1,
  duration: 2 * 60 // seconds
});

The dotted blue line is when I took out raven-for-redux. As you can see, the memory leak seems to have disappeared after taking out raven-for-redux. The other tests are just attempted fixes, but they didn't have any effect.
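The thread does not say how memory was recorded during the load test. One possible way, sketched here as an assumption, is to sample `process.memoryUsage()` inside the server process while autocannon runs. The helper names `heapUsedMb` and `startHeapSampler` are hypothetical.

```javascript
// Convert the V8 heap size reported by Node into megabytes.
function heapUsedMb() {
  return process.memoryUsage().heapUsed / (1024 * 1024);
}

// Push one sample per interval into `samples`; returns a function that stops sampling.
function startHeapSampler(samples, intervalMs = 1000) {
  const timer = setInterval(() => {
    samples.push({ time: Date.now(), heapMb: heapUsedMb() });
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for sampling
  return () => clearInterval(timer);
}

const samples = [];
const stopSampling = startHeapSampler(samples);
// ... run the load test against the server, then:
stopSampling();
```

Plotting `heapMb` over `time` for each configuration would give curves comparable to the ones described above.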

Jordan Eldredge · Answer 1 · Thu Jan 04 2018 04:52:15 GMT+0800 (China Standard Time)

I’m not sure I fully understand all the different combinations you tested. Were you able to create a test that included Raven-for-redux but not raven-js?

Jordan Eldredge · Answer 2 · Thu Jan 04 2018 04:55:53 GMT+0800 (China Standard Time)

Put another way: what is leading you to think that the leak is in Raven-for-Redux and not Raven-js?

One possible cause is that the Raven-for-Redux middleware will hold on to a reference to the Raven instance which is passed to it, as well as a reference to the store.

Could you share the code that is being used to setup the middleware?

hyperh · Answer 3 · Thu Jan 04 2018 04:59:15 GMT+0800 (China Standard Time)

@captbaritone The other combinations aren't particularly useful in themselves, just left them there as a comparison against the effect of taking out raven-for-redux.

One possible cause is that the Raven-for-Redux middleware will hold on to a reference to the Raven instance which is passed to it, as well as a reference to the store.

I have my own Raven middleware as well (createRavenUserContextMiddleware) that also takes in an instance of Raven as an argument, but taking that out had no effect on the memory issue.

Store initialization code:

createRavenMiddleware is the line for raven-for-redux.

export default (preloadedState = null, history) => {
  const sagaMiddleware = createSagaMiddleware();
  const routerMiddleware = createRouterMiddleware(history);

  const composeEnhancers = getDevToolsEnhancer();
  const Raven = setupRaven();

  const store = createStore(
    rootReducer,
    preloadedState,
    composeEnhancers(
      applyMiddleware(
        sagaMiddleware,
        analyticsMiddleware,
        routerMiddleware,
        createActionBuffer(REHYDRATE),
        createRavenUserContextMiddleware(Raven),
        createRavenMiddleware(Raven, {
          stateTransformer,
          filterBreadcrumbActions
        })
      ),
      autoRehydrate({
        log: true
      })
    )
  );
  initSagas(store, sagaMiddleware);
  Raven.setUserContext({
    email: pathOr(null, ['user', 'email'], preloadedState)
  });

  const isBrowser = typeof window !== 'undefined';
  if (isBrowser) window.onbeforeunload = () => handleRefresh(store);

  return store;
};

hyperh · Answer 4 · Thu Jan 04 2018 05:01:26 GMT+0800 (China Standard Time)

@captbaritone

I’m not sure I fully understand all the different combinations you tested. Were you able to create a test that included Raven-for-redux but not raven-js?

Wouldn't I need an instance of raven-js to use raven-for-redux?

Jordan Eldredge · Answer 5 · Thu Jan 04 2018 05:09:59 GMT+0800 (China Standard Time)

You could try passing it a stub:

const Raven = {
    setDataCallback: () => {},
    captureBreadcrumb: () => {}
};

What does setupRaven currently do?

Are you using raven-js on your server, or raven (the Node version)?

hyperh · Answer 6 · Thu Jan 04 2018 05:32:10 GMT+0800 (China Standard Time)

@captbaritone

You could try passing it a stub:

Just tried passing in the Raven stub, the memory leak disappeared.

What does setupRaven currently do?

setupRaven code:

export default function setupRaven() {
  debug('setupRaven');
  const SENTRY_DSN_PUBLIC = isBrowser ? getDSN() : null;

  const VERSION =
    typeof window !== 'undefined'
      ? path(['__VERSION__'], window)
      : null;

  Raven.config(SENTRY_DSN_PUBLIC, {
    captureUnhandledRejections: true,
    release: VERSION,
    shouldSendCallback,
    autoBreadcrumbs: {
      console: true
    }
  }).install();
  return Raven;
}

Are you using raven-js on your server, or raven (the Node version)?

Using raven on server. However, this app is server side rendered, so the client side code (that uses raven-js) to initialize the Redux store (in order to retrieve the preloaded state) is also executed on a user navigating to my URL in the browser.

Jordan Eldredge · Answer 7 · Thu Jan 04 2018 05:39:51 GMT+0800 (China Standard Time)

Have you tested calling setupRaven and using your middleware, but not raven-for-redux?

hyperh · Answer 8 · Thu Jan 04 2018 05:48:09 GMT+0800 (China Standard Time)

@captbaritone Yup that's the blue dotted line in my graph. load-no-createRavenMiddleware

Jordan Eldredge · Answer 9 · Thu Jan 04 2018 06:01:24 GMT+0800 (China Standard Time)

Ah! I have an idea what it might be. raven-for-redux calls Raven.setDataCallback() and passes it a function that has the redux store bound into it. It also keeps a reference to the previous dataCallback so that it can also run any other callbacks that the user has provided.

This means that a new Redux store (and everything it references) is probably being retained for every request.

Could you try adding this to your setupRaven function:

Raven.setDataCallback(null);
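To illustrate the retention described above, here is a minimal sketch. It uses a hypothetical `fakeRaven` object, not the real library, and models only the behaviour described in this comment: each new data callback is wrapped so it can still reach the previous one. The `attachStore` helper is likewise a simplified assumption about what a per-store middleware does.

```javascript
// Hypothetical stand-in for the Raven singleton; only callback chaining is modelled.
const fakeRaven = {
  dataCallback: null,
  setDataCallback(callback) {
    const original = this.dataCallback;
    // Wrap the new callback so it can still call the previous one.
    this.dataCallback =
      typeof callback === 'function'
        ? data => callback(data, original)
        : callback;
  }
};

// Simplified version of what a per-store middleware does when it is created.
function attachStore(raven, store) {
  raven.setDataCallback((data, original) => {
    data.visitedStores.push(store.id); // this closure keeps `store` alive
    return original ? original(data) : data;
  });
}

// Simulate three SSR requests, each creating its own store.
for (let id = 1; id <= 3; id++) {
  attachStore(fakeRaven, { id });
}

// Every store is still reachable through the callback chain.
const leaked = fakeRaven.dataCallback({ visitedStores: [] }).visitedStores;
console.log(leaked);

// Resetting the callback first breaks the chain, so older stores can be collected.
fakeRaven.setDataCallback(null);
attachStore(fakeRaven, { id: 4 });
const afterReset = fakeRaven.dataCallback({ visitedStores: [] }).visitedStores;
console.log(afterReset);
```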

hyperh · Answer 10 · Thu Jan 04 2018 06:27:31 GMT+0800 (China Standard Time)

@captbaritone

Could you try adding this to your setupRaven function:

setupRaven looks like this now:

export default function setupRaven() {
  debug('setupRaven');
  const SENTRY_DSN_PUBLIC = isBrowser ? getDSN() : null;

  const VERSION =
    typeof window !== 'undefined'
      ? path(['__VERSION__'], window)
      : null;

  Raven.config(SENTRY_DSN_PUBLIC, {
    captureUnhandledRejections: true,
    release: VERSION,
    shouldSendCallback,
    autoBreadcrumbs: {
      console: true
    }
  }).install();

  Raven.setDataCallback(null); // ADDED THIS

  return Raven;
}

Dotted grey line is where I set the data callback as null. Seems to have resolved the memory issues! Should probably be addressed by the library itself though. Will you be publishing an update for this bug?

Thanks for the quick replies and your hard work on this library! I really appreciate it, totally saved me.

Jordan Eldredge · Answer 11 · Thu Jan 04 2018 07:24:13 GMT+0800 (China Standard Time)

Honestly, I think this is kinda your app's responsibility. As far as raven-for-redux is concerned, you are asking it to listen to infinite redux stores, and (in the case of any error) add those stores' states as context to a single Raven instance.

That said, raven-for-redux does not really offer any way to stop listening. I guess the question is: Should it?

This is something of a strange use case, since I don't think raven-js could even report an error in a node environment.

Theoretically each raven-for-redux middleware that is created could offer some way to dispose of itself, but I'm having a hard time coming up with a valid use case for unsubscribing a given store that would not be better solved by just not subscribing the store in the first place.

Thoughts?
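One way to read "not subscribing the store in the first place" is to skip the Raven middleware entirely when rendering on the server. The sketch below assumes that reading. `selectMiddlewares` and `fakeFactory` are hypothetical names, and the no-op middleware stands in for the real ones.

```javascript
// Hypothetical helper: only create the Raven middleware when running in the browser.
function selectMiddlewares(baseMiddlewares, createRavenMiddlewareForStore, isBrowser) {
  if (!isBrowser) {
    return [...baseMiddlewares];
  }
  return [...baseMiddlewares, createRavenMiddlewareForStore()];
}

let ravenMiddlewareCreations = 0;
const fakeFactory = () => {
  ravenMiddlewareCreations += 1;
  return store => next => action => next(action); // no-op middleware
};

const base = [store => next => action => next(action)];
const serverMiddlewares = selectMiddlewares(base, fakeFactory, false);
const browserMiddlewares = selectMiddlewares(base, fakeFactory, true);

console.log(serverMiddlewares.length, browserMiddlewares.length, ravenMiddlewareCreations);
```

The trade-off is that server-side errors would then carry no Redux context at all.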

hyperh · Answer 12 · Thu Jan 04 2018 07:27:55 GMT+0800 (China Standard Time)

@captbaritone Yea your explanation makes sense, I'm fine with handling it in app.

However, I don't think the use case is that uncommon, as React SSR is becoming more and more commonplace. The only reason raven-js gets run on my server is because of SSR.

Perhaps just add some documentation around this (potential) issue in case another user experiences it?

Jordan Eldredge · Answer 13 · Thu Jan 04 2018 07:30:32 GMT+0800 (China Standard Time)

Currently raven-for-redux assumes it will only ever be called once. Maybe we could do two things:

1. Throw a warning if the user calls it more than once.
2. Add some documentation about using it in a SSR context.
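A warning on repeated calls could be sketched roughly as follows. This is an illustration only, not the change that was eventually proposed. `warnOnRepeatedCalls` is a hypothetical wrapper, and the `warn` parameter is injectable purely so the behaviour can be observed.

```javascript
// Wrap a middleware factory so that every call after the first emits a warning.
function warnOnRepeatedCalls(createMiddleware, warn = console.warn) {
  let callCount = 0;
  return function wrappedCreateMiddleware(...args) {
    callCount += 1;
    if (callCount > 1) {
      warn(
        `createMiddleware called ${callCount} times; each call may retain another store.`
      );
    }
    return createMiddleware(...args);
  };
}

const warnings = [];
const guarded = warnOnRepeatedCalls(
  () => 'middleware',
  message => warnings.push(message)
);

guarded(); // first call: silent
guarded(); // second call: warns
console.log(warnings.length);
```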

Would you be willing to open PRs for either or both of those?

Jordan Eldredge · Answer 14 · Thu Jan 04 2018 07:31:08 GMT+0800 (China Standard Time)

Thanks for reporting this, by the way! Good bug 😀

hyperh · Answer 15 · Thu Jan 04 2018 07:31:55 GMT+0800 (China Standard Time)

@captbaritone Sure thing! I'll try to submit a PR for both 1 and 2. Thanks again for the great library!

Jordan Eldredge · Answer 16 · Thu Jan 04 2018 09:34:02 GMT+0800 (China Standard Time)

See #51 and #52. Let's start with #52.

Odin Ugedal · Answer 17 · Sat Jan 06 2018 01:08:24 GMT+0800 (China Standard Time)

Hi @captbaritone & @hyperh,

Isn't this an issue with the usage, and not the library? raven-js should not be used in node; there is a separate package called raven for node. Since raven-js runs in a browser it has a "global" context. SSR will most likely run async, and it doesn't make sense to use the same context for all SSR requests.

From the sentry docs:

Note: If you’re using Node.js on the server, you’ll need raven-node.

It isn't possible to report exceptions with raven-js in node. Running this in node will result in a crash (because of a browser-only API):

import Raven from 'raven-js';
[...]
Raven.captureException(err);

The node version of raven does not support all the same "commands" as the web version. We/I use a custom mapper object to pass a node version of raven to this middleware, making it work for both environments. Merging #54 will therefore warn about our intentional usage.

In my opinion, it is better to document how to use it (and how not to use it), instead of adding a warning like that. The documentation could say that the first argument should implement an interface consisting of captureBreadcrumb and setDataCallback, or be an object with those two attributes (needed for e.g. flow).
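The interface idea can be made concrete with a small duck-typing check. This is a sketch, assuming the two required methods are `captureBreadcrumb` and `setDataCallback` (the names used by the stub earlier in the thread). `missingRavenMethods` is a hypothetical helper, not part of any library.

```javascript
// Methods the middleware is assumed to need from whatever object it is given.
const REQUIRED_METHODS = ['captureBreadcrumb', 'setDataCallback'];

// Return the names of required methods that `candidate` does not implement.
function missingRavenMethods(candidate) {
  return REQUIRED_METHODS.filter(
    name => !candidate || typeof candidate[name] !== 'function'
  );
}

const completeStub = {
  setDataCallback: () => {},
  captureBreadcrumb: () => {}
};
const partialStub = { captureBreadcrumb: () => {} };

console.log(missingRavenMethods(completeStub));
console.log(missingRavenMethods(partialStub));
```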

Jordan Eldredge · Answer 18 · Sat Jan 06 2018 03:46:32 GMT+0800 (China Standard Time)

@odinuge Thanks for chiming in. I'm curious about your usage. Are you sharing a single (node) raven instance across requests? If so, you might have a similar memory leak to the one @hyperh uncovered. Additionally, the context that raven-for-redux would attach is probably not what you would actually want:

Breadcrumbs would include all actions dispatched by all stores across the lifecycle of the node process.
The "current state" would be the state of the most recently attached store, not necessarily the store that actually caused the error. (Maybe these will always be the same, I'm not up on node async stuff).

Odin Ugedal · Answer 19 · Tue Jan 09 2018 02:10:08 GMT+0800 (China Standard Time)

Hi @captbaritone

Here is a basic proof of concept (in code) of how it is possible to do it. https://gist.github.com/odinuge/3c9778e991621eb579d0d8ea5676365c
Haven't done much testing, but it looks like it works ok. 👍

There will be one raven instance, but with one "context" for each request. We use the async server express, so multiple requests can be executed simultaneously. raven (for node) and raven-js for browsers differ quite a bit in how they work, so it makes sense to have two different packages.

Here is a small benchmark:
Left: Implementation of the proof of concept above ⬆️
Right: Using raven-js

PS: The huge drop in the end is a forced gc. 😄

Jordan Eldredge · Answer 20 · Thu Jan 11 2018 01:17:15 GMT+0800 (China Standard Time)

@odinuge I would agree that this approach does avoid the memory leak, but I think it would have at least one confusing behavior:

Any exception would include every store's actions as breadcrumbs. It would be impossible to tell which breadcrumbs came from the current request's store.

Moreover, while this works, it requires pretty detailed knowledge of the inner workings of raven-js, raven and raven-for-redux in order to get right. And if you don't get it right, the failure mode could be very confusing and hard to discover (for example, the memory leak). Also, if the implementation of one of those libraries changes in the future, it could cause a subtle regression.

So, perhaps you are right. We should not simply disallow multiple calls. Instead we should think about how we can offer a reasonable out-of-the-box solution for SSR.

I think what we really want is a way to get a per-store (or "per-middleware") captureException function which you can use, as you have, to explicitly catch errors at a request level.

This should probably be a separate function from createRavenMiddleware, since it should not make any direct changes to the global Raven singleton.

I'll play with some things and see what I can come up with. If you have any ideas of how this pattern could be moved into raven-for-redux I would love to hear them.

Thanks again for the thought you've put into this!

Odin Ugedal · Answer 21 · Thu Jan 11 2018 02:14:00 GMT+0800 (China Standard Time)

Agree that we should think about how to make it an "easy" implementation that more or less works "out of the box". However, SSR is kinda hard. As you say, memory leaks are hard to find, and dealing with proper async is not easy (and I am certainly no expert, at least not in JavaScript and its strange scope and context).

Committing to a bad (or not that good) solution now would indeed limit the possibilities for this lib at a later stage.

My proof of concept does actually handle breadcrumbs properly. As you see on line 45, here: https://gist.github.com/odinuge/3c9778e991621eb579d0d8ea5676365c#file-raven-for-redux-ssr-example-js-L45, I define a context. Because of the "implicit binding" of the functions in the UniversalRavenNode class, it will add the breadcrumbs in the proper context. I did some testing with storing the breadcrumbs in a custom container, but that made it impossible to debug timeouts etc., since all breadcrumbs got the same timestamp as the report. Using that approach without a context may also create some strange behavior when multiple requests send Sentry reports at once (I have not dug that deep into the code, but I don't think they have that advanced an API).

I guess it should also be possible to pass the raven instance (or custom object) to other middlewares. We don't have to handle error boundaries, since errors occurring when rendering (in SSR) are thrown by the renderToString function directly (source: https://reactjs.org/docs/error-boundaries.html#introducing-error-boundaries).

I have never worked with multiple redux-stores, but that would also have some of the same issues/problems as SSR.

The raven implementation in node is quite powerful (more info about the context here). However, I agree that it would be hard to force everyone to understand both the raven-node and the raven-js implementation. It would also make the lib less interesting, since it would become a lot harder to add.

All in all, it is nice that you also have some thoughts about this, and want to add some SSR functionality to the project (or just docs, if that makes more sense in the end). I will keep thinking, and post when I get any good ideas.

And again, thanks for your work on this awesome project. Keep it up 😄

Jordan Eldredge · Answer 22 · Sat Jan 27 2018 12:13:35 GMT+0800 (China Standard Time)

I've thought about this a (tiny) bit more, and here is where I stand:

In a SSR context we would want the following things:

A captureException function which, when called, would include all standard raven-for-redux context from a given Redux store.
Completely self contained (does not attach any references to the store to the Raven singleton)
Does not attach any context to uncaught exceptions (caught outside of the explicit captureException)

This should be plenty doable, the only challenge I see is the Breadcrumbs. Since the breadcrumbs are logged directly on the singleton, I think we will have to re-implement the breadcrumb capturing logic (adding the timestamp, enforcing the max length) and manually attach them to the exception inside our captureException. Maybe we could be super clever and merge these scoped breadcrumbs with any that happen to already exist from the global scope.
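The re-implemented breadcrumb capture mentioned here (timestamping plus a maximum length) could look roughly like the sketch below. It is an illustration under assumptions: `createBreadcrumbBuffer` is a hypothetical name, the default limit of 100 and the timestamp in seconds are guesses at sensible defaults, and the clock is injectable only to make the behaviour easy to check.

```javascript
// A per-store breadcrumb buffer that timestamps entries and drops the oldest
// ones once `maxLength` is exceeded.
function createBreadcrumbBuffer(maxLength = 100, now = Date.now) {
  const crumbs = [];
  return {
    add(crumb) {
      // A timestamp supplied by the caller takes precedence over the default.
      crumbs.push({ timestamp: now() / 1000, ...crumb });
      if (crumbs.length > maxLength) {
        crumbs.shift(); // discard the oldest breadcrumb
      }
    },
    all() {
      return crumbs.slice(); // copy, so callers cannot mutate the buffer
    }
  };
}

const buffer = createBreadcrumbBuffer(3, () => 5000);
['A', 'B', 'C', 'D', 'E'].forEach(type =>
  buffer.add({ category: 'redux-action', message: type })
);

console.log(buffer.all().map(crumb => crumb.message));
```

A scoped captureException could then attach `buffer.all()` to the outgoing event instead of relying on breadcrumbs stored on the singleton.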

@odinuge What do you think of this solution?

Odin Ugedal · Answer 23 · Mon Jan 29 2018 03:52:55 GMT+0800 (China Standard Time)

Hi,

That sounds like a good idea. The lack of breadcrumbs will however be a huge miss, at least for me. As I understand the source code of raven, it is impossible to add breadcrumbs with a custom timestamp (or custom breadcrumbs at all when reporting). I have however investigated some more, and I have found a solution almost like yours. It has working breadcrumbs, and should solve most of the problems discussed. It is however not perfect yet 😆

https://gist.github.com/odinuge/c38d3656ed52aef2cf3f7a049ad27dab

Jordan Eldredge · Answer 24 · Mon Jan 29 2018 06:42:58 GMT+0800 (China Standard Time)

@odinuge Thanks for your thoughts on this.

One problem I see with your solution is that breadcrumbs from all the different requests are attached to the global Raven, so a logged exception might include action breadcrumbs from an unrelated request/Redux store. As you said though, I think it's impossible without being able to set the dataCallback.

That said, I think I have a working prototype which would allow the problem to be solved entirely within raven-for-redux.

We add an additional global option, which specifies whether the Redux context (state, last action, breadcrumbs) should be attached to all uncaught ("global") exceptions.
We add a captureException(e) method to the middleware (a function that has a method attached to it is a bit weird, but... this is JavaScript, and I couldn't think of a cleaner way to give the user access to both). Independent of the global option, this will log an exception with the Redux context.

For SSR, you would set the global option to false and explicitly capture exceptions raised by the rendering of that page via middleware.captureException(e).

Browser environments would continue to work exactly as they had before since global would default to true.

This change requires:

1. Tracking middleware breadcrumbs manually outside of Raven and merging them back in with the natively logged breadcrumbs right before we send/log the exception. This is possible inside that dataCallback.
2. Finding a way to dynamically change the data callback without introducing a memory leak.
3. Introducing a WeakSet, which I'm not sure how to do in a library, where I can't control which polyfills are present.

For number 2, I have found the following solution:

Define a _dataCallback variable in the global scope. When a middleware is created, check to see if we've already set dataCallback on the Raven instance which was passed (this is where the WeakSet comes in). If not, we set the dataCallback to a function which closes over _dataCallback. It would look something like this:

const attachedRavens = new WeakSet();
let _dataCallback = null;

const createMiddleware = (Raven, options) => {
  // Install the shared data callback only once per Raven instance.
  if (!attachedRavens.has(Raven)) {
    Raven.setDataCallback((data, original) => {
      // Delegate to whichever middleware-specific callback is currently active.
      data = _dataCallback ? _dataCallback(data) : data;
      return original ? original(data) : data;
    });
    attachedRavens.add(Raven);
  }
};

Then we can change _dataCallback dynamically depending upon which middleware's context we want to capture. For example, our middleware.captureException method might look like:

middleware.captureException = e => {
  const original = _dataCallback;
  _dataCallback = middlewareDataCallback; // The middleware-specific callback
  try {
    Raven.captureException(e);
  } finally {
    // Restore the previous callback even if capturing throws.
    _dataCallback = original;
  }
};

It's a bit confusing, but I think it's worth exploring, since it covers the requirements I outlined in my previous comment, and allows all the complexity to live inside raven-for-redux.
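To see the two pieces working together, here is a self-contained sketch that runs against a hypothetical `fakeRaven` stand-in rather than the real library. The names `createScopedMiddleware`, `activeDataCallback` and `getState` are illustrative assumptions, and the stand-in only models `setDataCallback` and `captureException`.

```javascript
// Hypothetical minimal Raven stand-in: records what would have been sent.
const sent = [];
const fakeRaven = {
  dataCallback: null,
  setDataCallback(callback) {
    const original = this.dataCallback;
    this.dataCallback = data => callback(data, original);
  },
  captureException(error) {
    const data = { message: error.message };
    sent.push(this.dataCallback ? this.dataCallback(data) : data);
  }
};

const attachedRavens = new WeakSet();
let activeDataCallback = null; // swapped in only while a scoped capture runs

function createScopedMiddleware(raven, getState) {
  if (!attachedRavens.has(raven)) {
    // Installed once per Raven instance, so stores are not chained together.
    raven.setDataCallback((data, original) => {
      const next = activeDataCallback ? activeDataCallback(data) : data;
      return original ? original(next) : next;
    });
    attachedRavens.add(raven);
  }
  return {
    captureException(error) {
      const previous = activeDataCallback;
      activeDataCallback = data => ({ ...data, state: getState() });
      try {
        raven.captureException(error);
      } finally {
        activeDataCallback = previous;
      }
    }
  };
}

const requestA = createScopedMiddleware(fakeRaven, () => ({ user: 'a' }));
const requestB = createScopedMiddleware(fakeRaven, () => ({ user: 'b' }));

requestB.captureException(new Error('render failed')); // carries request B's state
fakeRaven.captureException(new Error('uncaught'));     // carries no Redux state
console.log(sent);
```

In this sketch only the explicit, scoped capture attaches store state, which matches the "global option set to false" behaviour described for SSR.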

Here's the prototype in PR form: https://github.com/captbaritone/raven-for-redux/compare/ssr?expand=1

I'll try to refine it a bit in the coming days. Thoughts?

Odin Ugedal · Answer 25 · Fri Feb 02 2018 05:07:08 GMT+0800 (China Standard Time)

That looks like a nice start @captbaritone! It still looks like it would require some more work in order to properly add raven-node support. The main concern is that raven-node doesn't support setDataCallback - one of the key concepts used in this middleware.

I like the breadcrumb handling, but it looks like the raven-js lib will override them during transmission if there are breadcrumbs in the global context: https://github.com/getsentry/raven-js/blob/dd10b7439551fdfdd9077d08452429e04676f3d2/src/raven.js#L1760. However, when using the setDataCallback, it will work as expected in your code. It also looks like that is the case (overridden) in raven-node too: https://github.com/getsentry/raven-node/blob/master/lib/client.js#L227.

Adding the captureException function is nice, but if you/we choose to store data inside of the middleware, I would suggest having a better API for testing, debugging and custom implementations. Here are some examples of things that can be useful.

middleware.getContext() - returns all the saved context that will be sent during an exception
middleware.captureBreadcrumb(breadcrumb) - Should be possible because the middleware stores them
middleware.captureException(error, options) - equal to the standard API (here).
middleware.captureMessage(msg, options) - equal to the standard API (here)
middleware.clearContext() - clear all the saved context.

Jordan Eldredge · Answer 26 · Fri Feb 02 2018 05:49:20 GMT+0800 (China Standard Time)

raven-node doesn't support setDataCallback.

https://github.com/getsentry/raven-node/blob/master/lib/client.js#L511-L520

That said, it's not in the documentation for either library. I should add some automated tests that test against raven-node to make sure I actually know what I'm talking about here 😁

I see your point about the other methods that folks might want. Saying "use our method instead of Raven's" is kinda gross, and you bring up a good reason why. It starts us on the road to having to mirror their entire API surface.

For the capture[Exception|Message] methods we could generalize it to something like:

middleware.captureInMiddlewareContext(() => {
    Raven.captureMessage(msg, options);
});

Basically we just need to be able to do some setup and tear-down.

But I don't have a good answer to the captureBreadcrumbs one.

I'm less concerned about the debug/introspection ones. We can add those if/when someone has a concrete use-case.

Odin Ugedal · Answer 27 · Fri Feb 02 2018 06:14:55 GMT+0800 (China Standard Time)

Yeah, it is "supported", but it doesn't work the way we need it to. Every request will need a separate callback in the middleware, and the raven-node setDataCallback sets it as a global callback. It doesn't work with Raven's inbuilt context handling. I guess that the idea behind the callback is just to have one (or several nested) pure functions to parse the data. This middleware uses it to add custom data, and that doesn't work with the current implementation inside raven-node.

The middleware.captureInMiddlewareContext looks like a smart idea! But shouldn't it be a normal function instead of an arrow function (to make it run inside the context of the middleware, and not the place the function is created) like this?:

middleware.captureInMiddlewareContext(function() {
    Raven.captureMessage(msg, options);
});
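The difference between the two forms only matters if the helper invokes the callback with a specific `this`, and the callback actually reads `this`. The sketch below assumes a hypothetical `captureInMiddlewareContext` that uses `fn.call(this)`. It is not the library's implementation.

```javascript
// Hypothetical helper that runs the callback with the middleware as `this`.
const middleware = {
  name: 'middleware',
  captureInMiddlewareContext(fn) {
    return fn.call(this);
  }
};

// A regular function receives the `this` supplied by `call`.
const viaFunction = middleware.captureInMiddlewareContext(function () {
  return this === middleware;
});

// An arrow function ignores it and keeps the `this` of the scope where it was defined.
const viaArrow = middleware.captureInMiddlewareContext(() => this === middleware);

console.log(viaFunction, viaArrow);
```

If the setup and tear-down are done through closed-over variables instead, as in the earlier `_dataCallback` example, either form of callback behaves the same.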

However, I also think it isn't necessary to store the breadcrumbs inside the middleware for SSR, since the breadcrumbs are handled by the context it is executed in. All requests will get their own context (using node domains), so that should work well. I have never worked with multiple stores, though, so I have no experience with that. Do you?

Yeah, raven-node support should be tested before saying the middleware works well with it.