iriscouch / follow

Very stable, very reliable, NodeJS CouchDB _changes follower

process crash when using pause() resume() for large database

jlc467 opened this issue · comments

On a db of ~1.1 million records, I call follow on it like this:

follow({db:"https://example.iriscouch.com/bigdatabase", include_docs:true}, function(error, change) {
    var feed = this
    feed.pause()
    setTimeout(function() { feed.resume() }, 500)
})

Memory climbs rapidly until a GC error crashes the process at around 1.5 GB of memory usage.

Using the Node debugger, a heap snapshot reveals that JSON strings of changes (each one an entire doc, since I'm passing include_docs: true) account for the high memory usage and the eventual crash of the process.

Is this what is known as backpressure? If I do away with pause()/resume() the issue goes away, but I need pause()/resume() to do some async work with each change in sequence.

Just wondering if anyone can explain my issue and/or suggest potential solutions. Thanks

Just curious (I may use this for something)... did you ever figure this out?

Did you try adding explicit returns in the feed and timeout functions?

I'm leaning towards writing my own promise-based lib that can handle paging with document limits, or one that is still event-based but has a throttling option...

...or adding query_params.limit when using a query_params.feed of 'longpoll'?
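
For what it's worth, here is a sketch of what that suggestion might look like. It assumes follow forwards the query_params object to the _changes request, as the comment above implies; whether a server-side limit actually bounds follow's memory use is untested here:

var follow = require('follow')

follow({
    db: "https://example.iriscouch.com/bigdatabase",
    include_docs: true,
    query_params: { feed: 'longpoll', limit: 1000 }  // cap each _changes response
}, function(error, change) {
    // handle one change at a time; longpoll + limit would bound
    // how much the server sends per request
})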

My guess is that by the time you've paused, the entire change feed is already streaming into the app's memory without yet firing the callback for each change. I could be wrong, but it seems like it grabs all changes since `since`, regardless of how many have occurred.

I'd be curious to know if the above options help you out.

@jordancardwell we ended up using nano.db.changes.

Something like (rough sketch below):

  1. Grab 1000 at a time with limit, passing the last sequence # processed (body.last_seq) as since.
  2. Loop the results, doing the promise/async work for each.
  3. Repeat forever.

We store the last sequence # processed so that when the process dies, we can pick up where we left off.
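
Here's a rough sketch of that loop, assuming a promise-returning nano (v8+); processChange and saveSeq are hypothetical helpers for the per-change async work and for persisting the checkpoint:

const nano = require('nano')('https://example.iriscouch.com')

async function followInPages(dbName, startSeq) {
    let since = startSeq || 0
    while (true) {
        // 1. grab up to 1000 changes, resuming from the last seq processed
        const body = await nano.db.changes(dbName, {
            since: since,
            limit: 1000,
            include_docs: true
        })
        // 2. loop the results, doing the async work for each change in order
        for (const change of body.results) {
            await processChange(change)  // hypothetical per-change work
        }
        // 3. repeat forever, persisting last_seq so a restart can resume
        since = body.last_seq
        await saveSeq(since)  // hypothetical checkpoint persistence
        // when caught up, wait a bit before polling again
        if (body.results.length === 0) {
            await new Promise(function (resolve) { setTimeout(resolve, 5000) })
        }
    }
}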

Works perfectly, so I don't see us attempting follow again, though I appreciate the ideas.

Would be interested in hearing any related updates on your approach!

Any news about this issue? This is quite annoying for a "Very stable" changes follower :-/
I guess fixing it wouldn't be so hard for the original developers; anyone here?
Many thanks :)

FWIW: my DB has 24 million docs, and I experienced the exact same issue as the OP (confirmed with the Node debugger). Eventually, launching Node.js with --max_old_space_size=13000 (13 GB of RAM) did the trick: the program no longer crashes. That is a gigantic amount of RAM for almost nothing, but it works.
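
For reference, the flag goes on the node invocation itself (app.js here is just a placeholder for your entry script):

node --max_old_space_size=13000 app.js

The value is in megabytes, so 13000 raises V8's old-space heap limit to roughly 13 GB.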