LearnBoost / cluster

Node.JS multi-core server manager with plugins support.

Home Page:http://learnboost.github.com/cluster

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

workers start to consume 100% CPU and stop working

tisba opened this issue · comments

Hey guys,

I'd could use some help/input on a problem I'm currently trying to debug. I'm using cluster to implement quite a simple [1] HTTP server and I'm not quite sure yet if the problem I'm seeing is my fault, or a cluster/node issue.

In production, even under quite normal load (~1k req/s) workers start to saturate a cpu core and not doing anything (strace told me so). If I let it continue to run without restarting, at some point no working workers are left :-/ Even if the request rate gets down over night the misbehaving workers keep using 100% cpu each.

Unfortunately I haven't much success at reproducing the problem on my local machine. The current "solution" is to restart cluster regularly.

[1] The server is just looking at the requests params, JSON-decodes a cookie, updates it and write a few bytes response.

hmm i definitely have not seen this behaviour

I'd be interested in ideas how to debug this. Since this is a comercial project I'm not allowed to share the code, but I hope that I can reproduce the issue maybe with code that I can share.

maybe try some of the v8 profiler / heap inspection tools

Yeah, I can try to use the profiler once I mange to reproduce this issue. Profiling the production system is maybe a bad idea :) The memory usage is pretty low (around 30-40M) and constant.