MicrosoftResearch / Naiad

The Naiad system provides fast incremental and iterative computation for data-parallel workloads

Home Page: http://microsoftresearch.github.io/Naiad/

System.OutOfMemoryException on large graph pagerank analysis

foxspy opened this issue

I just want to run the PageRank example (Naiad/Examples/GraphLINQ/PageRank.cs) for graph analysis. With a 1 GB graph everything works, but with a 3.6 GB graph (about 260,000,000 edges) the system throws the following exception:
Reading file C:\Naiad-release_0.5\rmat24.el
00:02:56.2408081, Graph 0 failed on scheduler 0 with exception:
System.OutOfMemoryException: Array dimension beyond the scope of support.
   at System.Collections.Generic.List`1.set_Capacity(Int32 value)
   at System.Collections.Generic.List`1.EnsureCapacity(Int32 min)
   at System.Collections.Generic.List`1.Add(T item)
   at Microsoft.Research.Naiad.Frameworks.GraphLINQ.GraphCompactor`1.OnReceive(Message`2 message) in C:\Naiad-release_0.5\Frameworks\GraphLINQ\GraphLINQ.cs:line 1199
   at Microsoft.Research.Naiad.Dataflow.StandardVertices.UnaryVertex`3.<>c.b__2_0(Message`2 message, UnaryVertex`3 vertex) in C:\Naiad-release_0.5\Naiad\Frameworks\StandardVertices.cs:line 446
   at Microsoft.Research.Naiad.Dataflow.Stage`2.<>c__DisplayClass11_11.b__1(Message`2 m) in C:\Naiad-release_0.5\Naiad\Dataflow\Stage.cs:line 383
   at Microsoft.Research.Naiad.Dataflow.ActionReceiver`2.<>c__DisplayClass3_0.<.ctor>b__0(Message`2 m, ReturnAddress u) in C:\Naiad-release_0.5\Naiad\Dataflow\Endpoints.cs:line 327
   at Microsoft.Research.Naiad.Dataflow.ActionReceiver`2.OnReceive(Message`2 message, ReturnAddress from) in C:\Naiad-release_0.5\Naiad\Dataflow\Endpoints.cs:line 317
..........
How can I fix this problem? I have tried building the program for both the x86 and x64 platforms, but neither fixes it.
Thanks!

Hello,

I would recommend increasing the number of workers. This may cause the data to be distributed across more workers reducing the amount each worker needs to handle, and reducing the largest amount going to any one collection.
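As a rough illustration of why more workers might help, here is a back-of-the-envelope sketch in Python. It is not Naiad code. It assumes edges are split evenly across workers (real graphs are usually skewed), that each worker collects its edges in a list that starts at capacity 4 and doubles when full, that each element takes 8 bytes (a hypothetical size, e.g. two 32-bit node ids), and that a single backing array cannot exceed roughly 2 GB. All names and constants below are illustrative.

```python
import math

OBJECT_LIMIT_BYTES = 2**31 - 1  # assumed ceiling: roughly a 2 GB per-object limit
ELEMENT_BYTES = 8               # hypothetical element size (e.g. two 32-bit node ids)
TOTAL_EDGES = 260_000_000       # approximate edge count reported in the issue


def doubling_capacity(count, initial=4):
    """Capacity a list reaches if it starts at `initial` and doubles when full."""
    capacity = initial
    while capacity < count:
        capacity *= 2
    return capacity


def backing_array_bytes(total_edges, workers, element_bytes=ELEMENT_BYTES):
    """Bytes of one worker's backing array, assuming an even split of edges."""
    per_worker = math.ceil(total_edges / workers)
    return doubling_capacity(per_worker) * element_bytes


if __name__ == "__main__":
    for workers in (1, 2, 4, 8):
        size = backing_array_bytes(TOTAL_EDGES, workers)
        verdict = "exceeds" if size > OBJECT_LIMIT_BYTES else "fits within"
        print(f"{workers} worker(s): {size:,} bytes, {verdict} the assumed limit")
```

Under these assumptions a single worker's list would need a backing array just over the limit, while two or more workers stay under it. The actual element type and partitioning in GraphLINQ.cs may differ, so treat the numbers as an estimate only.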

If this doesn't work, I would open up the debugger and see which collection is triggering the exception. The GraphLINQ.cs code is not privileged Naiad code, so you should be able to edit it all (or copy and paste it to make your own).

There are also important settings in your project to make sure that you are targeting the 64 bit runtime (in addition to which architecture you are targeting). I can't recall these off the top of my head, but they default to "32 bit runtime" which causes array lengths to top out at 2^31, which is probably what you are seeing.

Finally, I'm not sure that too many folks are supporting this project at the moment. Microsoft released the staff associated with the project, and they all have new jobs now. There is a Rust version of timely dataflow, and a PageRank implementation which you can read more about here.

Hope this helps!
Frank

Thanks!