porterehunley / RACplusplus

A high-performance implementation of Reciprocal Agglomerative Clustering in C++

Memory Overload with racplusplus Package and 300K 512-Dimensional Data Points on 377G Memory

lizcBold opened this issue

Problem Description:
I am running out of memory when using the "racplusplus" package on a dataset of 300,000 data points, each a 512-dimensional vector. Although my system has 377GB of available memory, the package exhausts it during processing.
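For scale, here is a rough back-of-the-envelope estimate. It assumes (this is not confirmed from the racplusplus source) that a dense N x N pairwise distance matrix is materialized at some point. The helper name `dense_matrix_gib` is hypothetical and only used for this illustration.

```python
def dense_matrix_gib(n_points: int, bytes_per_value: int = 8) -> float:
    """Size in GiB of a dense n_points x n_points matrix.

    Assumption: every pairwise distance is stored explicitly.
    """
    return n_points * n_points * bytes_per_value / 2**30


n_points = 300_000
dim = 512

# The input vectors themselves are small by comparison (float64 assumed).
input_gib = n_points * dim * 8 / 2**30

print(f"input data:            {input_gib:.2f} GiB")
print(f"dense matrix, float64: {dense_matrix_gib(n_points, 8):.2f} GiB")
print(f"dense matrix, float32: {dense_matrix_gib(n_points, 4):.2f} GiB")
```

Under that assumption, a float64 matrix alone comes to about 670 GiB, well beyond 377GB, while the input data is only a little over 1 GiB. If the package never holds the full matrix in memory, this estimate does not apply.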

Could you please share more detail about the memory usage in the experimental evaluation (Chapter 6 of your article)? How much memory was required to work with the SIFT1M dataset?
Additionally, you mentioned selecting the batch size for calculating the distance matrix. Which values did you use for this? I am looking for a number large enough for speedy results yet small enough to avoid running out of memory.
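To make the batch size question concrete, here is a minimal sketch of how I would bound it from a memory budget. It assumes that each batch materializes a dense block of `batch_rows` x N distances. That is my guess at the behavior, not something taken from the package, and `max_batch_rows` is a hypothetical helper.

```python
import math


def max_batch_rows(n_points: int, budget_bytes: int, bytes_per_value: int = 8) -> int:
    """Largest number of rows whose distances to all n_points fit in budget_bytes.

    Assumption: one batch holds a dense (rows x n_points) block of distances.
    """
    bytes_per_row = n_points * bytes_per_value
    return budget_bytes // bytes_per_row


n_points = 300_000
budget = 16 * 2**30  # illustrative 16 GiB budget per batch, not a recommendation

rows = max_batch_rows(n_points, budget)
n_batches = math.ceil(n_points / rows)
print(f"rows per batch: {rows}, batches needed: {n_batches}")
```

With these illustrative numbers, a 16 GiB block holds 7,158 float64 rows, so one pass over the data takes 42 batches. If several threads each hold their own block, the budget would have to be divided accordingly.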