LinkedDataFragments / HDT-Node

Native bindings for Node.js to access HDT compressed triple files.

Home Page: http://ruben.verborgh.org/blog/2014/09/30/bringing-fast-triples-to-nodejs-with-hdt/

Concurrency protection

mielvds opened this issue · comments

Running more than one operation at the same time (which is easy to do with the async API) can quickly exhaust memory.
For instance, when doing a series of lookups with searchTriples in a for loop, the only real way to throttle them is to await each call.

Could we have a more graceful handling/exit, probably by:

  • queuing operations so that only a limited number (one?) of C++ operations can be active simultaneously;
  • rejecting promises when the queue is too large;
  • something else?
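
The first two options could be sketched roughly as follows. This is only an illustration of the idea and not code from the hdt module: createLimiter, maxActive and maxQueued are hypothetical names, and the wrapper lives in plain JavaScript rather than in the native binding.

```javascript
// Hypothetical helper, not part of the hdt module: runs at most `maxActive`
// tasks at once and rejects new tasks once `maxQueued` are already waiting.
function createLimiter(maxActive, maxQueued) {
  let active = 0;
  const waiting = [];

  // Start the next waiting task if a slot is free.
  function next() {
    if (active >= maxActive || waiting.length === 0) return;
    const { task, resolve, reject } = waiting.shift();
    active++;
    // Promise.resolve().then(task) also turns a synchronous throw into a rejection.
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--;
        next();
      });
  }

  // `task` is a function returning a promise, so it only starts when a slot is free.
  return function run(task) {
    return new Promise((resolve, reject) => {
      if (active >= maxActive && waiting.length >= maxQueued) {
        reject(new Error('Too many pending operations'));
        return;
      }
      waiting.push({ task, resolve, reject });
      next();
    });
  };
}

// Usage sketch (assumes `doc` is an opened HDT document):
// const limit = createLimiter(1, 100);
// limit(() => doc.searchTriples(null, predicate, null, { offset: 0, limit: 10 }));
```

Passing a function instead of a promise is what makes the throttling work: the search is not started until the limiter calls it.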

> this can more or less only be throttled using await.

But that is exactly what the calling code should do, and it solves the issue, no?

Do you have an example of code that could cause issues?

This issue is mostly for documentation, so I don't forget. There is no important use case and no urgent need for a solution.

Possible problematic code could be this (not the best example):

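var doc;
hdt.fromFile('./test/test.hdt')
  .then(function(hdtDocument) {
    doc = hdtDocument;

    // Every search is started immediately; nothing waits for the previous one.
    // (Assumes `predicates` is an array of predicate IRIs.)
    for (const predicate of predicates) {
      doc.searchTriples(null, predicate, null, { offset: 0, limit: 10 }).then(...);
    }
  });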

From the calling code, it's not always easy to predict when you are going to run into trouble (you don't know how many results there are and even if you do, you don't know what will be a problem). The library could improve the dev experience by adding some throttling or at least rejecting too many calls.

I would argue the above is an issue in the calling code. A plain for loop is not an adequate solution in the presence of asynchronous calls. You have to code this either with a callback function or, more easily, with for await.
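
A minimal sketch of what that could look like in the calling code, awaiting each call inside the loop so that only one search is active at a time. The function name searchSequentially is hypothetical, and predicates is assumed to be an array of predicate IRIs:

```javascript
// Sketch: run the searches one at a time by awaiting each call inside the loop.
// `doc` is an HDT document as obtained from hdt.fromFile.
async function searchSequentially(doc, predicates) {
  const results = [];
  for (const predicate of predicates) {
    // The next search only starts after this one has settled.
    const result = await doc.searchTriples(null, predicate, null, { offset: 0, limit: 10 });
    results.push(result);
  }
  return results;
}
```

The trade-off is that the lookups no longer overlap at all; if some parallelism is wanted, the caller has to batch them explicitly.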

Native Node.js APIs, such as fs, do not contain protections against this either and similarly rely on the caller to do the right thing.

As such, I don't think that protections belong in the HDT module.

fair enough :)