alexklibisz / elastiknn

Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity search using exact and approximate algorithms.

Home Page:https://alexklibisz.github.io/elastiknn

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Try re-using threadlocal arrays in ArrayHitCounter

alexklibisz opened this issue · comments

Background

The ArrayHitCounter class gets reinstantiated for every segment in every LSH query. It involves allocating an array of shorts of size equivalent to the number of docs in the segment. Maybe using a "pool" of threadlocal arrays would minimize time spent allocating the new arrays?

Deliverables

  • Modify ArrayHitCounter to take its arrays from a ThreadLocal "pool" of arrays
  • Benchmark
  • Thoroughly test, incl. some form of parallelized testing

Related Issues

No response

I took a first pass at this here: #600, latest commit at time of writing is: 47cede7

The implementation is relatively simple:

package com.klibisz.elastiknn.search;
/**
* A pool of ArrayHitCounters which can be checked out
* and reused as ThreadLocal variables.
*/
public class ArrayHitCounterPool {
// A private intermediate class that gets stored within the ThreadLocal
// and allows us to reuse or rebuild a new ArrayHitCounter.
private static class Builder {
private ArrayHitCounter ahc;
public ArrayHitCounter getArrayHitCounter(int capacity) {
// If it's the first call or the desired capacity exceeds the existing counter, we build a new one.
if (ahc == null || capacity > ahc.capacity()) ahc = new ArrayHitCounter(capacity);
// If the desired capacity is less or equal to existing, we reuse.
else ahc.reset();
return ahc;
}
}
private static ThreadLocal<Builder> builders = ThreadLocal.withInitial(() -> new Builder());
public static ArrayHitCounter getThreadLocal(int capacity) {
return builders.get().getArrayHitCounter(capacity);
}
}

I verified (with some logging) that it is actually clearing and reusing the arrays.

Unfortunately, it has not made much of an impact on the Fashion Mnist benchmark. The raw benchmark results are all close enough that it's not obvious anything actually changed:

Perhaps any time saved is spent clearing the arrays?

image

GC time and memory use are also about the same before and after: #600 (comment)

Closing, see previous comment