askorama / orama

🌌 Fast, dependency-free, full-text and vector search engine with typo tolerance, filters, facets, stemming, and more. Works with any JavaScript runtime, browser, server, service!

Home Page: https://docs.orama.com


Distinct values for number facets

alexander-js opened this issue

Problem Description

I am building a facet filter UI that is populated from the facets property returned with search results. The current API for number facets forces you to make assumptions about the data by supplying one or more ranges:

{
  count: 3,      // Total number of ranges
  values: {
    '0-3': 5,    // Number of documents that have a value between 0 and 3 (inclusive)
    '3-7': 15,   // Number of documents that have a value between 3 and 7 (inclusive)
    '7-10': 80,  // Number of documents that have a value between 7 and 10 (inclusive)
  }
}
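To make the limitation concrete, here is a minimal sketch of range-based counting. It is not Orama's implementation: `countByRanges` and `NumberRange` are hypothetical names, and the inclusive bounds simply follow the comments in the example above.

```typescript
// Hypothetical helper, not part of Orama: counts how many values fall
// inside each caller-supplied range, treating both bounds as inclusive.
type NumberRange = { from: number; to: number };

function countByRanges(
  values: number[],
  ranges: NumberRange[]
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const { from, to } of ranges) {
    counts[`${from}-${to}`] = values.filter((v) => v >= from && v <= to).length;
  }
  return counts;
}

// The caller has to guess the ranges up front.
const rangeCounts = countByRanges(
  [1, 3, 3, 5, 8, 12],
  [
    { from: 0, to: 3 },
    { from: 3, to: 7 },
    { from: 7, to: 10 },
  ]
);
```

With these sample values, 12 falls outside every guessed range and is silently left out, while 3 sits on a shared inclusive boundary and is counted in both '0-3' and '3-7'.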

This API assumes that I know which number ranges I'm dealing with ahead of time. In my case, there is a large set of facets whose ranges can vary widely. Instead, I'd like to reason about each distinct number value without supplying any ranges. Ideally, the results would look like those for enums:

{
  count: 5,      // Total number of distinct number values
  values: {      // Object keys are number values, object values are counts.
    2: 3,
    3: 2,
    5: 15,
    8: 75,
    9: 5
  }
}
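As a rough illustration of the requested shape, the sketch below counts each distinct number without any ranges. `countDistinct` and `DistinctFacet` are hypothetical names used only for this example and are not part of Orama.

```typescript
// Hypothetical result shape mirroring the example above.
interface DistinctFacet {
  count: number;                  // total number of distinct values
  values: Record<string, number>; // distinct value (as object key) -> document count
}

// Hypothetical helper: tallies how often each distinct number occurs.
function countDistinct(nums: number[]): DistinctFacet {
  const values: Record<string, number> = {};
  for (const n of nums) {
    const key = String(n); // object keys are always strings in JavaScript
    values[key] = (values[key] ?? 0) + 1;
  }
  return { count: Object.keys(values).length, values };
}

const distinctFacet = countDistinct([2, 2, 2, 3, 3, 5]);
```

No knowledge of the data's range is needed here: every value that occurs gets its own bucket.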

Proposed Solution

Allow the same configuration options that are valid for enums, i.e., either an empty object or the following:

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| size | number | 10 | Number of values to return. |
| order | string | DESC | Order of the values. Can be either ASC or DESC. |
| limit | number | 100 | Maximum number of values to consider. |
| offset | number | 0 | Number of values to skip. |

Alternatives

I've considered representing my numbers as enums in my schema instead, but this gets very awkward because I'd still like to use number operators when filtering on the facets.

Additional Context

No response

@allevo can we maybe discuss this and see how to prioritize? Looks like a nice idea to me

@micheleriva I'd like to contribute if you are willing to accept a PR.

@micheleriva let me know if you have made any decision (or have a link to any discussion) about the number facet API. I was planning to keep it similar to the string one:

export interface NumberFacetDefinition {
  ranges?: { from: number; to: number }[];
  limit?: number;
  offset?: number;
  sort?: FacetSorting;
  size?: number;
}

where

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| order | string | DESC | Order of the values. Can be either ASC or DESC. |
| limit | number | 10 | Maximum number of values to return. |
| offset | number | 0 | Number of values to skip. |
| size | number or undefined | undefined | Maximum number of values to consider. If not provided, all distinct values are considered. The facet count will be Math.min(count, size). |

I'm not sure whether size is needed or beneficial, since consumers can already control the result size with limit and offset. Let me know if you are okay with this contract and whether you want size as well.
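For discussion, here is one possible reading of that contract as a standalone sketch. It is not the actual PR code. Several things are assumptions: `distinctNumberFacet` and `NumberFacetOptions` are hypothetical names, `FacetSorting` is redeclared locally as "ASC" | "DESC", sorting is done by document count rather than by numeric value, and `size` is applied before `offset` and `limit`.

```typescript
// Assumption: local stand-in for the FacetSorting type referenced above.
type FacetSorting = "ASC" | "DESC";

// Hypothetical options mirroring the non-range fields of the proposed interface.
interface NumberFacetOptions {
  limit?: number;      // maximum number of values to return (default 10)
  offset?: number;     // number of values to skip (default 0)
  sort?: FacetSorting; // ordering of the values (default "DESC")
  size?: number;       // maximum number of distinct values to consider (default: all)
}

function distinctNumberFacet(nums: number[], opts: NumberFacetOptions = {}) {
  const { limit = 10, offset = 0, sort = "DESC", size } = opts;

  // Tally occurrences of each distinct value.
  const counts = new Map<number, number>();
  for (const n of nums) counts.set(n, (counts.get(n) ?? 0) + 1);

  // Assumption: order entries by document count, not by the number itself.
  const sorted = Array.from(counts.entries()).sort((a, b) =>
    sort === "DESC" ? b[1] - a[1] : a[1] - b[1]
  );

  // Assumption: size caps the considered values before pagination.
  const considered = size === undefined ? sorted : sorted.slice(0, size);

  return {
    count: considered.length, // Math.min(distinct count, size) when size is set
    values: considered.slice(offset, offset + limit), // [value, count] pairs
  };
}

const pagedFacet = distinctNumberFacet([8, 8, 8, 5, 5, 2, 9, 9, 9, 9], {
  size: 3,
  offset: 1,
  limit: 2,
});
```

The sketch returns `[value, count]` pairs instead of an object because JavaScript enumerates integer-like object keys in ascending numeric order, so a plain object keyed by numbers could not preserve a count-based ordering.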