itemsapi / itemsjs

Extremely fast faceted search engine in JavaScript - lightweight, flexible, and simple to use

Asymmetry in aggregations doc_count calculation

SergeyRe opened this issue · comments

While developing with itemsjs I found a strange asymmetry in the doc_count calculation between the first and second facets.
To reproduce it I forked your jsfiddle demo.
I removed the third facet and cut the dataset down to 4 items
with the following combinations of the Tags and Actors fields:
prison,Tim Robbins
mafia,Tim Robbins
prison,Christan Bale
mafia,Christan Bale
The 2 facets are symmetrical, so similar behavior is expected (see my demo).
But if you check 2 different facet values, you will see a strange asymmetry in the doc_count calculation between the facets.
Screenshot attached.

Hey, please check whether you get the desired results with conjunction: true

  aggregations: {
    tags: {
      title: 'Tags',
      conjunction: true,
      size: 10
    },
    actors: {
      title: 'Actors',
      conjunction: true,
      size: 10
    }
  }

More examples here: https://github.com/itemsapi/itemsjs/blob/master/docs/configuration.md

Yes, it is symmetrical now. But it is not the right kind of symmetry.
I will try to explain, it is easier with examples:
I have 2 projects with facet navigation:
https://www.roonready.com/ is pure JavaScript (I got the code somewhere)
https://www.roonready.ru/ is an Elasticsearch implementation
Sorry, they are in 2 different languages, but the datasets and facets are similar.
If you explore them, you will see different behavior in each case.
The Elasticsearch version is the more correct one.
The pure JavaScript one behaves like "conjunction: true", which is not so good.
Sorry if this is not so clear.

It does not matter if it is Elastic, Solr, Algolia, ItemsJS or other search engines. The faceted search results should be the same everywhere. It's just math

In ItemsJS you can make conjunctive and disjunctive facets. Conjunctive is usually great if you have many values per field (e.g. tags, brands) and disjunctive usually works best if there is one value per field (e.g. country)
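To make the distinction concrete, here is a minimal sketch in plain JavaScript of how selected values within one facet could be combined. It assumes the usual meaning of the terms (conjunctive = AND, disjunctive = OR); `matchesFacet` is a hypothetical helper and the sample movie is made up for illustration, neither comes from ItemsJS.

```javascript
// Hypothetical helper, not ItemsJS code: does an item pass one facet's selected values?
function matchesFacet(item, field, selected, conjunction) {
  // Normalize single values (country: 'US') and multi-values (tags: [...]) to an array.
  const values = Array.isArray(item[field]) ? item[field] : [item[field]];
  if (selected.length === 0) return true; // nothing selected: no restriction
  return conjunction
    ? selected.every((v) => values.includes(v)) // AND: item must have all selected values
    : selected.some((v) => values.includes(v)); // OR: item must have at least one
}

// Made-up sample item with one multi-valued and one single-valued field.
const movie = { tags: ['prison', 'drama'], country: 'US' };

const andTags = matchesFacet(movie, 'tags', ['prison', 'mafia'], true);
const orTags = matchesFacet(movie, 'tags', ['prison', 'mafia'], false);
const andCountry = matchesFacet(movie, 'country', ['US', 'UK'], true);
const orCountry = matchesFacet(movie, 'country', ['US', 'UK'], false);

console.log({ andTags, orTags, andCountry, orCountry });
```

In this sketch a single-valued field can never satisfy two selected values under AND, which is one way to see why disjunctive facets tend to suit fields like country.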

Please prepare a reproduction case (where you see an issue) with a small dataset, configuration, input and your desired results (like here #77 (comment)). It will be very helpful to understand if the problem is in the configuration or in ItemsJS per se

OK, I will try.
Suppose I have this dataset:

{
  "items": [
    { "a": 1, "b": 3 },
    { "a": 1, "b": 4 },
    { "a": 2, "b": 3 },
    { "a": 2, "b": 4 }
  ]
}

Configuration for facets

aggregations: {
    a: {
      title: 'Tags',
      size: 10
    },
    b: {
      title: 'Actors',
      size: 10
    }
}

I make this request:

filters: {
    a: [1],
    b: [3]
}

Current result:

{
  "data": {
    "items": [{ "a": 1, "b": 3 }],
    "aggregations": {
      "a": {
        "buckets": [
          {
            "doc_count": 1,
            "key": 1,
            "selected": true
          },
          {
            "doc_count": 1,
            "key": 2,
            "selected": false
          }
        ],
        "name": "a",
        "position": 2,
        "title": "Tags"
      },
      "b": {
        "buckets": [
          {
            "doc_count": 1,
            "key": 3,
            "selected": true
          },
          {
            "doc_count": 0,
            "key": 4,
            "selected": false
          }
        ],
        "name": "b",
        "position": 1,
        "title": "Actors"
      }
    }
  }
}

Desired result:

{
  "data": {
    "items": [{ "a": 1, "b": 3 }],
    "aggregations": {
      "a": {
        "buckets": [
          {
            "doc_count": 1,
            "key": 1,
            "selected": true
          },
          {
            "doc_count": 1,
            "key": 2,
            "selected": false
          }
        ],
        "name": "a",
        "position": 2,
        "title": "Tags"
      },
      "b": {
        "buckets": [
          {
            "doc_count": 1,
            "key": 3,
            "selected": true
          },
          {
            "doc_count": 1,
            "key": 4,
            "selected": false
          }
        ],
        "name": "b",
        "position": 1,
        "title": "Actors"
      }
    }
  }
}

The only difference is the last doc_count: it should be 1.
Explanation:
The doc_count for a facet value has to show users how many items will be found as soon as this value becomes active (or how many will be added to the already found items, in the conjunctive facet case). So if I take value 2 for "a", 1 item will be found: {a:2,b:3}. That is clear from:

 {
            "doc_count": 1,
            "key": 2,
            "selected": false
          }

So the same is expected for value 4 of "b": it should have doc_count: 1, because making it active produces 1 item to be found, {a:1,b:4}.
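The expectation described here can be sketched in plain JavaScript. This is only an illustration of the counting rule from the explanation, assuming each facet's counts are taken over the items that pass the filters of every other facet while the facet's own filter is ignored. `disjunctiveCounts` is a hypothetical helper, not ItemsJS internals.

```javascript
// Hypothetical helper, not part of ItemsJS: per-facet value counts, applying the
// filters of every *other* facet but ignoring the facet's own filter.
function disjunctiveCounts(items, fields, filters) {
  const result = {};
  for (const field of fields) {
    // Keep items that satisfy the selected values of all other facets.
    const candidates = items.filter((item) =>
      fields.every((other) => {
        if (other === field) return true; // skip this facet's own filter
        const selected = filters[other] || [];
        return selected.length === 0 || selected.includes(item[other]);
      })
    );
    // Count the remaining items per value of the current facet.
    const counts = {};
    for (const item of candidates) {
      counts[item[field]] = (counts[item[field]] || 0) + 1;
    }
    result[field] = counts;
  }
  return result;
}

// Dataset and request from the reproduction case.
const items = [
  { a: 1, b: 3 },
  { a: 1, b: 4 },
  { a: 2, b: 3 },
  { a: 2, b: 4 },
];

const counts = disjunctiveCounts(items, ['a', 'b'], { a: [1], b: [3] });
console.log(counts); // every value of "a" and "b" gets a count of 1
```

Under this rule both facets are treated the same way, so value 4 of "b" gets a count of 1 just like value 2 of "a".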

@SergeyRe ok, there seems to be a bug! ItemsJS 1.x gives your desired result and ItemsJS 2.x does not. I'll need a few days for debugging and fixing it.

Finally: the bug is in 2.1.5.
2.0.0-alpha.1 is OK.
This fiddle https://jsfiddle.net/SergeiRe/m1ogqsh2/18/ now uses it and works fine.

It seems to be ok for your case until 2.1.3. After that I refactored the code responsible for facet calculations a little bit. I hope there will be a simple fix

Yes, 2.1.2 is the last version that works fine.

Please check whether version 2.1.8 is ok. This version passes the previous tests and your case.
This is your first fiddle with the new version: https://jsfiddle.net/cigol/1gso409e/ so it seems to be ok

Keep in mind that conjunction for facets is now true by default, so you need to set it to false for your facets in your case (#84)
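For the reproduction case from this thread, that would presumably look like the sketch below. It reuses the facet names "a" and "b" from the example dataset and only adds the conjunction flag; the variable name `configuration` is an assumption for illustration.

```javascript
// Facet configuration from the reproduction case, with conjunction disabled
// explicitly because it now defaults to true.
const configuration = {
  aggregations: {
    a: { title: 'Tags', conjunction: false, size: 10 },
    b: { title: 'Actors', conjunction: false, size: 10 },
  },
};

console.log(JSON.stringify(configuration, null, 2));
```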

Thanks a lot. Works fine in my project (~1000 items)