typesense / typesense

Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

Home Page: https://typesense.org

Query containing two synonyms at once does not seem to match documents

alexeymaksakov-tomtom opened this issue

Description

When a query contains two different synonyms at once, none of the synonyms seems to be considered during matching.
The document containing the synonym alternatives is returned only if both terms appear in the query in the same form as in the document. For instance, '4th sw ave' or '4th sw avenue' won't match '4th southwest avenue' in a collection where synonyms for both 'ave' and 'sw' are registered.
Meanwhile, the query '4th southwest avenue' returns the expected result with all tokens matched.

Steps to reproduce

Two synonyms are registered in the collection:

{'synonym_ave', {"synonyms": ["av", "ave", "avenue"]}}
{'synonym_southwest', {"synonyms": ["southwest", "sw"]}}

The following document is indexed:

{
 "sn_preferred_name": ["4th Avenue Southwest"],
 "no": "500"
}

Queries

{'q': '500 4th ave sw',
 'query_by': 'no, sn_preferred_name'}

or

{'q': '500 4th ave southwest',
 'query_by': 'no, sn_preferred_name'}

do not return the expected document. Instead, the following document is ranked first:

{
    "no": "500",
    "sn_preferred_name": [
        "4th Ave NE"
    ]
}

with the following highlights:

    "highlights": [
        {
            "field": "sn_preferred_name",
            "indices": [
                0
            ],
            "matched_tokens": [
                [
                    "4th",
                    "Ave"
                ]
            ],
            "snippets": [
                "4th Ave NE"
            ]
        },
        {
            "field": "no",
            "matched_tokens": [
                "500"
            ],
            "snippet": "500"
        }
    ],
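The reproduction steps above can be sketched as a single script. This is a minimal sketch, assuming the official `typesense` Python client and a hypothetical collection named `addresses` whose schema already contains the `no` and `sn_preferred_name` fields. The collection name and the helper names `search_params` and `reproduce` are illustrative and do not come from the report.

```python
# Sketch of the reproduction steps. COLLECTION and the helper names are
# hypothetical; `client` is assumed to be a typesense.Client instance
# connected to a server where the collection already exists.
COLLECTION = "addresses"

# Synonym sets as given in the report (id -> body).
SYNONYMS = {
    "synonym_ave": {"synonyms": ["av", "ave", "avenue"]},
    "synonym_southwest": {"synonyms": ["southwest", "sw"]},
}

# The document that is expected to rank first.
DOCUMENT = {"sn_preferred_name": ["4th Avenue Southwest"], "no": "500"}


def search_params(q):
    """Build the search parameters used in the report for a query string."""
    return {"q": q, "query_by": "no, sn_preferred_name"}


def reproduce(client, queries=("500 4th ave sw", "500 4th ave southwest")):
    """Register both synonym sets, index the document, and run each query."""
    collection = client.collections[COLLECTION]
    for synonym_id, body in SYNONYMS.items():
        collection.synonyms.upsert(synonym_id, body)
    collection.documents.create(DOCUMENT)
    # Return the raw search responses keyed by query string.
    return {q: collection.documents.search(search_params(q)) for q in queries}
```

With a client connected to a test server, `reproduce(client)` returns the raw search responses keyed by query, so the top hit for each query can be compared against the expected document.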

Expected Behavior

The document with "sn_preferred_name": ["4th Avenue Southwest"] is expected to be returned first for the query '500 4th ave sw'.
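One way to picture this expectation: every query token that belongs to a synonym set is treated as interchangeable with the other members of that set, independently for each token. The following is an illustrative sketch only, not Typesense's actual implementation. `alternatives`, `expand_query`, and `matches` are hypothetical helpers, and matching is simplified to case-insensitive, order-insensitive token comparison.

```python
from itertools import product

# Synonym sets from the report.
SYNONYM_SETS = [
    {"av", "ave", "avenue"},
    {"southwest", "sw"},
]


def alternatives(token):
    """Return the sorted synonym set containing token, or [token] if none."""
    for group in SYNONYM_SETS:
        if token in group:
            return sorted(group)
    return [token]


def expand_query(query):
    """Expand every token independently and return all query variants."""
    tokens = query.lower().split()
    return [" ".join(combo) for combo in product(*(alternatives(t) for t in tokens))]


def matches(query, doc_fields):
    """True if all tokens of some query variant appear among the document tokens."""
    doc_tokens = {t for value in doc_fields for t in value.lower().split()}
    return any(set(v.split()) <= doc_tokens for v in expand_query(query))


print(matches("500 4th ave sw", ["500", "4th Avenue Southwest"]))  # True
print(matches("500 4th ave sw", ["500", "4th Ave NE"]))            # False
```

Under this reading, only the "4th Avenue Southwest" record matches every query token once both synonyms are expanded, which is why it is expected to rank above partial matches such as "4th Ave NE".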

Actual Behavior

The "4th Ave NE" record is returned first, and many other records are ranked above the expected record.

Metadata

Typesense Version: 0.25.1

OS: Linux