typesense / typesense

Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

Home Page: https://typesense.org

[Feature Request] Support querying joined fields

Napam opened this issue

Feature request: support querying joined fields

Hi, Typesense has been a great product for me and my team, and it is exciting to see the new features in the recent 26.0 release (especially JOINs). However, one feature we are really missing is the ability to search in joined fields. Take the example in the docs at https://typesense.org/docs/26.0/api/joins.html#one-to-one-relation, where you do:

curl "http://localhost:8108/multi_search" -X POST \
        -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
        -d '{
          "searches": [
            {
              "collection": "books",
              "include_fields": "$authors(first_name,last_name)",
              "q": "famous"
            }
          ]
        }'

which results in

{
  "document": {
    "id": "0",
    "title": "Famous Five",
    "author_id": "0",
    "authors": {
      "first_name": "Enid",
      "last_name": "Blyton"
    }
  }
}

we really hoped that you could actually search for the name of the author in that query, like this (see the query text):

curl "http://localhost:8108/multi_search" -X POST \
        -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
        -d '{
          "searches": [
            {
              "collection": "books",
              "include_fields": "$authors(first_name,last_name)",
              "q": "enid blyton"
            }
          ]
        }'

Our current workaround

Our workaround for now is to include the first name and last name of the author in the book entries themselves. That by itself is not difficult to do. It becomes complex when you change the name of an author, because then you have to update all of that author's books. Now imagine more complex scenarios where you have to resolve a larger graph of dependencies: one then needs to actually traverse the graph and update everything based on the relations. Doing so in the application layer is quite slow (in the worst case an O(n^2) operation), and all of this would be solved by being able to query joined fields.
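
To make the workaround concrete, here is a minimal sketch of the duplication and the re-indexing it forces. It runs entirely on in-memory dictionaries and makes no Typesense calls; the field names `author_first_name` and `author_last_name`, the helper functions, and the sample data are assumptions made up for illustration, not our real schema.

```python
from collections import defaultdict

# Illustrative stand-ins for the authors and books collections.
authors = {"0": {"first_name": "Enid", "last_name": "Blyton"}}
books = [
    {"id": "0", "title": "Famous Five", "author_id": "0"},
    {"id": "1", "title": "The Mountain", "author_id": "0"},
]


def denormalize(book, authors):
    """Return a copy of the book with the author's name copied into it."""
    author = authors[book["author_id"]]
    return {
        **book,
        "author_first_name": author["first_name"],  # hypothetical field name
        "author_last_name": author["last_name"],    # hypothetical field name
    }


def index_by_author(books):
    """Map author_id -> books, so a rename only touches that author's books."""
    by_author = defaultdict(list)
    for book in books:
        by_author[book["author_id"]].append(book)
    return by_author


def rename_author(author_id, new_fields, authors, by_author):
    """Update the author, then rebuild every book document embedding the name."""
    authors[author_id] = {**authors[author_id], **new_fields}
    return [denormalize(book, authors) for book in by_author[author_id]]


by_author = index_by_author(books)
docs_to_reindex = rename_author("0", {"first_name": "Enid Mary"}, authors, by_author)
```

Every document in `docs_to_reindex` has to be sent to the search index again after a single rename. With only one relation this is one pass over the author's books; with a deeper graph the same step has to be repeated for each relation that embeds the changed value, which is where it gets slow.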

We also tried using synonyms for this, but that didn't work. Using the book-author example, we defined a synonym mapping the author name to a book id (which we have indexed), e.g. "enid" -> the_book_id, but searching for "enid" then returned nothing.

Hi @Napam, I believe this can be achieved by using the "filter_by" field instead of the "q" field, so that the search on the books collection itself filters by the author name, as follows:

Query:

curl "http://localhost:8108/multi_search" -X POST \
        -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
        -d '{
          "searches": [
            {
              "q": "*",
              "collection": "books",
              "include_fields": "$authors(first_name,last_name)",
              "filter_by": "$authors(first_name:enid&&last_name:blyton)"
            }
          ]
        }'

Response:

{
  "facet_counts": [],
  "found": 2,
  "hits": [
    {
      "document": {
        "author_id": "1",
        "authors": {
          "first_name": "Enid",
          "last_name": "Blyton"
        },
        "id": "2",
        "title": "The Island of Adventure"
      },
      "highlight": {},
      "highlights": []
    },
    {
      "document": {
        "author_id": "1",
        "authors": {
          "first_name": "Enid",
          "last_name": "Blyton"
        },
        "id": "1",
        "title": "The Mountain"
      },
      "highlight": {},
      "highlights": []
    }
  ],
  "out_of": 10,
  "page": 1,
  "request_params": {
    "collection_name": "books",
    "first_q": "*",
    "per_page": 10,
    "q": "*"
  },
  "search_cutoff": false,
  "search_time_ms": 0
}
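
If the request needs to be built in application code rather than typed by hand, the same body can be assembled along these lines. This is only a sketch: `joined_filter_search` is a hypothetical helper, not part of any Typesense client, and it does not escape special characters in the filter values.

```python
import json


def joined_filter_search(collection, joined_collection, include, equals):
    """Build a multi_search body that filters `collection` by joined fields.

    Hypothetical helper: it only assembles the JSON used in the query above.
    """
    fields = ",".join(include)
    conditions = "&&".join(f"{field}:{value}" for field, value in equals.items())
    return {
        "searches": [
            {
                "q": "*",
                "collection": collection,
                "include_fields": f"${joined_collection}({fields})",
                "filter_by": f"${joined_collection}({conditions})",
            }
        ]
    }


body = joined_filter_search(
    "books",
    "authors",
    ["first_name", "last_name"],
    {"first_name": "enid", "last_name": "blyton"},
)
payload = json.dumps(body)  # string to send as the POST body
```

`payload` can then be posted to the `/multi_search` endpoint with the `X-TYPESENSE-API-KEY` header, exactly as in the curl command.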

Hi @abiudmani, thanks for the tip, but this solution won't work in the context of a simple search bar. With a simple search bar we don't directly specify who the author is; we just want to search "enid", and it should figure out all the books written by Enid.

Also, filter_by is not typo tolerant. You need to match exactly, which again is not what we want in a general search bar. E.g. I want to be able to go to the "Books" page and straight up search "enad bliton" (notice the typos) and get all books authored by "Enid Blyton".
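
To illustrate the difference with that exact query, here is a small sketch comparing an exact match with a per-word edit-distance match. It is not how Typesense matches internally; `exact_match`, `fuzzy_match`, and the `max_typos` parameter are names made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def exact_match(query: str, name: str) -> bool:
    """Case-insensitive, word-for-word equality (what an exact filter needs)."""
    return query.lower().split() == name.lower().split()


def fuzzy_match(query: str, name: str, max_typos: int = 1) -> bool:
    """Each query word may differ from the name word by up to max_typos edits."""
    q, n = query.lower().split(), name.lower().split()
    return len(q) == len(n) and all(
        edit_distance(x, y) <= max_typos for x, y in zip(q, n)
    )


print(exact_match("enad bliton", "Enid Blyton"))  # False
print(fuzzy_match("enad bliton", "Enid Blyton"))  # True
```

Each misspelled word is a single substitution away from the real one, so a search that tolerates one typo per word would still find the author, while an exact filter finds nothing.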

+1 for this feature. It would be incredibly helpful for us to avoid the kind of duplication mentioned by the OP, but I guess there were good reasons for not including it with the join work... @kishorenc, are there any existing plans to implement this?

Yes, we have plans to implement this. It's already on our roadmap for joins.

this would be awesome!

+1 for this