datahub-project / datahub

The Metadata Platform for your Data Stack

Home Page: https://datahubproject.io

Not all Tags show up in the Global Search and Filter Page

gky249 opened this issue

Describe the bug
Version: 0.13.0
Not all tags are listed when you filter on the Tags object type in the global search and filter page.
The list is missing many of the platform tags and custom tags that we have created, even though they are assigned to datasets. For example, urn:li:tag:Teradata is assigned to many of our Teradata datasets and is present in the backend metadata storage under the globalTags aspect, yet it does not show up in the list of tags on the search and filter page. Note: the list shown in the UI matches what is present in the backend metadata storage under urn:li:tag:*. The question is: why aren't all tags that are assigned to datasets also stored in the backend metadata under urn:li:tag:*?
The problem this creates: when we try to create views or policies based on a tag filter, the missing tags do not show up. Only the tags that are present in the backend with a URN like urn:li:tag:% show up.

To Reproduce
Steps to reproduce the behavior:

  1. Go to the global search and filter page.
  2. Filter on the Tags object type. The list will show X tags.
  3. Check whether all the tags present on your datasets show up in the filter list from step 2.
  4. Observe that many tags that are assigned to datasets are missing from the Tags filter on the global search page.

Expected behavior
All tags present in the DataHub instance, across all datasets, should show up on the search and filter page.

Screenshots
[4 screenshots]

Desktop (please complete the following information):

  • OS: Windows 11
  • Browser: Chrome
  • Version: 109.0.5414.75 (Official Build) (64-bit)

Additional context
These missing tags cause problems when creating tag-based policies or tag-based views, because they do not appear in the global filter tag list.
@treff7es @chriscollins3456 @hsheth2

@gky249 how did you create these tags? Both ingestion and the UI should create tags appropriately.

None of the tags were created manually from the UI. They all come from the transformers section of our ingestion recipes, like this:

transformers:
  - type: "simple_add_dataset_tags"
    config:
      semantics: PATCH
      tag_urns:
        - "urn:li:tag:Teradata"
        - "urn:li:tag:Finland"

@hsheth2
There is no difference on our side in how these tags are set up that would explain why some are visible and others are not. We can see the tags that are missing from the UI in the metadata_aspect_v2 table under the globalTags aspect for each dataset, so we know they exist. The UI just does not show them on the global search and filter page.
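To make the mismatch described above concrete, here is a minimal sketch of the comparison, assuming rows have already been exported from metadata_aspect_v2 as (urn, aspect, metadata) tuples. The function name `missing_tag_entities` and the sample rows are hypothetical; the sketch only looks at dataset-level globalTags and ignores column-level tags.

```python
import json
from typing import Iterable, Set, Tuple

# (urn, aspect, metadata JSON) as exported from metadata_aspect_v2.
# The sample rows below are illustrative, not real data from the issue.
Row = Tuple[str, str, str]


def missing_tag_entities(rows: Iterable[Row]) -> Set[str]:
    """Return tag URNs referenced in globalTags that have no urn:li:tag:* row."""
    referenced: Set[str] = set()
    existing: Set[str] = set()
    for urn, aspect, metadata in rows:
        if urn.startswith("urn:li:tag:"):
            # Any aspect row keyed by a tag URN means the tag entity exists.
            existing.add(urn)
        elif aspect == "globalTags":
            # globalTags payload shape: {"tags": [{"tag": "urn:li:tag:..."}]}
            for association in json.loads(metadata).get("tags", []):
                referenced.add(association["tag"])
    return referenced - existing


rows = [
    (
        "urn:li:dataset:(urn:li:dataPlatform:teradata,db.t1,PROD)",
        "globalTags",
        '{"tags": [{"tag": "urn:li:tag:Teradata"}, {"tag": "urn:li:tag:Finland"}]}',
    ),
    ("urn:li:tag:Finland", "tagKey", '{"name": "Finland"}'),
]

print(sorted(missing_tag_entities(rows)))  # ['urn:li:tag:Teradata']
```

Any URN this returns is a tag that is attached to datasets but has no entity row of its own, which is the situation reported here for urn:li:tag:Teradata.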

It seems that the addTags transformer is not minting the tags. @hsheth2 let's confirm that our transformers mint tags appropriately when adding them to assets. If we already do this, maybe this is an outdated version?

It seems like this could be a bug given that we've investigated and confirmed that the transformer should be minting these tags on behalf of the user.
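For readers unfamiliar with what "minting" means here: it amounts to emitting an aspect for the tag entity itself, so that a urn:li:tag:* row exists alongside the globalTags references. The following is a rough sketch of doing that by hand with the DataHub Python SDK, not a fix confirmed in this thread. The helper names `tag_name_from_urn` and `mint_tags` are hypothetical, and the GMS address is whatever your deployment uses.

```python
from typing import Iterable, List

TAG_URN_PREFIX = "urn:li:tag:"


def tag_name_from_urn(tag_urn: str) -> str:
    """Extract the tag name, e.g. 'Teradata' from 'urn:li:tag:Teradata'."""
    if not tag_urn.startswith(TAG_URN_PREFIX):
        raise ValueError(f"not a tag URN: {tag_urn}")
    return tag_urn[len(TAG_URN_PREFIX):]


def mint_tags(tag_urns: Iterable[str], gms_server: str) -> List[str]:
    """Emit a tagProperties aspect for each URN so the tag entity exists."""
    # Imported lazily so tag_name_from_urn stays usable without the SDK.
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import TagPropertiesClass

    emitter = DatahubRestEmitter(gms_server=gms_server)
    minted: List[str] = []
    for urn in tag_urns:
        proposal = MetadataChangeProposalWrapper(
            entityUrn=urn,
            aspect=TagPropertiesClass(name=tag_name_from_urn(urn)),
        )
        emitter.emit(proposal)
        minted.append(urn)
    return minted
```

A call such as mint_tags(["urn:li:tag:Teradata", "urn:li:tag:Finland"], gms_server="http://localhost:8080") would then create the two tag entities from the recipe above; the localhost URL is an assumption for a local deployment.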

@gky249 Can you confirm the CLI version as well (not just server) to ensure you're on v0.13.0?

datahub --version

On the latest version I am failing to reproduce.

In the screenshots below you can see two terms, one of which inherits from the other. Both have related entities that are related at the column level; you can tell by "Matches Column X".

[3 screenshots: 2024-04-17 at 10:30:29 AM, 10:30:23 AM, 10:30:19 AM]

@jjoyce0510 your last comment seems to be about Terms, not Tags?
And yes, we are on CLI 0.13 and DataHub release 0.13 as well. We only recently noticed this issue when trying to create tag-based policies and views, but these tags have been in our system since v0.9.6.1.

@jjoyce0510 any updates?