microsoft / azurechat

🤖 💼 Azure Chat Solution Accelerator powered by Azure Open AI Service

Azure AI Search extension not working

surendransuri opened this issue · comments

Hi, I'm trying to set up the Azure AI Search extension to fetch content from the index, but according to the tool output it is not fetching any content from the index.

When I search for this question in Azure AI Search, I am able to see the results there.

I couldn't find where the issue is.

For the vector field in the header section of the extension setup, I have provided only the vector field from the index.

Also, the index was created in a different Azure resource group from the one this app is running in. Could that be causing the issue? If so, which roles do I need to assign for this search service?

I had the same problem and solved it with a quick fix, but I think someone should take a closer look at it.
The problem was that my search index stored the content in a field other than pageContent, so my content got lost in the FormatCitations function.

After I updated the FormatCitations function (in citation-service.ts), it works for me:

export const FormatCitations = (citation: any[]) => {
  const withoutEmbedding: DocumentSearchResponse[] = [];
  citation.forEach((d) => {
    withoutEmbedding.push({
      score: d.score,
      document: {
        metadata: d.document.metadata,
        // Fall back to other field names when the index does not use pageContent.
        pageContent: d.document.pageContent || d.document.content || d.document.chunk,
        chatThreadId: d.document.chatThreadId,
        id: "",
        user: "",
      },
    });
  });

  return withoutEmbedding;
};

pageContent: d.document.pageContent || d.document.content || d.document.chunk
This line should list every field in your index that might hold the content.

Alternatively, just make sure the content in your Azure Search index is stored in a field named pageContent.
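
The fallback chain above could also be written as a small helper, so that the list of candidate field names lives in one place. This is only a sketch and not part of the azurechat codebase: pickContent, RawDocument and CONTENT_FIELDS are hypothetical names, and the field list is simply the three names mentioned in this thread.

```typescript
// Hypothetical shape of a raw search hit; a real index may use other field names.
type RawDocument = Record<string, unknown>;

// Candidate content fields, checked in order (assumption: taken from this thread).
const CONTENT_FIELDS: string[] = ["pageContent", "content", "chunk"];

// Return the first candidate field that holds a non-empty string, or "" if none does.
const pickContent = (
  doc: RawDocument,
  fields: string[] = CONTENT_FIELDS
): string => {
  for (const field of fields) {
    const value = doc[field];
    if (typeof value === "string" && value.trim().length > 0) {
      return value;
    }
  }
  return "";
};
```

Inside FormatCitations this would be used as pageContent: pickContent(d.document). Unlike the || chain, it always yields a string, so a document with none of the listed fields ends up with an empty pageContent rather than undefined.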

I'm having the same issue trying to connect to an index I created using integrated vectorization. The wizard and the Python examples create the same index format, and when I connect the extension I get exactly the results in OP's screenshot.

I'm looking for an approach to use integrated vectorization via Indexer Skillsets to maintain indexes for this solution.

Solved - modified the index and the output mappings of my skillset

I'm also trying to use integrated vectorization. How did you modify the index?

@vlad-tsoy - I'm creating my index with these fields:

from azure.search.documents.indexes.models import SearchField, SearchFieldDataType

fields = [
    SearchField(name="id", type=SearchFieldDataType.String, key=True, filterable=True, sortable=False, facetable=False, analyzer_name="keyword"),
    #SearchField(name="user", type=SearchFieldDataType.String, sortable=False, filterable=True, facetable=False), #used for filtering
    #SearchField(name="chatThreadId", type=SearchFieldDataType.String, sortable=False, filterable=True, facetable=False), #used for filtering
    SearchField(name="pageContent", type=SearchFieldDataType.String, sortable=False, filterable=False, facetable=False),
    SearchField(name="metadata", type=SearchFieldDataType.String, sortable=False, filterable=False, facetable=False),
    SearchField(name="embedding", type=SearchFieldDataType.Collection(SearchFieldDataType.Single), vector_search_dimensions=1536, vector_search_profile_name="myHnswProfile"),
    SearchField(name="parent_id", type=SearchFieldDataType.String, sortable=True, filterable=True, facetable=True)
]

And configured my skillset output mappings to match:

mappings=[
    InputFieldMappingEntry(name="pageContent", source="/document/pages/*"),
    InputFieldMappingEntry(name="embedding", source="/document/pages/*/vector"),
    InputFieldMappingEntry(name="metadata", source="/document/metadata_storage_name")
],

Thank you, Michael!

Any tips or guides on how to set up the index for pulling unstructured data stored in Azure Blob Storage? I've tried to build an index, but the indexer doesn't actually pull any data when run with the suggested mappings.