Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.

Home Page: https://azure.microsoft.com/products/search

Langchain-based ask approaches not compatible with 0613 (Chat Completions)

pamelafox opened this issue

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

We recently changed the model version from 0301 to 0613, since Azure isn't allowing new 0301 deployments. Unfortunately, 0613 only supports the new Chat Completions API, not the old Completions API, and the LangChain agents all assume use of the Completions API.
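
For context, here is a minimal sketch of the difference between the two call shapes, assuming the openai Python SDK 0.x in use at the time; the endpoint, key, deployment name, and API version below are placeholders, not values from this repo.

```python
import openai

# Azure OpenAI configuration (all values below are placeholders)
openai.api_type = "azure"
openai.api_base = "https://YOUR-RESOURCE.openai.azure.com"
openai.api_version = "2023-05-15"  # illustrative API version
openai.api_key = "YOUR-KEY"

# Old Completions API: accepted by gpt-35-turbo 0301 deployments
completion = openai.Completion.create(
    engine="your-gpt35-deployment",  # placeholder deployment name
    prompt="What does the benefits plan cover?",
    max_tokens=256,
)

# Chat Completions API: the only option for gpt-35-turbo 0613 deployments
chat_completion = openai.ChatCompletion.create(
    engine="your-gpt35-deployment",  # placeholder deployment name
    messages=[
        {"role": "system", "content": "Answer using the retrieved sources."},
        {"role": "user", "content": "What does the benefits plan cover?"},
    ],
    max_tokens=256,
)
```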

I have a branch that attempts to update the LangChain code to use Chat Completions, but I am still QAing it.
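
The branch itself isn't shown here, but the core change is swapping LangChain's Completions-style LLM wrapper for its chat-model wrapper. A hedged sketch, assuming langchain 0.0.x with the Azure connection settings already configured on the openai module (as above) and placeholder deployment names:

```python
# Before: LLM wrapper that goes through the Completions API (rejected by 0613)
from langchain.llms import AzureOpenAI

llm = AzureOpenAI(
    deployment_name="your-completions-deployment",  # placeholder
    temperature=0.3,
)

# After: chat-model wrapper that goes through the Chat Completions API,
# which gpt-35-turbo 0613 requires
from langchain.chat_models import AzureChatOpenAI

llm = AzureChatOpenAI(
    deployment_name="your-chat-deployment",  # placeholder
    temperature=0.3,
)
```

The agent prompts written for completion-style models may also need tuning once the wrapper changes, which is likely part of what still needs QA.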

Hi @pamelafox: Will be waiting for an update on this. We are trying to deploy it in a corporate scenario. I managed to build an ARM template from the Bicep template you have shared and removed the role permissions required. I am using two models to deploy this: 1. gpt-35-turbo with version 0301 and 2. gpt-35-turbo-16k with version 0613. However, the template fails to deploy with an error saying 'standard' is not part of the 0613 version, and if I remove the standard scale type and deploy, it fails validation.

To clarify:

The sample includes 4 different RAG (Retrieval-Augmented Generation) approaches: ChatReadRetrieveRead, ReadDecomposeAsk, ReadRetrieveRead, and RetrieveThenRead. The two default approaches are ChatReadRetrieveRead and RetrieveThenRead, and they both work well with the Chat Completions API. The other two approaches use Langchain, and their current code only works with the older Completions API (0301). Those approaches can be deleted from the code/UI, and the app would still work.
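
For illustration, deleting them amounts to dropping the Langchain-backed entries from the backend's approach registry (and the matching options in the frontend). The sketch below is hypothetical and simplified; the class names echo the approach names above, but the registry keys, constructors, and layout are placeholders rather than the repo's exact code.

```python
# Hypothetical, simplified sketch; not the repo's exact identifiers.

class RetrieveThenReadApproach:
    """Ask approach that uses the Chat Completions API directly."""
    def run(self, question: str) -> str:
        return f"(retrieve-then-read answer for: {question!r})"

class ChatReadRetrieveReadApproach:
    """Chat approach that uses the Chat Completions API directly."""
    def run(self, history: list[dict]) -> str:
        return f"(chat answer for: {history[-1]['content']!r})"

# Keep only the Chat Completions-compatible approaches. The Langchain-based
# ReadRetrieveRead and ReadDecomposeAsk entries are simply omitted here,
# and their options would be removed from the UI as well.
ask_approaches = {"rtr": RetrieveThenReadApproach()}
chat_approaches = {"rrr": ChatReadRetrieveReadApproach()}

def ask(approach_key: str, question: str) -> str:
    approach = ask_approaches.get(approach_key)
    if approach is None:
        raise ValueError(f"Unknown ask approach: {approach_key}")
    return approach.run(question)

if __name__ == "__main__":
    print(ask("rtr", "What is included in the benefits plan?"))
```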

Is the problem that you definitely need to use those other two approaches for your particular use case, or is the problem that you can't deploy 0613? Do you have the latest main.bicep and cognitiveservices.bicep? The method of specifying capacity changed a few months ago.

Hi Pamela, thank you for getting back so quickly. Our org currently doesn't allow the Bicep tooling download (.exe file download), so I had to convert the Bicep file into an ARM template, remove the permissions and roles, and then deploy. I am attaching the ARM template. Regarding your note on the 4 RAG approaches: I might just need the first two. How do I modify the code, i.e. which parts of the app's Python backend do I need to change? Where can I get more information about this, since the readme doesn't cover it? Lastly, the 'Deploy to Azure' functionality was super useful on other samples; can we expect something like that for this one?
OpenAI_template_RAG_ARM_with dummy values.txt

And on your comment about capacity: even though I use the template with the capacity set to 30, I still get an error while deploying saying that 120 tokens are necessary. Do I need to configure any additional settings?

There is currently an issue (Azure/bicep-types-az#1660) where we can't deploy a capacity greater than what's remaining in our account, even if the new deployments will replace what's already in the account. So what I do in that case is go into Azure OpenAI Studio, edit each deployment so that it has 1 TPM, and then try azd up again.


Hi @aparnasharmav, could you please provide more details on this: "removed role permissions required"? I suppose this was done because of the requirement below?

Your Azure Account must have Microsoft.Authorization/roleAssignments/write permissions, such as User Access Administrator or Owner.

I'd also like to get rid of this requirement if I can.

Thanks.

No longer an issue, as those approaches have been removed.