Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.

Home Page: https://azure.microsoft.com/products/search

Langchain-based ask approaches not compatible with 0613 (Chat Completions)

pamelafox opened this issue

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

We recently changed the model version from 0301 to 0613, since Azure isn't allowing new 0301 deployments. Unfortunately, 0613 only supports the new Chat Completions API, not the old Completions API, and the LangChain agents all assume use of the Completions API.
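
For context, here is a minimal sketch of the difference between the two call shapes, assuming the openai Python SDK 0.x in use at the time; the endpoint, key, deployment name, and API version below are placeholders, not values from this repo.

```python
import openai

# Azure OpenAI configuration (all values below are placeholders)
openai.api_type = "azure"
openai.api_base = "https://YOUR-RESOURCE.openai.azure.com"
openai.api_version = "2023-05-15"  # illustrative API version
openai.api_key = "YOUR-KEY"

# Old Completions API: accepted by gpt-35-turbo 0301 deployments
completion = openai.Completion.create(
    engine="your-gpt35-deployment",  # placeholder deployment name
    prompt="What does the benefits plan cover?",
    max_tokens=256,
)

# Chat Completions API: the only option for gpt-35-turbo 0613 deployments
chat_completion = openai.ChatCompletion.create(
    engine="your-gpt35-deployment",  # placeholder deployment name
    messages=[
        {"role": "system", "content": "Answer using the retrieved sources."},
        {"role": "user", "content": "What does the benefits plan cover?"},
    ],
    max_tokens=256,
)
```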

I have a branch that attempts to update the LangChain code to use Chat Completions, but I am still QAing it.
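
The branch itself isn't shown here, but the core change is swapping LangChain's Completions-style LLM wrapper for its chat-model wrapper. A hedged sketch, assuming langchain 0.0.x with the Azure connection settings already configured on the openai module (as above) and placeholder deployment names:

```python
# Before: LLM wrapper that goes through the Completions API (rejected by 0613)
from langchain.llms import AzureOpenAI

llm = AzureOpenAI(
    deployment_name="your-completions-deployment",  # placeholder
    temperature=0.3,
)

# After: chat-model wrapper that goes through the Chat Completions API,
# which gpt-35-turbo 0613 requires
from langchain.chat_models import AzureChatOpenAI

llm = AzureChatOpenAI(
    deployment_name="your-chat-deployment",  # placeholder
    temperature=0.3,
)
```

The agent prompts written for completion-style models may also need tuning once the wrapper changes, which is likely part of what still needs QA.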

Hi @pamelafox: Will be waiting for an update on this. We are trying to deploy it in a corporate scenario. I managed to build an ARM template from the Bicep template you have shared and removed the role permissions required. I am using two models to deploy this: 1. gpt-35-turbo with version 0301 and 2. gpt-35-turbo-16k with version 0613. However, the template fails to deploy with an error saying 'standard' is not part of the 0613 version, and if I remove the standard scale type and deploy, it fails validation.

To clarify:

The sample includes 4 different RAG (Retrieval-Augmented Generation) approaches: ChatReadRetrieveRead, ReadDecomposeAsk, ReadRetrieveRead, and RetrieveThenRead. The two default approaches are ChatReadRetrieveRead and RetrieveThenRead, and they both work well with the Chat Completions API. The other two approaches use Langchain, and their current code only works with the older Completions API (0301). Those approaches can be deleted from the code/UI, and the app would still work.
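
For illustration, deleting them amounts to dropping the Langchain-backed entries from the backend's approach registry (and the matching options in the frontend). The sketch below is hypothetical and simplified; the class names echo the approach names above, but the registry keys, constructors, and layout are placeholders rather than the repo's exact code.

```python
# Hypothetical, simplified sketch; not the repo's exact identifiers.

class RetrieveThenReadApproach:
    """Ask approach that uses the Chat Completions API directly."""
    def run(self, question: str) -> str:
        return f"(retrieve-then-read answer for: {question!r})"

class ChatReadRetrieveReadApproach:
    """Chat approach that uses the Chat Completions API directly."""
    def run(self, history: list[dict]) -> str:
        return f"(chat answer for: {history[-1]['content']!r})"

# Keep only the Chat Completions-compatible approaches. The Langchain-based
# ReadRetrieveRead and ReadDecomposeAsk entries are simply omitted here,
# and their options would be removed from the UI as well.
ask_approaches = {"rtr": RetrieveThenReadApproach()}
chat_approaches = {"rrr": ChatReadRetrieveReadApproach()}

def ask(approach_key: str, question: str) -> str:
    approach = ask_approaches.get(approach_key)
    if approach is None:
        raise ValueError(f"Unknown ask approach: {approach_key}")
    return approach.run(question)

if __name__ == "__main__":
    print(ask("rtr", "What is included in the benefits plan?"))
```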

Is the problem that you definitely need to use those other two approaches for your particular use case, or is the problem that you can't deploy 0613? Do you have the latest main.bicep and cognitiveservices.bicep? The method of specifying capacity changed a few months ago.

Hi Pamela, thank you for getting back so quickly. Our org currently doesn't allow the Bicep tooling download (.exe file download), so I had to convert the Bicep file into an ARM template, remove the permissions and roles, and then deploy. I am attaching the ARM template. Regarding your note on the 4 RAG approaches: I might just need the first two. How do I modify the code, i.e. which parts of the app's Python backend do I need to change? Where can I get more information about this, since the readme doesn't cover it? Lastly, the 'Deploy to Azure' functionality was super useful on other samples; can we expect something like that for this one?
OpenAI_template_RAG_ARM_with dummy values.txt

And on your comment about capacity: even though I use the template with the capacity set to 30, I still get an error while deploying saying that 120 tokens are necessary. Do I need to configure any additional settings?

There is currently an issue (Azure/bicep-types-az#1660) where we can't deploy a capacity greater than what's remaining in our account, even if the new deployments will replace what's already in the account. So what I do in that case is go into Azure OpenAI Studio, edit each deployment so that it has 1 TPM, and then try azd up again.


Hi @aparnasharmav, could you please provide more details on this: "removed role permissions required"? I suppose this was done because of the requirement below?

Your Azure Account must have Microsoft.Authorization/roleAssignments/write permissions, such as User Access Administrator or Owner.

I'd also like to get rid of this requirement if I can.

Thanks.

No longer an issue, as those approaches have been removed.