tzolov / ai-observability-demo


Read Me First

RAG Example

Chat client call with Question/Answering (aka RAG) and ChatMemory advisor configurations:

var response = chatClient.prompt()
    .user("How does Carina work?")
    .advisors(new QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()))
    .advisors(new PromptChatMemoryAdvisor(chatMemory))
    .call()
    .chatResponse();

The traces look like:

ChatClient with RAG and ChatMemory

On call, the chat client invokes:

  • the Before question-answer-advisor. Internally it uses the pg-vector-store to retrieve similar documents and the open-ai-embedding-model to encode the input user question; the embedding model in turn uses the OpenAiApi REST client.
  • the Before prompt-chat-memory-advisor to retrieve the chat history and store the user message.
  • then the chat client uses the openai-chat-model (gpt-4o) to perform the chat-completion request, delegating to the inner REST client (OpenAiApi).
  • finally it visits the After QA and memory advisors and returns the response.
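The flow above assumes `chatClient` and `chatMemory` beans are available. A minimal wiring sketch (the bean setup below is an assumption for illustration, not the demo's actual code; `VectorStore` and the OpenAI chat model are auto-configured by Spring AI from application properties):

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.InMemoryChatMemory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class ChatConfig {

    // Simple in-process message history used by the PromptChatMemoryAdvisor.
    @Bean
    ChatMemory chatMemory() {
        return new InMemoryChatMemory();
    }

    // The auto-configured ChatClient.Builder already carries the OpenAI chat model.
    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder.build();
    }
}
```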

Function Calling Examples

Call the function paymentStatus for 3 different transactions, but enforce non-parallel mode, so for each transaction the LLM has to return a separate tool-call message.

    String response = chatClient.prompt()
        .options(OpenAiChatOptions.builder()
            .withParallelToolCalls(false).build())
        .user("What is the status of my payment transactions 001, 002 and 003?")
        .functions("paymentStatus")
        .call()
        .content();
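The snippet assumes a function named `paymentStatus` is registered. A plain-Java sketch of what such a function could look like (the record names and the hard-coded status lookup are assumptions; in the demo it would be registered as a Spring bean named `paymentStatus`):

```java
import java.util.Map;
import java.util.function.Function;

public class PaymentTools {

    // Request/response records the model's tool-call JSON maps onto.
    public record Transaction(String id) {}
    public record Status(String status) {}

    // Hypothetical in-memory data standing in for a real payment system.
    static final Map<String, Status> DATA = Map.of(
            "001", new Status("pending"),
            "002", new Status("approved"),
            "003", new Status("rejected"));

    // In a Spring AI app this Function would be exposed as a @Bean("paymentStatus").
    public static final Function<Transaction, Status> paymentStatus =
            tx -> DATA.getOrDefault(tx.id(), new Status("unknown"));
}
```

With non-parallel mode enforced, the model asks for each of the three transactions in a separate tool-call round trip instead of batching them into one response.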

This produces traces like:

Function Calling non-parallel

The diagram shows that 3 consecutive tool calls are performed before the final result.

If parallel tool calling is enabled, the diagram looks like this:

Function Calling parallel
