Interact with a pandas DataFrame using natural language

PandasAI is a Python library that makes it easy to ask questions about your data in natural language. It uses a generative AI model to interpret natural language queries and translate them into Python code and SQL queries.

Creation of the app

Get the dataset from https://github.com/Fraud-Detection-Handbook/simulated-data-transformed.git and store it under a data folder.
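A minimal sketch of loading the dataset into a single DataFrame. It assumes the files under the data folder are pickled pandas DataFrames with a `.pkl` extension; adjust the pattern and the reader if the files use another format. The function name `load_transactions` is hypothetical.

```python
from pathlib import Path

import pandas as pd


def load_transactions(data_dir="data", pattern="*.pkl"):
    """Concatenate every pickled DataFrame found in data_dir (assumed layout)."""
    files = sorted(Path(data_dir).glob(pattern))
    if not files:
        raise FileNotFoundError(f"No files matching {pattern} in {data_dir}")
    # Read each file and stack the results into one DataFrame with a fresh index.
    frames = [pd.read_pickle(f) for f in files]
    return pd.concat(frames, ignore_index=True)
```

Calling `df = load_transactions("data")` would then give the DataFrame that is passed to PandasAI.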

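Create and activate a Python virtual environment (the activate path below is for Windows with Git Bash; on Linux or macOS use .venv/bin/activate):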

python -m venv .venv
source .venv/Scripts/activate

Install the required libraries: pip install -r requirements.txt
Create a .env file with the API key needed to access OpenAI or Anthropic
Create a Streamlit app
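One way the app could read the key from the .env file is sketched below. It assumes the python-dotenv package is available (whether requirements.txt lists it is an assumption) and falls back to the plain environment otherwise. `get_api_key` is a hypothetical helper name.

```python
import os


def get_api_key(var_name="OPENAI_API_KEY", env_path=".env"):
    """Return the API key from the environment, loading .env first if possible."""
    try:
        # python-dotenv is optional here; it copies KEY=value pairs from the
        # file into os.environ without overriding variables that are already set.
        from dotenv import load_dotenv
        load_dotenv(env_path)
    except ImportError:
        pass
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set; add it to {env_path}")
    return key
```

Failing early with a clear message avoids a less readable error later, when the LLM client is first called.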

Streamlit app structure

Queries against the pandas DataFrame go through PandasAI's SmartDataframe, which takes a config that specifies the LLM to use:

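import os

# Import paths assume pandasai 1.x/2.x; they may differ in other versions.
from pandasai import SmartDataframe
from pandasai.llm import OpenAI
from pandasai.responses.response_parser import ResponseParser

# df is the pandas DataFrame loaded from the data folder;
# query is the question typed by the user.
llm = OpenAI(api_token=os.environ["OPENAI_API_KEY"])
query_engine = SmartDataframe(
    df,
    config={
        "llm": llm,
        "response_parser": ResponseParser,  # controls how answers are returned
    },
)

answer = query_engine.chat(query)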
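Because the answer comes from code generated by the LLM, a call may fail or be given an empty question. The sketch below wraps the `chat` call with basic input checking and error handling. `ask` is a hypothetical helper, and it only assumes that the engine exposes the `chat(query)` method.

```python
def ask(query_engine, query):
    """Send a question to the engine and return (answer, error_message)."""
    query = (query or "").strip()
    if not query:
        return None, "Please enter a question."
    try:
        return query_engine.chat(query), None
    except Exception as exc:  # generated code may fail at runtime
        return None, f"Query failed: {exc}"
```

In the Streamlit app, the error message could be shown to the user instead of a stack trace.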

Execute the demo

Start the app
```
streamlit run App.py
```
In the query text box, enter one of the following queries:
"Count the number of rows"
"Get the top 10 CUSTOMER_ID with the largest fraud amount (a fraud being TX_FRAUD=1)"
"Plot the amount of fraud for the top 10 CUSTOMER_ID"
"Plot the distribution of transaction amount for fraud versus non-fraud transactions"
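To sanity-check what the LLM returns, the second query can be written directly in pandas. This is a sketch that reads "largest fraud amount" as the total per customer and assumes the transaction amount lives in a column named `TX_AMOUNT` (an assumption; only `CUSTOMER_ID` and `TX_FRAUD` appear in the queries). The sample data is made up for illustration.

```python
import pandas as pd


def top_fraud_customers(df, n=10, amount_col="TX_AMOUNT"):
    """Total fraud amount per customer, largest first (TX_FRAUD == 1 marks fraud)."""
    fraud = df[df["TX_FRAUD"] == 1]
    totals = fraud.groupby("CUSTOMER_ID")[amount_col].sum()
    return totals.nlargest(n)


# Tiny made-up sample, only to show the call.
sample = pd.DataFrame({
    "CUSTOMER_ID": [1, 1, 2, 3],
    "TX_AMOUNT": [10.0, 5.0, 30.0, 99.0],
    "TX_FRAUD": [1, 1, 1, 0],
})
print(top_fraud_customers(sample, n=2))
```

Comparing this result with the answer from the app is a quick way to spot a misinterpreted query.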

Other implementations

Using the LangChain pandas DataFrame agent
Using the LlamaIndex query engine

jbcodeforce / panda-ai-fraud-demo