jbcodeforce / panda-ai-fraud-demo

Interact with a pandas DataFrame using natural language

PandasAI is a Python library that makes it easy to ask questions about your data in natural language. It uses a generative AI model to interpret natural-language queries and translate them into Python code and SQL queries.

Creation of the app

  1. Clone the dataset from https://github.com/Fraud-Detection-Handbook/simulated-data-transformed.git into a data folder.

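  2. Create and activate a Python virtual environment:

    python -m venv .venv
    # Git Bash on Windows; on Linux or macOS use: source .venv/bin/activate
    source .venv/Scripts/activate

  3. Install the required libraries: pip install -r requirements.txt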

  4. Create a .env file containing the API key needed to access OpenAI or Anthropic (the app reads OPENAI_API_KEY from the environment)

  5. Create the Streamlit app (App.py)
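For step 4, the key has to end up in the process environment before the app reads it. The following is a minimal sketch using only the standard library; `load_env_file` is a hypothetical helper written for illustration and is not part of this repository.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy KEY=VALUE pairs from a .env file into os.environ (hypothetical helper)."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    loaded = {}
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        # Variables already set in the shell take precedence over the file.
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded
```

Calling `load_env_file()` at the top of the app would make `os.environ["OPENAI_API_KEY"]` available, provided the .env file defines it. In practice the python-dotenv package and its `load_dotenv()` function are a common alternative to a hand-written parser.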

Streamlit app structure

  • Queries against the pandas DataFrame go through PandasAI's SmartDataframe, which takes a config that tells it which LLM to use:

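    # Import paths as in pandasai 1.x; adjust them for other versions
    import os
    from pandasai import SmartDataframe
    from pandasai.llm import OpenAI
    from pandasai.responses.response_parser import ResponseParser

    llm = OpenAI(api_token=os.environ["OPENAI_API_KEY"])
    # df is the pandas DataFrame holding the transactions
    query_engine = SmartDataframe(
        df,
        config={
            "llm": llm,
            "response_parser": ResponseParser,
        },
    )

    answer = query_engine.chat(query)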
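The snippet above expects a DataFrame `df` to exist already. One way to build it is sketched below, assuming the dataset cloned into the data folder consists of pandas pickle files (`*.pkl`); the function name `load_transactions` and the file pattern are assumptions made for illustration, not code from this repository.

```python
from pathlib import Path

import pandas as pd


def load_transactions(data_dir="data", pattern="*.pkl"):
    """Concatenate every matching pickle file under data_dir into one DataFrame."""
    # rglob searches subfolders too, in case the clone created a nested directory.
    files = sorted(Path(data_dir).rglob(pattern))
    if not files:
        raise FileNotFoundError(f"no {pattern} files found under {data_dir}")
    frames = [pd.read_pickle(f) for f in files]
    # ignore_index gives the combined frame a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)
```

With this helper, `df = load_transactions("data")` would produce the frame passed to SmartDataframe. Loading a subset of the files first keeps the demo responsive while experimenting.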

Execute the demo

  • Start the app

    streamlit run App.py
    
  • In the query text box, enter a question, for example: "count the number of rows"

  • "Get the top 10 CUSTOMER_ID with the largest fraud amount (a fraud being TX_FRAUD=1)"

  • "Plot the amount of fraud for the top 10 CUSTOMER_ID"

  • "Plot the distribution of transaction amount for fraud versus non-fraud transactions"
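Because the LLM generates the code behind each answer, it can help to cross-check a result against plain pandas. The sketch below computes the row count and the "top 10 CUSTOMER_ID by fraud amount" query directly. The CUSTOMER_ID and TX_FRAUD columns come from the queries above, while the amount column name `TX_AMOUNT`, the helper `top_fraud_customers`, and the sample rows are assumptions made up for illustration.

```python
import pandas as pd


def top_fraud_customers(df, n=10, amount_col="TX_AMOUNT"):
    """Return the n customers with the largest total fraud amount."""
    # Keep only the transactions flagged as fraud (TX_FRAUD == 1).
    fraud = df[df["TX_FRAUD"] == 1]
    # Sum the fraud amounts per customer and keep the n largest totals.
    return fraud.groupby("CUSTOMER_ID")[amount_col].sum().nlargest(n)


# Tiny made-up sample, only to show the shape of the result.
sample = pd.DataFrame(
    {
        "CUSTOMER_ID": [1, 1, 2, 3, 3],
        "TX_AMOUNT": [10.0, 20.0, 5.0, 50.0, 7.0],
        "TX_FRAUD": [1, 0, 1, 1, 1],
    }
)
print(len(sample))  # equivalent of "count the number of rows"
print(top_fraud_customers(sample, n=2))
```

Running the same function on the full DataFrame gives a reference value to compare with the answer returned by `query_engine.chat(...)`.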

Other implementations

  • LangChain pandas DataFrame agent
  • LlamaIndex query engine
