pgai simplifies the process of building search and Retrieval Augmented Generation (RAG) AI applications with PostgreSQL.
pgai brings embedding and generation AI models closer to the database. With pgai, you can now do the following directly from within PostgreSQL in a SQL query:
- Create vector embeddings for your data.
- Retrieve LLM chat completions from models like Claude 3.5 Sonnet, OpenAI GPT-4o, Cohere Command, and Llama 3 (via Ollama).
- Reason over your data and facilitate use cases like classification, summarization, and data enrichment on your existing relational data in PostgreSQL.
Here's how to get started with pgai:
- Everyone: Use pgai in your PostgreSQL database.
  - Install pgai.
  - Use pgai to integrate AI from your provider:
    - Ollama - configure pgai for Ollama, then use the model to embed, chat complete and generate.
    - OpenAI - configure pgai for OpenAI, then use the model to tokenize, embed, chat complete and moderate. This page also includes advanced examples.
    - Anthropic - configure pgai for Anthropic, then use the model to generate content.
    - Cohere - configure pgai for Cohere, then use the model to tokenize, embed, chat complete, classify, and rerank.
- Extension contributor: Contribute to pgai and improve the project.
  - Develop and test changes to the pgai extension.
  - See the Issues tab for a list of feature ideas to contribute.
Learn more about pgai: to find out more about the extension and why we built it, read the blog post "pgai: Giving PostgreSQL Developers AI Engineering Superpowers".
The fastest way to run PostgreSQL with the pgai extension is to create your database environment using one of the following options:
- Run the TimescaleDB Docker image, then enable the pgai extension.
- Use a new or existing Timescale Cloud service, then enable the pgai extension.
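As a rough sketch of the Docker route, the helper functions below start a container and then enable the extension. The image tag `timescale/timescaledb-ha:pg16`, the container name `pgai-demo`, the password, and the function names are assumptions chosen for illustration and are not taken from this page; check the TimescaleDB Docker instructions for the image that currently ships pgai.

```shell
#!/bin/sh
# Sketch only: image tag, container name, and password are assumed values.
IMAGE="${IMAGE:-timescale/timescaledb-ha:pg16}"
CONTAINER="${CONTAINER:-pgai-demo}"
DB_PASSWORD="${DB_PASSWORD:-password}"

# Start PostgreSQL in the background and expose it on localhost:5432.
start_pgai_container() {
  docker run -d --name "$CONTAINER" -p 5432:5432 \
    -e POSTGRES_PASSWORD="$DB_PASSWORD" "$IMAGE"
}

# Enable pgai inside the running container, using the CREATE EXTENSION
# statement from this page.
enable_pgai() {
  docker exec "$CONTAINER" \
    psql -d "postgres://postgres:${DB_PASSWORD}@localhost:5432/postgres" \
    -c "CREATE EXTENSION IF NOT EXISTS ai CASCADE;"
}

# Usage (not run automatically; the server needs a few seconds to start):
#   start_pgai_container && sleep 5 && enable_pgai
```

Defining functions instead of running the commands directly lets you source the file and override `IMAGE`, `CONTAINER`, or `DB_PASSWORD` before starting anything.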
To install pgai from source on a PostgreSQL server:
- Install the prerequisite software system-wide:
  - Python3: if running `python3 --version` in Terminal returns `command not found`:
    - Standard installation: download and install the latest version of Python3.
    - PostgreSQL plugin for the asdf version manager: set the `--with-python` option when installing PostgreSQL:

          POSTGRES_EXTRA_CONFIGURE_OPTIONS=--with-python asdf install postgres 16.3

  - Pip: if running `pip --version` in Terminal returns `command not found`:
    - Standard installation: use one of the pip supported methods.
    - Virtual environment: pip is usually installed automatically when you work in a Python virtual environment. If you run PostgreSQL in a virtual environment, pgai requires several Python packages. Set the `PYTHONPATH` and `VIRTUAL_ENV` environment variables before you start your PostgreSQL server:

          PYTHONPATH=/path/to/venv/lib/python3.12/site-packages \
          VIRTUAL_ENV=/path/to/venv \
          pg_ctl -D /path/to/data -l logfile start

  - PL/Python: follow How to install Postgres 16 with plpython3u: Recipes for macOS, Ubuntu, Debian, CentOS, Docker.
    macOS: the standard PostgreSQL formula in Homebrew does not include the `plpython3` extension. These instructions show how to install from an alternate tap.
  - pgvector: follow the install instructions from the official repository.

  These extensions are automatically added to your PostgreSQL database when you enable the pgai extension.
- Build and install the `pgai` extension:

      make install
- Connect to your database with a PostgreSQL client like psql v16 or PopSQL:

      psql -d "postgres://<username>:<password>@<host>:<port>/<database-name>"
- Create the pgai extension:

      CREATE EXTENSION IF NOT EXISTS ai CASCADE;

  The `CASCADE` option automatically installs the `pgvector` and `plpython3u` extensions.
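The prerequisite checks and the final `CREATE EXTENSION` step above can be sketched as a small shell helper. The function names `require_cmd`, `check_prerequisites`, and `list_pgai_extensions` are hypothetical and introduced only for this illustration. The query reads the standard `pg_extension` catalog and assumes that pgvector registers itself under the extension name `vector`.

```shell
#!/bin/sh
# require_cmd NAME: succeed if NAME is on PATH, otherwise print a hint and fail.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1 (install it before building pgai)" >&2
    return 1
  fi
}

# check_prerequisites: check the commands that the install steps on this page use.
# Returns non-zero if any of them is missing.
check_prerequisites() {
  status=0
  for cmd in python3 pip make psql; do
    require_cmd "$cmd" || status=1
  done
  return "$status"
}

# list_pgai_extensions DB_URL: print "name version" for each extension that
# CREATE EXTENSION ... CASCADE is expected to have installed.
# Assumption: pgvector appears in pg_extension as "vector".
list_pgai_extensions() {
  psql -d "$1" -At -c \
    "SELECT extname || ' ' || extversion
       FROM pg_extension
      WHERE extname IN ('ai', 'vector', 'plpython3u')
      ORDER BY extname;"
}
```

For example, `check_prerequisites && list_pgai_extensions "postgres://<username>:<password>@<host>:<port>/<database-name>"` reports any missing tools first and then prints one line per installed extension.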
Now, use pgai to integrate AI from Ollama and OpenAI. Learn how to moderate and embed content directly in the database using triggers and background jobs.
pgai is still at an early stage. Now is a great time to help shape the direction of this project; we are currently deciding priorities. Have a look at the list of features we're thinking of working on. Feel free to comment, expand the list, or hop on the Discussions forum.
To get started, take a look at how to contribute and how to set up a dev/test environment.
Timescale is a PostgreSQL database company. To learn more, visit timescale.com.
Timescale Cloud is a high-performance, developer-focused cloud platform that provides PostgreSQL services for the most demanding AI, time-series, analytics, and event workloads. Timescale Cloud is ideal for production applications and provides high availability, streaming backups, upgrades over time, roles and permissions, and great security.