pgvector missing in Docker image ghcr.io/postgresml/postgresml:2.8.2
remote4me opened this issue · comments
I am trying to use the docker image. My environment: Ubuntu 22.04 with GPU, docker
I got these errors:
ERROR: access method "ivfflat" does not exist (when creating index)
ERROR: type "vector" does not exist (when using ::vector in a SELECT statement)
What I did:
docker run --rm -it \
-v postgresml_data:/var/lib/postgresml \
-v postgresml_postgresdata:/var/lib/postgresql \
--gpus all \
-p 5499:5432 -p 8000:8000 \
ghcr.io/postgresml/postgresml:2.8.2 \
sudo -u postgresml psql -d postgresml
- Connected with SQL client to port 5499
- I want to reproduce steps described in "Vector database", see https://github.com/postgresml/postgresml/?tab=readme-ov-file#vector-database
- SELECT pgml.load_dataset('tweet_eval', 'sentiment');
- Created table with embeddings:
CREATE TABLE tweet_embeddings AS
SELECT text, pgml.embed('distilbert-base-uncased', text) AS embedding
FROM pgml.tweet_eval;
- Creating index fails:
CREATE INDEX ON tweet_embeddings USING ivfflat (embedding vector_cosine_ops);
--
ERROR: access method "ivfflat" does not exist
1 statement failed.
- Using ::vector fails:
WITH query AS (
SELECT pgml.embed('distilbert-base-uncased', 'Star Wars christmas special is on Disney')::vector AS embedding
)
SELECT * FROM items, query ORDER BY items.embedding <-> query.embedding LIMIT 5;
--
ERROR: type "vector" does not exist
Position: 113
SELECT pgml.embed('distilbert-base-uncased', 'Star Wars christmas special is on Disney')::vector AS embedding
^
1 statement failed.
- Some additional info:
SELECT extname, extversion FROM pg_extension;
extname | extversion
---------+------------
plpgsql | 1.0
pgml | 2.8.2
- More details:
SELECT pgml.version();
version
2.8.2 (dd7c74909bdf10cd5d39faf4429df8ba9748fd30)
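The pg_extension listing above only shows extensions that are enabled in the current database. A quick way to tell "not shipped in the image" apart from "shipped but not enabled" is to look at the standard pg_available_extensions view. This is a minimal sketch; it assumes pgvector registers itself under the extension name vector, which is its usual name.

```sql
-- Lists pgvector if its files are present on the server.
-- installed_version is NULL when the extension is available
-- but has not been enabled in this database.
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name = 'vector';
```

If this returns no rows, the extension files are missing from the image. If it returns a row with an empty installed_version, the extension only needs to be enabled with CREATE EXTENSION.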
The documentation (see https://postgresml.org/docs/product/vector-database) says literally this:
If you're using our Cloud or our Docker image, your database has pgvector installed already.
Well... I am using your latest Docker image, and...
CREATE EXTENSION vector;
This, however, leads to:
postgresml=# \d+ tweet_embeddings
Table "public.tweet_embeddings"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+--------+-----------+----------+---------+----------+-------------+--------------+-------------
text | text | | | | extended | | |
embedding | real[] | | | | extended | | |
Access method: heap
postgresml=# select * from tweet_embeddings;
postgresml=# CREATE INDEX ON tweet_embeddings USING ivfflat (embedding vector_cosine_ops);
ERROR: operator class "vector_cosine_ops" does not accept data type real[]
You'll need to alter the column type to a vector to use pgvector indexes.
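A minimal sketch of that conversion, under stated assumptions: the dimension 768 is assumed to be the embedding size of distilbert-base-uncased (the first query checks it against the actual data), and pgvector's cast from real[] to vector is used for the conversion. An ivfflat index needs a column with a fixed dimension, which is why the type is written as vector(768) rather than plain vector.

```sql
-- Enable pgvector in this database (does nothing if already enabled).
CREATE EXTENSION IF NOT EXISTS vector;

-- Check the actual embedding length before picking a dimension.
SELECT array_length(embedding, 1) AS dims
FROM tweet_embeddings
LIMIT 1;

-- Convert the real[] column to a fixed-dimension vector column.
-- 768 is an assumption; replace it with the value returned above.
ALTER TABLE tweet_embeddings
    ALTER COLUMN embedding TYPE vector(768)
    USING embedding::vector(768);

-- The index statement from the report, now on a vector column.
CREATE INDEX ON tweet_embeddings USING ivfflat (embedding vector_cosine_ops);
```

Alternatively, the cast could be applied when the table is created (pgml.embed(...)::vector(768) in the CREATE TABLE ... AS SELECT), which avoids rewriting the table afterwards.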