timescale / timescaledb-docker-ha

Create Docker images containing TimescaleDB, Patroni to be used by developers and Kubernetes.

Weird UUID behavior after switching to timescaledb-ha

morinow opened this issue

We have been running a database on the regular timescaledb:latest-pg14 image. We have a table (let's call it entities) where rows are identified by the columns 'TenantId' (uuid) and 'Id' (text).

Now comes the weird part, which we do not understand.

We switched to using timescaledb-ha:pg14-ts2.11 instead, because we want to use PostGIS. We correctly mounted the old data at /home/postgres... But suddenly, we saw that some queries were not finding entities anymore. When we listed the entities filtered with WHERE TenantId = 'some_uuid', we saw the entities, but when we tried querying single ones with WHERE TenantId = 'some_uuid' AND Id = 'some_id', the query returned no rows, even though the entities clearly existed.
Interestingly, when we cast the 'TenantId' column to 'text', like so: WHERE TenantId::text = 'some_uuid', it was working again!

Now, after switching back to the old timescaledb image, it is working again. The database encoding and collations were exactly the same (utf8 and en_us.utf8).

Do you have any idea what is going on here?

UPDATE:

Now seeing this strange behavior in other tables too, even after switching back to the original image.

Query plan for both image variants is the same:

+-----------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN                                                                                                                   |
+-----------------------------------------------------------------------------------------------------------------------------+
|Limit  (cost=0.28..2.49 rows=1 width=630)                                                                                    |
|  ->  Index Scan using "PK_entities" on entities c  (cost=0.28..2.49 rows=1 width=630)                                       |
|        Index Cond: (("TenantId" = '04534239-5509-48f1-983d-5b10fea922ec'::uuid) AND ("Id" = 'naEr_kQL4ECWovQM0fWlkA'::text))|
+-----------------------------------------------------------------------------------------------------------------------------+

Your problem may be caused by a combination of PostgreSQL and TimescaleDB settings for collation and data type casting. When you run a query like WHERE TenantId = 'some_uuid', PostgreSQL implicitly treats the string literal 'some_uuid' as a uuid value, which normally succeeds without issue. The trouble appears once you add a second condition such as AND Id = 'some_id', which compares the text column Id against a text literal. In your workaround, WHERE TenantId::text = 'some_uuid', you explicitly cast TenantId to text, so both sides of that comparison are text. To rule out casting as a factor, consider stating the types explicitly in your queries, for example by casting the constant 'some_uuid' to uuid. That keeps the comparison consistent and avoids surprises from differences in implicit casting or collation settings between PostgreSQL versions or configurations.

One important difference between the timescaledb and timescaledb-ha Docker images is that the timescaledb image is based on Alpine, whereas timescaledb-ha is based on Ubuntu. Alpine uses musl libc, which has limited support for collation (see docker-library/postgres#327), while Ubuntu uses glibc, so the same text values can sort differently on the two images.

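I suspect that the queries which hit the index are the ones that fail, because the index is corrupted: a B-tree index on a text column such as Id is ordered by the collation's sort order, and an index built under Alpine's musl ordering is not necessarily valid under Ubuntu's glibc ordering. That would also explain why the TenantId::text variant works, since the cast most likely keeps the planner from using the index. I would suggest that you rebuild all indexes (REINDEX) and try running the queries again.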
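
To make the suspected failure mode concrete, here is a minimal Python sketch. It is a toy model, not PostgreSQL's actual B-tree code: the "index" is a sorted list, an index scan is a binary search, and the two ordering functions (`bytewise_order`, `locale_order`) are hypothetical stand-ins for two libc collations rather than the real musl or glibc rules.

```python
# Toy model: a "B-tree index" is a list kept sorted under one collation,
# and an index scan is a binary search that trusts that ordering.

def bytewise_order(s):
    # Hypothetical stand-in for a plain code-point ordering
    # (all uppercase letters sort before lowercase ones).
    return s

def locale_order(s):
    # Hypothetical stand-in for a locale-aware ordering:
    # compare letters first, use case only as a tie-breaker.
    return (s.lower(), s)

def index_scan(index, target, sort_key):
    """Binary search that assumes `index` is sorted under `sort_key`."""
    lo, hi = 0, len(index)
    wanted = sort_key(target)
    while lo < hi:
        mid = (lo + hi) // 2
        if sort_key(index[mid]) < wanted:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(index) and index[lo] == target

def seq_scan(rows, target):
    """Linear scan; does not depend on any ordering."""
    return target in rows

ids = ["a", "B", "A", "b"]

# Index built while the data lived on the first image.
old_index = sorted(ids, key=bytewise_order)

# The same stored index, read by a server that compares with a different ordering.
missed = [i for i in ids if not index_scan(old_index, i, locale_order)]

# Rebuilding the index under the new ordering (the reindex step) repairs lookups.
new_index = sorted(ids, key=locale_order)
still_missed = [i for i in ids if not index_scan(new_index, i, locale_order)]

print("missed via stale index:", missed)
print("found via sequential scan:", [i for i in missed if seq_scan(ids, i)])
print("missed after rebuild:", still_missed)
```

In this model only some keys go missing through the stale index, every row is still visible to a scan that ignores the index, and a rebuild restores all lookups, which matches the symptoms described in this issue.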

@JamesGuthrie that makes sense. Tried it and it seems to work fine again. Thanks a lot!