Support Log Insights for Google Cloud AlloyDB for Postgres
premist opened this issue · comments
Hello,
I'm evaluating Google Cloud AlloyDB for PostgreSQL, which is a fully PostgreSQL-compatible database, similar to Amazon Aurora.
The basic installation of the collector works fine; however, the collector is not picking up logs from Cloud Pub/Sub, probably because the resource.type is different from the one used by Cloud SQL for PostgreSQL.
Here is an example log entry payload.
{
"textPayload": "2022-05-20 05:39:01.827 UTC [2081]: [135-1] db=,user= LOG: [g_vacuum.c:851] Autovacuum worker memory: 65536(kb)",
"insertId": "REDACTED",
"resource": {
"type": "alloydb.googleapis.com/Instance",
"labels": {
"location": "us-central1",
"cluster_id": "REDACTED",
"resource_container": "projects/REDACTED",
"instance_id": "REDACTED"
}
},
"timestamp": "2022-05-20T05:39:01.827876Z",
"severity": "INFO",
"labels": {
"NODE_ID": "nfq2",
"CONSUMER_PROJECT": "REDACTED"
},
"logName": "projects/REDACTED/logs/alloydb.googleapis.com%2Fpostgres.log",
"receiveTimestamp": "2022-05-20T05:39:02.646346720Z"
}
I read through the codebase a bit, and I think input/system/google_cloudsql/logs.go can be modified to support AlloyDB.
@premist Thanks for reaching out!
Yes, I think you are correct that this should be easy to support. Could you confirm which log_line_prefix setting you have active on the AlloyDB instance? (The [g_vacuum.c:851] Autovacuum worker memory: 65536(kb) part looks a bit non-standard, but that might just be extra debug output they provide.)
@lfittl I can't find a way to set log_line_prefix. Maybe it's Google's own autovacuum implementation that works on their distributed storage layer?
Attached screenshots for illustration purposes.
I made a small tweak and am trying to build and run the collector to see if the basic log transport functionality works.
> @lfittl I can't find a way to set log_line_prefix.

Could you try running SHOW log_line_prefix on a Postgres connection to the database?
Ah, that works! Here's the output:
%m [%p]: [%l-1] db=%d,user=%u
Excellent, thanks!
The good news is that this is a supported log line prefix for the collector, so you should be able to get data flowing into pganalyze.
However, from your screenshot, it appears that Google's team has modified the Postgres log output logic a bit, since they are prefixing log messages with the source code file (e.g. [analyze.c:830] for the autovacuum log in the above example).

This will cause a problem with our log handling, since the regular expressions we use for matching log lines won't match. We can customize this (probably in the GCP log handler), but it'll require a small patch. Essentially I'm thinking we just strip the [filename:line] portion from the GCP log line before passing it to our parsing logic.
That's great news.
I managed to get AlloyDB log entries picked up by the collector; here is the latest diff:
premist@1387f74
Basically, here are the log attributes relevant to AlloyDB entries:

- resource.labels.resource_container, which contains the project ID in a projects/project_id format
  - Alternatively, labels.CONSUMER_PROJECT also seems to contain the project ID, without the projects/ prefix.
- resource.labels.cluster_id and resource.labels.instance_id, which group all servers (HA standby and read pool instances)
  - Both have the same value, at least for now.
- labels.NODE_ID, which I think is the ID of the server Google is using internally to execute the query
With the modification above, I can see logs appear in the pganalyze output. The remaining parsing issue should be solved once the GCP log handler is modified to expect the new message format.