Expose Conn.preparedStatements to TraceQueryStart

Question

Thiht opened this issue 10 months ago · comments

Thibaut Rousseau commented 10 months ago

Is your feature request related to a problem? Please describe.
I'm currently writing a custom Prometheus tracer, and I want to expose the following:

query
duration
success/failure

I tried doing so using TraceQueryStart / TraceQueryEnd following the example provided in tracelog. I noticed that in TraceQueryStart, when trying to access data.SQL, we only get pgx_0, pgx_1, etc. (we're using the stdlib adapter)

Describe the solution you'd like
I'd like Conn.preparedStatements to be exposed to TraceQueryStart.

Either directly:

- 	preparedStatements map[string]*pgconn.StatementDescription
+	PreparedStatements map[string]*pgconn.StatementDescription

or indirectly:

func (c *Conn) GetPreparedStatement(name string) *pgconn.StatementDescription {
	return c.preparedStatements[name]
}

Describe alternatives you've considered
An alternative I have in mind is to implement TracePrepareStart and keep a map[name]sql on my side, but it's a bit annoying, and I'm not sure the name is unique across all conns when using a pool.
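A minimal sketch of that alternative, assuming statement names may repeat across pooled connections and therefore keying the map by connection as well as by name. StatementRegistry, Record, and Resolve are hypothetical names, not pgx APIs; Record would be called from TracePrepareStart with the connection, data.Name, and data.SQL, and Resolve from TraceQueryStart with data.SQL.

```go
package main

import "sync"

// stmtKey identifies a prepared statement by the connection it lives on and
// its name, since a name such as pgx_0 may exist on several pooled connections.
type stmtKey struct {
	conn any // assumption: the connection pointer handed to the tracer
	name string
}

// StatementRegistry is a hypothetical tracer-side lookup table.
// Entries are never evicted in this sketch.
type StatementRegistry struct {
	mu   sync.RWMutex
	sqls map[stmtKey]string
}

// NewStatementRegistry returns an empty registry.
func NewStatementRegistry() *StatementRegistry {
	return &StatementRegistry{sqls: make(map[stmtKey]string)}
}

// Record remembers the SQL text behind a statement name on one connection.
func (r *StatementRegistry) Record(conn any, name, sql string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.sqls[stmtKey{conn: conn, name: name}] = sql
}

// Resolve returns the SQL text for a known statement name on that connection.
// Anything unknown is returned unchanged, on the assumption that it is
// already SQL text.
func (r *StatementRegistry) Resolve(conn any, nameOrSQL string) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if sql, ok := r.sqls[stmtKey{conn: conn, name: nameOrSQL}]; ok {
		return sql
	}
	return nameOrSQL
}
```

Keying by connection sidesteps the uniqueness question at the cost of one entry per connection and statement; a real tracer would also want to drop entries when a connection closes.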

Jack Christensen · Answer 1 · Sat Oct 07 2023 23:40:08 GMT+0800 (China Standard Time)

Exposing PreparedStatements seems reasonable, but out of general principle I try to avoid expanding the interface. One potential issue is that it would expose cached statements as well as explicitly prepared statements. That might not be a bad thing, but it would need some thought.

So I'd like to at least wait on that for further input or consideration.

However, there are two other possibilities that might resolve this issue.

First, as of bbe2653 Prepare and the query functions support using the SQL text as the name of the prepared statement. pgx recognizes this usage and deterministically chooses the actual prepared statement name. The stdlib adapter wasn't using this new functionality, but I just introduced it in 0f0d236. This should mean you get the underlying SQL text.
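To illustrate what "deterministically chooses the actual prepared statement name" can mean, here is a minimal sketch that hashes the SQL text. The function name statementName, the stmt_ prefix, the choice of SHA-256, and the 24-byte truncation are assumptions made for this example and may not match what pgx does internally.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// statementName derives a stable server-side statement name from SQL text.
// The same SQL always yields the same name, so no lookup table is needed.
// Prefix, hash, and truncation length are illustrative assumptions.
func statementName(sql string) string {
	digest := sha256.Sum256([]byte(sql))
	return "stmt_" + hex.EncodeToString(digest[:24])
}
```

Because the name is a pure function of the SQL, the caller can keep passing the SQL text around, which is why the tracer then sees the SQL rather than an opaque name.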

Second, it might be preferable for TraceQueryStartData to include the prepared statement name and the underlying SQL query rather than requiring a tracer to do the lookup itself. This would require reordering a few operations in the query functions. There are a couple implications of this though.

1. The trace wouldn't be called at the beginning of the query function, instead it would be called after this lookup. But presumably it is only a few nanoseconds and wouldn't matter.
2. This might also entail moving the query rewriting functionality before the tracing. It's not clear whether the tracer should trace the SQL the calling application provided or the ultimate SQL sent to PostgreSQL.

Thibaut Rousseau · Answer 2 · Sun Oct 08 2023 19:32:36 GMT+0800 (China Standard Time)

Thanks for this detailed answer! Completely agree with not expanding the public interface if there are other solutions.

> I just introduced it in 0f0d236. This should mean you get the underlying SQL text

This seems like it indeed solves the tracing question (at least for my use case using the stdlib adapter), but doesn't it come with the same problem as pointed out in this issue #1753 (comment)?

> I don't think this is feasible. What happens if you Close one of those three statements? The other two would be broken.

Reading 0f0d236, it seems like the cache is correctly deallocated via (s *Stmt) Close(), so maybe it works as expected after all!

> 1. The trace wouldn't be called at the beginning of the query function, instead it would be called after this lookup. But presumably it is only a few nanoseconds and wouldn't matter.

> 2. This might also entail moving the query rewriting functionality before the tracing. It's not clear whether the tracer should trace the SQL the calling application provided or the ultimate SQL sent to PostgreSQL.

My opinion on 1 is that it indeed doesn't matter, as long as the tradeoff is consistent for all the queries, which would be the case here.
For 2, there might be a slight advantage with tracing the query sent to Postgres, because it means you could theoretically cross reference metrics from pgx and metrics directly from Postgres. Or even metrics from other projects not using pgx but making the same queries.

Jack Christensen · Answer 3 · Wed Oct 11 2023 11:03:40 GMT+0800 (China Standard Time)

It's a good thing you brought up the broken statement issue. Somehow I'd totally forgotten about it in only a week 🤦.

The error is silent because the SQL text is still passed to the underlying pgx connection. It just runs as a normal query instead of a prepared query. And database/sql ignores the error produced by the underlying Deallocate call when the broken statements are released. So it was a little tricky to write a test that would reveal this behavior. But it was indeed broken.

I ended up adding reference counting to the statements to resolve it. This also means that pgx should now behave as you requested in #1753.
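The idea behind reference counting here can be sketched as follows: several statement handles that share the same SQL also share one server-side prepared statement, and it is deallocated only when the last handle is closed. This is a simplified illustration, not the code from the commit; statementRefs, Acquire, and Release are hypothetical names, and the sketch is not safe for concurrent use.

```go
package main

// statementRefs counts how many open handles share each prepared statement,
// keyed by SQL text. Hypothetical illustration only, not pgx's implementation.
type statementRefs struct {
	counts     map[string]int
	deallocate func(sql string) // invoked when the last handle is released
}

// newStatementRefs returns an empty counter that calls deallocate on final release.
func newStatementRefs(deallocate func(sql string)) *statementRefs {
	return &statementRefs{counts: make(map[string]int), deallocate: deallocate}
}

// Acquire registers one more handle. It reports true only for the first
// handle, which is when the statement would need to be prepared.
func (s *statementRefs) Acquire(sql string) bool {
	s.counts[sql]++
	return s.counts[sql] == 1
}

// Release drops one handle and deallocates only when none remain.
// It reports whether deallocation happened.
func (s *statementRefs) Release(sql string) bool {
	n, ok := s.counts[sql]
	if !ok {
		return false // unknown or already fully released
	}
	if n > 1 {
		s.counts[sql] = n - 1
		return false
	}
	delete(s.counts, sql)
	s.deallocate(sql)
	return true
}
```

With three handles on the same SQL, closing one leaves the other two usable, which is the failure case raised earlier in the thread.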

As far as the actual tracing goes, I'm open to expanding the tracing information as described above, but if this resolves your issue then I'm going to let it wait until there is more demand.

Thibaut Rousseau · Answer 4 · Wed Oct 11 2023 18:16:47 GMT+0800 (China Standard Time)

perfect, thanks a lot :)

Thibaut Rousseau · Answer 5 · Thu Oct 12 2023 00:05:04 GMT+0800 (China Standard Time)

Would you mind making a new tag with these changes? I don't think I can point my dependency to master without rewriting my import to github.com/jackc/pgx instead of github.com/jackc/pgx/v5.

Jack Christensen · Answer 6 · Sun Oct 15 2023 07:02:05 GMT+0800 (China Standard Time)

There will probably be a new release in the next couple of weeks.

But for now you can use a specific commit like this:

go get github.com/jackc/pgx/v5@304697de36f37f92f8ad7b6eeb5061ea66324d3d

Thibaut Rousseau · Answer 7 · Sun Oct 15 2023 19:26:52 GMT+0800 (China Standard Time)

It's working, thanks again.