buildbuddy-io / buildbuddy

BuildBuddy is an open source Bazel build event viewer, result store, remote cache, and remote build execution platform.

Home Page: https://buildbuddy.io

`ERROR: current transaction is aborted, commands ignored until end of transaction block (SQLSTATE 25P02)`

nhurden opened this issue · comments

Follow up to #4349: after updating to v2.16.0, the error is now:

[2023-08-03T05:13:10Z] ERROR: The Build Event Protocol upload failed: Not retrying publishBuildEvents, no more attempts left: status='Status{code=UNKNOWN, description=ERROR: current transaction is aborted, commands ignored until end of transaction block (SQLSTATE 25P02), cause=null}' UNKNOWN: UNKNOWN: ERROR: current transaction is aborted, commands ignored until end of transaction block (SQLSTATE 25P02) UNKNOWN: UNKNOWN: ERROR: current transaction is aborted, commands ignored until end of transaction block (SQLSTATE 25P02)

Seems to be related to the same duplicate key issue - postgresql.log:

[21872]:ERROR: duplicate key value violates unique constraint "Invocations_pkey"
[21872]:DETAIL: Key (invocation_id)=(2de0d8d3-71ad-4b6a-973f-111a1edad420) already exists.
[21872]:STATEMENT: INSERT INTO "Invocations" ("role","invocation_id","user_id","group_id","blob_id","pattern","user","command","host","repo_url","commit_sha","last_chunk_id","branch_name","created_at_usec","updated_at_usec","duration_usec","upload_throughput_bytes_per_second","action_count","created_with_capabilities","invocation_status","action_cache_hits","action_cache_misses","action_cache_uploads","cas_cache_hits","cas_cache_misses","cas_cache_uploads","total_download_size_bytes","total_upload_size_bytes","total_download_transferred_size_bytes","total_upload_transferred_size_bytes","total_download_usec","total_upload_usec","total_cached_action_exec_usec","total_uncached_action_exec_usec","download_throughput_bytes_per_second","success","attempt","bazel_exit_code","download_outputs_option","upload_local_results_enabled","remote_execution_enabled","tags","perms","redaction_flags","invocation_uuid") VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45) RETURNING "perms","redaction_flags","invocation_uuid"
[21872]:ERROR: current transaction is aborted, commands ignored until end of transaction block
[21872]:STATEMENT:
SELECT attempt FROM "Invocations"
WHERE invocation_id = $1 AND invocation_status <> $2 AND updated_at_usec > $3

(v2.16.0 with PostgreSQL)
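
For readers following along: the log sequence above matches PostgreSQL's documented behavior, where a failed statement inside a transaction block causes every later statement to be rejected with SQLSTATE 25P02 until the transaction is rolled back. Below is a toy Go model of that behavior, not BuildBuddy's code and not a real database driver. The `toyTx` and `pgError` types and the `inv-1` key are hypothetical names made up for illustration; only the two SQLSTATE values (23505 for unique_violation, 25P02 for in_failed_sql_transaction) come from PostgreSQL.

```go
package main

import (
	"errors"
	"fmt"
)

// SQLSTATE values from PostgreSQL's error code table.
const (
	uniqueViolation        = "23505" // unique_violation
	inFailedSQLTransaction = "25P02" // in_failed_sql_transaction
)

// pgError is a hypothetical stand-in for a driver error that carries a SQLSTATE.
type pgError struct {
	Code string
	Msg  string
}

func (e *pgError) Error() string {
	return fmt.Sprintf("ERROR: %s (SQLSTATE %s)", e.Msg, e.Code)
}

// toyTx models one transaction against a table keyed by invocation_id.
type toyTx struct {
	rows     map[string]bool // primary keys visible to this transaction
	inserted []string        // keys added by this transaction, undone on rollback
	aborted  bool            // set once any statement fails
}

func (tx *toyTx) abortedErr() error {
	return &pgError{Code: inFailedSQLTransaction, Msg: "current transaction is aborted"}
}

// Insert fails with 23505 on a duplicate key and marks the transaction aborted.
func (tx *toyTx) Insert(id string) error {
	if tx.aborted {
		return tx.abortedErr()
	}
	if tx.rows[id] {
		tx.aborted = true
		return &pgError{Code: uniqueViolation, Msg: "duplicate key value violates unique constraint"}
	}
	tx.rows[id] = true
	tx.inserted = append(tx.inserted, id)
	return nil
}

// Select is rejected with 25P02 while the transaction is in the aborted state.
func (tx *toyTx) Select(id string) (bool, error) {
	if tx.aborted {
		return false, tx.abortedErr()
	}
	return tx.rows[id], nil
}

// Rollback undoes this transaction's inserts and clears the aborted state.
func (tx *toyTx) Rollback() {
	for _, id := range tx.inserted {
		delete(tx.rows, id)
	}
	tx.inserted = nil
	tx.aborted = false
}

// sqlState extracts the SQLSTATE from an error, or "" if there is none.
func sqlState(err error) string {
	var pe *pgError
	if errors.As(err, &pe) {
		return pe.Code
	}
	return ""
}

func main() {
	tx := &toyTx{rows: map[string]bool{"inv-1": true}}

	err := tx.Insert("inv-1") // duplicate key: the statement that really failed
	fmt.Println(sqlState(err)) // 23505

	_, err = tx.Select("inv-1") // rejected only because the transaction is aborted
	fmt.Println(sqlState(err))  // 25P02

	tx.Rollback()
	_, err = tx.Select("inv-1")
	fmt.Println(err == nil) // true
}
```

In this model the error surfaced to the client is whichever statement ran last, which is why the 25P02 message can hide the underlying duplicate key failure.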

Hey @nhurden

Is it possible that you are running multiple Bazel builds with the same invocation ID? Normally the invocation ID is a UUID that is unique to each build, but we have seen users who override the Bazel-generated UUID with --invocation_id end up reusing an old ID when their CI retries, which triggers this issue 🤔

In our database, the Invocations table uses invocation_id as its primary key, so inserting a second row with the same ID is rejected.

This is without overriding the invocation ID. The invocation that raised this error successfully reported some events but was disconnected with the above error during the same invocation (FWIW it's a bazel run).

cc: @tempoz

After reading the error and the relevant code more carefully, I think the SQLState() currently being returned is InFailedSQLTransaction rather than UniqueViolation. The failure still happens at the INSERT step, and the SELECT statement never actually ran.

@nhurden thanks for reporting this. We will look into fixing this 🙇

Should be addressed now; feel free to re-open if you see this recur.