[Question] pg_advisory_lock not supported by CockroachDB
jvanalstine opened this issue · comments
Greetings,
This is my favorite migration tool. However, I just started a new project with CockroachDB, and on migration I get a warning
Error initializing migrator:
ERROR: unknown function: pg_advisory_lock() (SQLSTATE 42883)
It looks like there are a couple places where the function is used in migrate.go. Is it your opinion that this lock function is critical to the library? It seems like cockroach always uses SERIALIZABLE isolation
.
The advisory lock is to protect against multiple migrations being run at the same time. tern would run fine without it, but concurrently running migrations would have unforeseen consequences.
#31 is somewhat related. The solution I proposed there was to allow user provided SQL for lock and unlock. With that in place you could use some other means of locking such as creating a table and checking its existence.
The solution I proposed there was to allow user provided SQL for lock and unlock.
I am new to cockroach, but I believe they will not support pg_advisory_lock anytime soon. So this solution would likely work for roach. From what I can tell, there are no locks in cockroach. Also, I am not able to find any recommendations with respect to concurrently applying schema migrations. It looks like it runs raft and is CP, so I think they can just be applied concurrently without much worry.
Tern was mentioned here btw
Also, I am not able to find any recommendations with respect to concurrently applying schema migrations. It looks like it runs raft and is CP, so I think they can just be applied concurrently without much worry.
What I mean is that two migration processes are run at the same time. You never would want that to happen. e.g. migration 1 is running twice at the same time. Now it is extremely likely that one would fail because adding or dropping a table or column can only be done once. But if it was a data migration the user might not be so lucky.
Practically speaking though, it is a very unlikely case that is being protected against. An optional flag to disable locking might be reasonable.
What I mean is that two migration processes are run at the same time. You never would want that to happen. e.g. migration 1 is running twice at the same time. Now it is extremely likely that one would fail because adding or dropping a table or column can only be done once. But if it was a data migration the user might not be so lucky.
Makes sense.
An optional flag to disable locking might be reasonable.
I think a feature flag here makes a lot of sense. Do you want to take a wack at it or should I?
It seems like there ought to be a flag called something like --no-lock
that cases this bit of code to be skipped and replaced with a regular transaction.
The Migrator could be lock aware in the same way that it is transaction aware. That is, change the MigratorOptions struct to
type MigratorOptions struct {
// DisableLock causes the Migrator to run migrations without pg_advisory_lock()
DisableLock bool
// DisableTx causes the Migrator not to run migrations in a transaction.
DisableTx bool
// MigratorFS is the interface used for collecting the migrations.
MigratorFS MigratorFS
}
And then change MigrateTo to
func (m *Migrator) MigrateTo(ctx context.Context, targetVersion int32) (err error) {
if !m.options.DisableLock {
err = acquireAdvisoryLock(ctx, m.conn)
if err != nil {
return err
}
defer func() {
unlockErr := releaseAdvisoryLock(ctx, m.conn)
if err == nil && unlockErr != nil {
err = unlockErr
}
}()
}
I added the flag here. Although, there is still a problem with multiple transactions trying to run on the same connection. It works if I set disable-tx = true
in the ini conf file.
ERROR: current transaction is aborted, commands ignored until end of transaction block (SQLSTATE 25P02)
Example Tern conf
[database]
# host is required (network host or path to Unix domain socket)
host = localhost
port = 9000
# database is required
database = platform
# user defaults to OS user
user = platform_admin
password =
disable-lock = true
# version_table = public.schema_version
#
# sslmode generally matches the behavior described in:
# http://www.postgresql.org/docs/9.4/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION
#
# There are only two modes that most users should use:
# prefer - on trusted networks where security is not required
# verify-full - require SSL connection
# sslmode = prefer
#
# sslrootcert is generally used with sslmode=verify-full
# sslrootcert = /path/to/root/ca
# Proxy the above database connection via SSH
# [ssh-tunnel]
# host =
# port = 22
# user defaults to OS user
# user =
# password is not required if using SSH agent authentication
# password =
[data]
# Any fields in the data section are available in migration templates
# prefix = foo
If you'd like a roach stack to play with
base-infra.yml:
version: "3.9"
services:
roach-0:
container_name: roach-0
image: cockroachdb/cockroach:latest
volumes:
- cockroach_data_0:/cockroach/cockroach-data
command: start --insecure --locality=datacenter=roach_net --join=roach-0,roach-1,roach-2
ports:
- "9000:26257"
- "8080:8080"
networks:
- roach_net
hostname: roach-0
roach-1:
container_name: roach-1
image: cockroachdb/cockroach:latest
volumes:
- cockroach_data_1:/cockroach/cockroach-data
command: start --insecure --locality=datacenter=roach_net --join=roach-0,roach-1,roach-2
networks:
- roach_net
hostname: roach-1
roach-2:
container_name: roach-2
image: cockroachdb/cockroach:latest
volumes:
- cockroach_data_2:/cockroach/cockroach-data
command: start --insecure --locality=datacenter=roach_net --join=roach-0,roach-1,roach-2
networks:
- roach_net
hostname: roach-2
volumes:
cockroach_data_0:
cockroach_data_1:
cockroach_data_2:
networks:
roach_net:
driver: bridge
immediately after running docker-compose -f base-infra.yml up
, run docker exec -it roach-0 /cockroach/cockroach init --insecure
That error indicates that transactions are still being used somewhere and an error occurs inside it.
Yes I figured that out thank you. Another thing that could be done is using the strategy that golang-migrate uses. If the url has postgresql:// as the protocol, then an advisory lock is applied like in this repo. If cockroach:// is the protocol, then a lock table is created
schema_name | table_name | type | owner | estimated_row_count | locality
--------------+-------------------+-------+----------------+---------------------+-----------
public | schema_lock | table | platform_admin | 0 | NULL
public | schema_migrations | table | platform_admin | 1 | NULL
Yeah, that might work. Though I'm a little reluctant to use lock tables. The nice thing about advisory locks is they automatically get cleaned up if the migration connection dies unexpectedly. Whereas the strategy golang-migrate uses can require manual cleanup in an error case. See https://github.com/golang-migrate/migrate/blob/aef2747761d2a59d87e36fa942d9ece5339a2c41/database/cockroachdb/cockroachdb.go#L199.
I'd love to move from migrate
to tern
but hit this too. Is the plan to wait for CockroachDB to make a stubbed out non-functional advisory lock implementation? cockroachdb/cockroach#13546
The only potential solutions that have come up on the tern side is an option to disable or customize the locking logic. I would lean to customizing the lock logic as that would also allow disabling it if desiring.
But I'm not currently working on this and I don't know anyone who is.
Would you be open to a change along these lines? Any idea what kind of customisation would you want to allow?
Would allowing users to opt into the go migrate style be acceptable?
@tmcnicol Which approach? Several have been mentioned on this thread.
Sorry I was referring to the go-migrate option, of a lock table that people could opt into. Was thinking that we could get around the unlocking when the connection unexpectedly dies by using a heartbeat mechanism.
@tmcnicol Yes, that seems reasonable.
@ryanluu12345 Not that I know of.