sqlalchemy / alembic

A database migrations tool for SQLAlchemy.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

autogenerate with compare_server_default=True and reflecting table randomly fails to set conn_col_default

cpoppema opened this issue · comments

Describe the bug

I have compare_server_default=True and am running alembic revision --autogenerate. I am changing primary key column from int to uuid and have compare_server_default enabled to help with server_default=func.gen_random_uuid(). Sometimes autogenerate recognizes the existing pk sequences, sometimes it does not, leading to a new migration file that's empty, or sometimes contains a change applying server_default=None. Using postgres 15.x

Expected behavior
I'm still new to postgres/alembic/sqlalchemy, so I don't know what the desired behavior is. However, I would expect a deterministic result.

To Reproduce

When I am running autogenerate this is basically my models.py:

@as_declarative()
class BaseModel:
    metadata = MetaData(
        naming_convention={
            "ix": "ix_%(column_0_label)s",
            "uq": "uq_%(table_name)s_%(column_0_name)s",
            "ck": "ck_%(table_name)s_`%(constraint_name)s`",
            "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
            "pk": "pk_%(table_name)s",
        }
    )

    id = Column(UUID(as_uuid=True), primary_key=True, index=True, unique=True, server_default=func.gen_random_uuid())

    @declared_attr
    def pk(self):
        return deferred(Column(Integer().evaluates_none(), primary_key=False, index=True, nullable=True))


class AssociationModel(BaseModel):
    __tablename__ = "association_table"
    __table_args__ = (UniqueConstraint("parent_id", "child_id"),)

    parent_id = Column(ForeignKey("parent.id", ondelete="CASCADE"))
    child_id = Column(ForeignKey("child.id", ondelete="CASCADE"))


class ParentModel(BaseModel):
    __tablename__ = "parent"

    child_set = relationship(
        "ChildModel",
        secondary="association_table",
        back_populates="parent_set",
        cascade="all, delete",
    )


class ChildModel(BaseModel):
    __tablename__ = "child"

    parent_set = relationship(
        "ParentModel",
        secondary="association_table",
        back_populates="child_set",
    )

And the database has these defaults for the tables:

                                              Table "public.parent"
          Column           |           Type           | Collation | Nullable |               Default                
---------------------------+--------------------------+-----------+----------+--------------------------------------
 pk                        | integer                  |           |          | nextval('parent_pk_seq'::regclass)
 id                        | uuid                     |           | not null | gen_random_uuid()

                                              Table "public.child"
          Column           |           Type           | Collation | Nullable |               Default                
---------------------------+--------------------------+-----------+----------+--------------------------------------
 pk                        | integer                  |           |          | nextval('child_pk_seq'::regclass)
 id                        | uuid                     |           | not null | gen_random_uuid()

Error

autogenerate creates a file with upgrade/downgrade function that contain either 0, 1, or 2 alter_columns:

def upgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.alter_column('parent', 'pk',
               existing_type=sa.INTEGER(),
               server_default=None,
               existing_nullable=True)
    op.alter_column('child', 'pk',
               existing_type=sa.INTEGER(),
               server_default=None,
               existing_nullable=True)
    # ### end Alembic commands ###


def downgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.alter_column('child', 'pk',
               existing_type=sa.INTEGER(),
               server_default=sa.text("nextval('child_pk_seq'::regclass)"),
               existing_nullable=True)
    op.alter_column('parent', 'pk',
               existing_type=sa.INTEGER(),
               server_default=sa.text("nextval('parent_pk_seq'::regclass)"),
               existing_nullable=True)
    # ### end Alembic commands ###

For no operations, the log looks like this:

INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.ddl.postgresql] Detected sequence named 'parent_pk_seq' as owned by integer column 'parent(pk)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'child_pk_seq' as owned by integer column 'child(pk)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'association_table_pk_seq' as owned by integer column 'association_table(pk)', assuming SERIAL and omitting

For 1 operation it can look like this:

INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.ddl.postgresql] Detected sequence named 'parent_pk_seq' as owned by integer column 'parent(pk)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'association_table_pk_seq' as owned by integer column 'association_table(pk)', assuming SERIAL and omitting
INFO  [alembic.autogenerate.compare] Detected server default on column 'child.pk'

Versions.

  • OS: debian
  • Python: 3.11
  • Alembic: 1.13.1
  • SQLAlchemy: 2.0.29
  • Database: postgresql 15.4
  • DBAPI: asyncpg 0.29.0

Additional context

The order in which the tables are reflected seems to matter and is random. The alter_column operations are added when association_table is reflected before the other(s). I played around with the resolve_fks: bool = True-argument in the functionreflect_table: with this boolean set to False, the output is consistent no matter the other the tables are reflected in. The log output can end up being:

INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.ddl.postgresql] Detected sequence named 'child_pk_seq' as owned by integer column 'child(pk)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'association_table_pk_seq' as owned by integer column 'association_table(pk)', assuming SERIAL and omitting
INFO  [alembic.ddl.postgresql] Detected sequence named 'parent_pk_seq' as owned by integer column 'parent(pk)', assuming SERIAL and omitting

Although association_table is reflected before parent here, it does not result in an alter_column operation in the migration file.

When I looked at locals() inside autogenerate/compare.py::_compare_server_default, when child and/or parent was reflected thanks to resolve_fks=True, the variable conn_col_default had a value of DefaultClause(<sqlalchemy.sql.elements.TextClause object at 0x..........>, for_update=False) which holds the string nextval('child_pk_seq'::regclass) (or parent_pk_seq of course). When association_table is reflected last, or with resolve_fks=False, the variable conn_col_default is always None.

Hi,

Can you also post the previous status of your models before you updated them?

The issue here seems that alembic does some assumptions regarding what's a serial and what is not. That on itself should be fine, but not that the behaviour changes randomly

Of course, that should've been this:

@as_declarative()
class BaseModel:
    pk = Column(Integer, primary_key=True, index=True)


class IdColumnMixin:
    @declared_attr
    def id(cls):
        return Column(UUID(as_uuid=True), unique=True, nullable=False, index=True)


class AssociationModel(BaseModel):
    __tablename__ = "association_table"
    __table_args__ = (UniqueConstraint("parent_id", "child_id"),)

    parent_id = Column(ForeignKey("parent.id", ondelete="CASCADE"))
    child_id = Column(ForeignKey("child.id", ondelete="CASCADE"))


class ParentModel(IdColumnMixin, BaseModel):
    __tablename__ = "parent"

    child_set = relationship(
        "ChildModel",
        secondary="association_table",
        back_populates="parent_set",
        cascade="all, delete",
    )


class ChildModel(IdColumnMixin, BaseModel):
    __tablename__ = "child"

    parent_set = relationship(
        "ParentModel",
        secondary="association_table",
        back_populates="child_set",
    )