dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.

Home Page:https://docs.microsoft.com/ef/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Insert record sequentially in DB without identity on PK

Uraharadono opened this issue · comments

I asked this on SO as well, unfortunately I didn't get any number of notable responses:
https://stackoverflow.com/questions/69750232/ef-core-insert-record-sequentially-in-db-without-identity-on-pk

Disclaimer: This question is regarding "modernization" of almost 17 years old system. ORM that was used 17 years ago required that PK's don't use Identity.
I know how bad it is, and I can't change anything regarding this.

So I have (simplified) following table in database:

CREATE TABLE [dbo].[KlantDocument](
	[OID] [bigint] NOT NULL,
	[Datum] [datetime] NOT NULL,
	[Naam] [varchar](150) NOT NULL,
	[Uniek] [varchar](50) NOT NULL,
	[DocumentSoort] [varchar](50) NULL,
 CONSTRAINT [PK_KlantDocument] PRIMARY KEY CLUSTERED 
(
	[OID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[KlantDocument] CHECK CONSTRAINT [FK_KlantDocument_Klant]
GO

As you can see, table doesn't have Identity set on PK, so it has to be inserted manually.

Project is being rebuilt in Web Api .Net Core 5, that is going to do all of the CRUD operations.
It is using EF Core as ORM, and it is agreed that Unit of Work pattern is going to be used here (please do keep reading, UoW is not issue here).

For those curious or for what it's worth, you can take a look at it here (https://pastebin.com/58bSDkUZ) ( this is by no means full UoW, just partially and without comments).

Controller Action:

[HttpPost]
        [Route("UploadClientDocuments")]
        public async Task<IActionResult> UploadClientDocuments([FromForm] ClientDocumentViewModel model)
        {
            if (!ModelState.IsValid)
                return BadRequest(ModelStateExtensions.GetErrorMessage(ModelState));

            var dto = new ClientDocumentUploadDto
            {
                DocumentTypeId = model.DocumentTypeId,
                ClientId = model.ClientId,
                FileData = await model.File.GetBytes(),
                FileName = model.File.FileName, // I need this property as well, since I have only byte array in my "services" project
            };

            var result = await _documentsService.AddClientDocument(dto);
            return result ? Ok() : StatusCode(500);
        } 

When I am inserting record I am doing it like so:

// Called by POST method multiple times at once
public async Task<bool> AddClientDocument(ClientDocumentUploadDto dto)
{
    try
    {
        var doc = new KlantDocument
        {
		/* All of the logic below to fetch next id will be executed before any of the saves happen, thus we get "Duplicate key" exception. */
            // 1st try to fetch next id
            // Oid = _uow.Query<KlantDocument>().OrderByDescending(s => s.Oid).First().Oid + 1,
            // 2nd try to fetch next id
            // Oid = _uow.Query<KlantDocument>().OrderBy(s => s.Oid).Last().Oid + 1,
            // 3rd try to fetch next id
            // Oid = await _uow.Query<KlantDocument>().OrderBy(s => s.Oid).AsNoTracking().Select(s => s.Oid).LastAsync() + 1,
            // 4th try to fetch next id
            // Oid = _uow.Query<KlantDocument>().OrderBy(s => s.Oid).AsNoTracking().Select(s => s.Oid).Last() + 1,
            // 5th try to fetch next id
            // Oid = (_uow.Query<KlantDocument>().OrderBy(s => s.Oid).AsNoTracking().Max(s => s.Oid) + 1),
            Naam = dto.FileName,
            DocumentSoort = dto.DocumentTypeId,
            Datum = DateTime.Now,
            Uniek = Guid.NewGuid() + "." + dto.FileName.GetExtension()
        };

        _uow.Context.Set<KlantDocument>().Add(doc); // Does not work
        _uow.Commit();
    }
    catch (Exception e)
    {
        _logger.Error(e);
        return false;
    }
}

I get "Duplicate key" exception because 2 of the records are overlapping when inserting.

I have tried to wrap it into the transaction like so:

_uow.ExecuteInTransaction(() => { 
        var doc = new KlantDocument
        {
		/* All of the logic below to fetch next id will be executed before any of the saves happen, thus we get "Duplicate key" exception. */
            // 1st try to fetch next id
            // Oid = _uow.Query<KlantDocument>().OrderByDescending(s => s.Oid).First().Oid + 1,
            // 2nd try to fetch next id
            // Oid = _uow.Query<KlantDocument>().OrderBy(s => s.Oid).Last().Oid + 1,
            // 3rd try to fetch next id
            // Oid = await _uow.Query<KlantDocument>().OrderBy(s => s.Oid).AsNoTracking().Select(s => s.Oid).LastAsync() + 1,
            // 4th try to fetch next id
            // Oid = _uow.Query<KlantDocument>().OrderBy(s => s.Oid).AsNoTracking().Select(s => s.Oid).Last() + 1,
            // 5th try to fetch next id
            // Oid = (_uow.Query<KlantDocument>().OrderBy(s => s.Oid).AsNoTracking().Max(s => s.Oid) + 1),
            Naam = dto.FileName,
            DocumentSoort = dto.DocumentTypeId,
            Datum = DateTime.Now,
            Uniek = Guid.NewGuid() + "." + dto.FileName.GetExtension()
        };

        _uow.Context.Set<KlantDocument>().Add(doc); // Does not work
        _uow.Commit();
});

and it does not work. I still get "Duplicate key" exception.

From what I know, shouldn't EF by default lock database until transaction is complete?

I tried to manually write insert SQL like so:

using (var context = _uow.Context)
{
    using (var dbContextTransaction = context.Database.BeginTransaction())
    {
        try
        {
            // VALUES ((SELECT MAX(OID) + 1 FROM KlantDocument), -- tried this also, but was not yielding results
            var commandText = @"INSERT INTO KlantDocument
            (
             OID
            ,Datum
            ,Naam
            ,Uniek
            ,DocumentSoort
            )
            VALUES (
                    (SELECT TOP(1) (OID + 1) FROM KlantDocument ORDER BY OID DESC),
                    @Datum,
                    @Naam,
                    @Uniek,
                    @DocumentSoort
            )";
            var datum = new SqlParameter("@Datum", DateTime.Now);
            var naam = new SqlParameter("@Naam", dto.FileName);
            var uniek = new SqlParameter("@Uniek", Guid.NewGuid() + "." + dto.FileName.GetExtension());
            var documentSoort = new SqlParameter("@DocumentSoort", dto.DocumentTypeId ?? "OrderContents");

            context.Database.ExecuteSqlRaw(commandText, datum, naam, uniek, documentSoort);
            dbContextTransaction.Commit();
        }
        catch (Exception e)
        {
            dbContextTransaction.Rollback();
            return false;
        }
    }
}

Same thing.

I did a lot research to try to tackle this issue:

  1. https://stackoverflow.com/questions/65234534/net-core-executesqlraw-last-inserted-id - solutions either don't work, or work only with identity columns
  2. I got idea to use SQL OUTPUT variable, but issue is that it's really hard or even impossible to achieve, feels hacky and overall if managed, there is no guarantee that it will work. As seen here: https://stackoverflow.com/questions/5558582/sql-server-output-clause-into-a-scalar-variable and here https://stackoverflow.com/questions/12566493/how-can-i-assign-inserted-output-value-to-a-variable-in-sql-server/12566529

As noted above, ORM at the time did some magic where it did sequentially insert records in database and I would like to keep it that way.

Is there any way that I can achieve sequential insert when dealing with given scenario?

shouldn't EF by default lock database until transaction is complete

No - databases definitely aren't locked while transactions are in progress. If two of your transactions are running in parallel, they may get the same "next ID", and one will fail.

The proper tool for this is likely a database sequence - have you considered this? Sequences guarantees unique returned values via atomic operations in the database. You would use simple raw SQL to get the next value from the sequence (SELECT NEXT VALUE FOR <seqname>), and then use that. This is still pretty bad for perf (two roundtrips for each insert), but it would work.

Otherwise I believe you can somehow configure for the whole table to be locked, but I'd really recommend avoiding that unless absolutely necessary.

The proper tool for this is likely a database sequence - have you considered this? Sequences guarantees unique returned values via atomic operations in the database. You would use simple raw SQL to get the next value from the sequence (SELECT NEXT VALUE FOR <seqname>), and then use that. This is still pretty bad for perf (two roundtrips for each insert), but it would work.

@roji
Wow thank you a lot for that detailed of an answer. To be honest, and I am sorry for this, I have never heard for database sequence , or I simply am confused.

Currently thinking about these 3 things as sequences:

(SELECT TOP(1) (OID + 1) FROM KlantDocument ORDER BY OID DESC),
or
(SELECT MAX(OID) + 1 FROM KlantDocument)

and I am already doing it inside my raw SQL code.

Sorry for this kind of a strange question, I am just confused a bit. I haven't dabbed this much into DB work until this issue arose.

Sequence would be Identity right?

No - although they're somewhat similar, sequences and SQL Server IDENTITY columns are two different things. Sequences are independent database objects which you create with CREATE SEQUENCE, and from that point you can get out unique numbers from them.

You are telling me to fetch next id directly like so:

No, you'd be fetching the next ID by doing SELECT NEXT VALUE FOR <seqname>.

BTW, above I was assuming that you cannot change your database schema in any way. If you can add default values to your ID columns, then you can make them populate automatically for the sequence; you would no longer send the next ID yourself, and since it's missing, the column's default value would executethe SELECT NEXT VALUE ... automatically. But if you can't touch your database schema, you can still execute SELECT NEXT VALUE ... yourself.

@roji Thank you a lot Shay (if I may refer to you as that). You helped me a lot.
I will go and read everything I can about and regarding this approach, and try it out.
I will get back once I succeeded or failed.

Thanks a lot for pushing me in a right direction with this.

@roji Here is update. I managed to make it work. I want to thank you once again million times for helping me deal with this issue.

What didn't work:

I followed this documentation: https://docs.microsoft.com/en-us/ef/core/modeling/sequences, I was ecstatic to find this could actually be "automated" (read, I could just add them to context and commit changes using my UoW) and I won't be having the need to write the SQL queries when inserting data all the time.
Thus I created following code to setup the sequence in DbContext and to run it using migrations.

My OnModelCreating method changes looked like so:

            modelBuilder.HasSequence<long>("KlantDocumentSeq")
                .StartsAt(2000)
                .IncrementsBy(1)
                .HasMin(2000);
				
            modelBuilder.Entity<KlantDocument>()
                .Property(o => o.Oid)
                .HasDefaultValueSql("NEXT VALUE FOR KlantDocumentSeq");

But since my Oid (PK) is not nullable, whenever I tried to insert my document like so:

                var doc = new KlantDocument
                {
                    Naam = dto.FileName,
                    DocumentSoort = dto.DocumentTypeId,
                    Datum = DateTime.Now,
                    Uniek = Guid.NewGuid() + "." + dto.FileName.GetExtension()
                };
                _uow.Context.Set<KlantDocument>().Add(doc);
                _uow.Commit();

It produced this SQL (beautified for visibility purposes):

SET NOCOUNT ON;
INSERT INTO [KlantDocument] ([OID], [Datum], [DocumentSoort], [Naam], [Uniek])
VALUES (0, '2021-10-29 14:34:06.603', NULL, 'sample 1 - Copy (19).pdf', '56311d00-4d7c-497c-a53e-92330f9f78d4.pdf');

And naturally I got exception:
InnerException = {"Violation of PRIMARY KEY constraint 'PK_KlantDocument'. Cannot insert duplicate key in object 'dbo.KlantDocument'. The duplicate key value is (0).\r\nThe statement has been terminated."}

Since it is default value of long (non-nullable) OID value to 0, when creating doc variable, as it is not nullable.

NOTE: Be very very cautious if you are using this approach even if you can. It will alter all of your Stored procedures where this OID is used.
In my case it created all of these:

 SELECT * FROM sys.default_constraints
 WHERE 
	name like '%DF__ArtikelDocu__OID__34E9A0B9%' OR
	name like '%DF__KlantDocume__OID__33F57C80%' OR
	name like '%DF__OrderDocume__OID__33015847%'

and I had to manually ALTER each table to drop these constrains.
@roji Is this by design? Should I report this as bug ? Literally I just created given migration, and I had to manually delete these constrains to roll it back.

What worked:

Again, thanks to a really awesome human being from EF Core team, Roji, I managed to implement this to work.
Same as above, I created change in DbContext, OnModelCreating method:

        modelBuilder.HasSequence<long>("KlantDocumentSeq")
            .StartsAt(2000)
            .IncrementsBy(1)
            .HasMin(2000);

Added migration that produced this code:

    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.CreateSequence(
            name: "KlantDocumentSeq",
            startValue: 2000L,
            minValue: 2000L);
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.DropSequence(
            name: "KlantDocumentSeq");
    }

Updated database, and for DB work that was it.
Final change came into my method where I had to replace notorious code block
((SELECT MAX(OID) + 1 FROM KlantDocument) or (SELECT TOP(1) (OID + 1) FROM KlantDocument ORDER BY OID DESC)

with:
NEXT VALUE FOR KlantDocumentSeq
and it worked.

Full code of my method is here:

using (var context = _uow.Context)
{
    using (var dbContextTransaction = context.Database.BeginTransaction())
    {
        try
        {
            var commandText = @"INSERT INTO KlantDocument
            (
             OID
            ,Datum
            ,Naam
            ,Uniek
            ,DocumentSoort
            )
            VALUES (
                    NEXT VALUE FOR KlantDocumentSeq,
                    @Datum,
                    @Naam,
                    @Uniek,
                    @DocumentSoort
            )";
            var datum = new SqlParameter("@Datum", DateTime.Now);
            var naam = new SqlParameter("@Naam", dto.FileName);
            var uniek = new SqlParameter("@Uniek", Guid.NewGuid() + "." + dto.FileName.GetExtension());
            var documentSoort = new SqlParameter("@DocumentSoort", dto.DocumentTypeId ?? "OrderContents");

            context.Database.ExecuteSqlRaw(commandText, datum, naam, uniek, documentSoort);
            dbContextTransaction.Commit();
        }
        catch (Exception e)
        {
            dbContextTransaction.Rollback();
            return false;
        }
    }
}

and SQL it generates is:

INSERT INTO KlantDocument 
(
 OID
,Datum
,Naam
,Uniek
,DocumentSoort
)
VALUES (
        NEXT VALUE FOR KlantDocumentSeq,
        '2021-10-29 15:34:43.943',
        'sample 1 - Copy (13).pdf',
        'd4419c6c-dff9-431c-a7ed-6db54407051d.pdf',
        'OrderContents'
)

@Uraharadono when you configure a property with HasDefaultValueSql, and you don't explicitly set a value when inserting, then EF Core should not be sending a value to the database (the zero above), letting the configured database default happen. I'm not sure exactly what's going on in your code, but see the following minimal sample:

Program code
await using var ctx = new BlogContext();
await ctx.Database.EnsureDeletedAsync();
await ctx.Database.EnsureCreatedAsync();

ctx.Blogs.Add(new() { Name = "foo" });
await ctx.SaveChangesAsync();

public class BlogContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(@"Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0")
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.HasSequence<long>("KlantDocumentSeq")
            .StartsAt(2000)
            .IncrementsBy(1)
            .HasMin(2000);

        modelBuilder.Entity<Blog>().Property(b => b.Id).HasDefaultValueSql("NEXT VALUE FOR KlantDocumentSeq");
    }
}

public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; }
}

Running the above produces the following INSERT SQL:

SET NOCOUNT ON;
DECLARE @inserted0 TABLE ([Id] int);
INSERT INTO [Blogs] ([Name])
OUTPUT INSERTED.[Id]
INTO @inserted0
VALUES (@p0);
SELECT [i].[Id] FROM @inserted0 i;

As you can see, Id isn't being inserted by EF Core, but rather being read back (after the database generates the value from the sequence etc).

Check your exact code to see what is different, and if you're still having issues getting this to work, please try posting a minimal code sample (like the above - you can even tweak it) to reproduce the problematic behavior.

Hello @roji,
Sorry for late reply. I was confused as hell why is this happening, then I took a bit time to rest over the weekend. Today I created POCO solution which replicates this behavior.
You can find it here: https://github.com/Uraharadono/RojiSequencePoco
(Edit: I took DB "schema" from https://docs.microsoft.com/en-us/ef/core/get-started/overview/first-app?tabs=netcore-cli)

Setup is as follows:

  1. Create local DB with name BloggingDb
  2. Package Manager Console -> Update database
  3. Run project

I have setup 3 ways to try to insert new Blog record, and every one of 3 fails:

  1. Using Unit of work context
                var ctx = _uow.Context as Net5BoilerplateContext;
                ctx.Blogs.Add(new() { Url = dto.Url });
                ctx.SaveChanges(); // doesn't work, either

produces:

exec sp_executesql N'SET NOCOUNT ON;
INSERT INTO [Blogs] ([Oid], [Url])
VALUES (@p0, @p1);
',N'@p0 bigint,@p1 nvarchar(4000)',@p0=0,@p1=N'AddBlogUsingUoWContext'

  1. Using manually created context:
                await using var ctx = Net5BoilerplateContext.Create("Server=.;Database=BloggingDb;Trusted_Connection=True;MultipleActiveResultSets=True;", 1);
                // ctx.Database.EnsureDeletedAsync();
                // await ctx.Database.EnsureCreatedAsync();

                ctx.Blogs.Add(new() { Url = dto.Url });
                await ctx.SaveChangesAsync();

produces:

exec sp_executesql N'SET NOCOUNT ON;
INSERT INTO [Blogs] ([Oid], [Url])
VALUES (@p0, @p1);
',N'@p0 bigint,@p1 nvarchar(4000)',@p0=0,@p1=N'AddBlogUsingProperContext'
  1. Using my Unit of Work:
                var dbBlog = new DbContext.Entities.Blog
                {
                    Url = dto.Url,
                };
                _uow.Context.Set<DbContext.Entities.Blog>().Add(dbBlog);
                _uow.Commit();

produces:

exec sp_executesql N'SET NOCOUNT ON;
INSERT INTO [Blogs] ([Oid], [Url])
VALUES (@p0, @p1);
',N'@p0 bigint,@p1 nvarchar(4000)',@p0=0,@p1=N'AddBlogUsingUoW'

I don't know what the deal is, as in 2nd migration there is defaultValueSql: "NEXT VALUE FOR BlogSeq",.
If I create both Sequence and set default value of the Blog PK to sequence in single migration, it works.

@Uraharadono you have [DatabaseGenerated(DatabaseGeneratedOption.None)] on the Blog.Oid property. Although the database column indeed has the proper default value SQL (NEXT VALUE FOR BlogSeq), DatabaseGeneratedOption.None tells EF Core that there is no value generation configured. This makes EF Core always send whatever value is contained in the Blog.Oid, which is zero if it's unset. If you remove this attribute, things should start working.

@roji Thanks a lot for coming back to me once again. Can I just ask another thing:
Is this same behavior if we have ValueGeneratedNever using fluent api?

Or to put more perspective and sense into this whole mess of a question.

My code first entities are created using EF Scaffold command on existing (17 y.o. DB), and it generated this code for this table:

modelBuilder.Entity<KlantDocument>(entity =>
            {
                entity.HasKey(e => e.Oid);

                entity.ToTable("KlantDocument");

                entity.Property(e => e.Oid)
                    .ValueGeneratedNever()
                    .HasColumnName("OID");

                entity.Property(e => e.Datum).HasColumnType("datetime");

                entity.Property(e => e.DocumentSoort)
                    .HasMaxLength(50)
                    .IsUnicode(false);

                entity.Property(e => e.Naam)
                    .IsRequired()
                    .HasMaxLength(150)
                    .IsUnicode(false);

                entity.Property(e => e.Uniek)
                    .IsRequired()
                    .HasMaxLength(50)
                    .IsUnicode(false);

                entity.HasOne(d => d.GebruikerNavigation)
                    .WithMany(p => p.KlantDocuments)
                    .HasForeignKey(d => d.Gebruiker)
                    .OnDelete(DeleteBehavior.ClientSetNull)
                    .HasConstraintName("FK_KlantDocument_Gebruiker");

                entity.HasOne(d => d.KlantNavigation)
                    .WithMany(p => p.KlantDocuments)
                    .HasForeignKey(d => d.Klant)
                    .OnDelete(DeleteBehavior.ClientSetNull)
                    .HasConstraintName("FK_KlantDocument_Klant");
            });

So it has automatically generated .ValueGeneratedNever() part.

Is this same behavior if we have ValueGeneratedNever using fluent api?

Yes - you probably want to take a look at our docs, a lot of this stuff is covered there.

My code first entities are created using EF Scaffold command on existing (17 y.o. DB), and it generated this code for this table:

I assume you're asking why the ValueGeneratedNever is in the scaffolded model... So I'm guessing that the database which was scaffolded simply did not have IDENTITY columns or default SQL (you added that later). Since Oid was the key, and by convention integer keys are IDENTITY, the scaffolding set ValueGeneratedNever to make your model actually match your database. Basically the model matches the database at the point where it was scaffolded, but you're now making adjustments to it.