dnlnln / generate-sql-merge

Generate SQL MERGE statements with Table data


SQL Server row limitations for scripts


This script looks awesome. How would you support creating MERGE scripts for a SQL environment that has row limitations on scripts?

Ex: a table has 5,000 rows, but the SQL Server only allows scripts to operate on 1,000 rows at a time. generate-sql-merge could therefore take a parameter, say 1000, and produce 5 MERGE statements.

I know I could just run the script and split it into batches of 1,000 by hand, so if it's too much trouble, let me know.

Just out of curiosity, what kind of environment might have a limitation like this?

It's a SQL Server on the client's side of things; I don't know much about it.

Ah ok, so something completely non-standard/artificial. Thanks.

I have come across a similar problem as well.
Our system does not have an explicit limit on the maximum, but the system ran out of resources before completing the script.

I found that commenting out the following enables the script to successfully finish.

WHEN NOT MATCHED BY SOURCE THEN
DELETE
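
For context, here is a minimal sketch of where that clause sits in a generated statement (the target table, columns, and source rows here are hypothetical, not actual output from the tool):

MERGE dbo.Target AS t
USING (VALUES (1, N'Alpha'), (2, N'Beta')) AS s (Id, Name)
    ON t.Id = s.Id
WHEN MATCHED THEN UPDATE SET t.Name = s.Name
WHEN NOT MATCHED BY TARGET THEN INSERT (Id, Name) VALUES (s.Id, s.Name)
-- Commenting out the clause below skips the full source-vs-target comparison
-- that deletes rows missing from the source:
-- WHEN NOT MATCHED BY SOURCE THEN
--     DELETE
;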

@dshin198, yes, the script consumes far more resources if you use WHEN NOT MATCHED BY SOURCE. I saw you created an issue mentioning 1.7m records; that will definitely not work. The generated query simply loads everything into an in-memory table. I have a pull request (which nobody seems to care about) that provides a workaround: it inserts the source rows into a temp table first and then performs the merge, removing the heavy resource usage from the MERGE itself. But I have never tried it on a table with millions of records. In your extreme case, I'm afraid MERGE itself won't work well for you; if you google around you can find all sorts of problems with MERGE, performance being one of them.

@simonwangu I see. It would be very nice to have your implementation as the default, or at least as an option available via a flag.
Thanks for your advice about MERGE in general as well. I'll share your information with my team.

@deyshin wrote:

@simonwangu It would be very nice to have your implementation as the default, or at least as an option available via a flag.

The WHEN NOT MATCHED BY SOURCE THEN DELETE clause can be excluded by specifying the following parameter: @delete_if_not_matched = 0.
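
For example (a hedged sketch: @delete_if_not_matched is quoted from this thread, but the procedure name sp_generate_merge and the table name are assumptions):

EXEC sp_generate_merge
    @table_name = 'MyStaticData',    -- hypothetical table
    @delete_if_not_matched = 0       -- omit WHEN NOT MATCHED BY SOURCE THEN DELETE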

Given that many users of the tool use it to generate scripts that synchronise static data as part of an automated deployment process, and therefore rely on it to perfectly match the content stored in source control, I would be hesitant to make this the default.

Perhaps there could be another way to get around the row limitation/"out of resources" issue, though: why not batch load or bulk insert the data into a temporary table first and then merge it from that? I can imagine that this would be a lot more efficient than generating the merge statement with all the data contained within a monolithic VALUES clause. I've covered this idea in more detail here: #19 (comment)
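
To illustrate the idea, here is a rough sketch only (table, columns, and batch contents are all hypothetical):

IF OBJECT_ID('tempdb..#Source') IS NOT NULL DROP TABLE #Source;
CREATE TABLE #Source (Id int PRIMARY KEY, Name nvarchar(100));

-- Load the source rows in multiple small INSERT batches rather than one
-- monolithic VALUES clause embedded in the MERGE itself:
INSERT INTO #Source (Id, Name) VALUES (1, N'Alpha'), (2, N'Beta');
INSERT INTO #Source (Id, Name) VALUES (3, N'Gamma');  -- ...further batches as needed

-- Merge from the temp table; keeping the DELETE clause preserves the
-- exact-match-with-source-control behaviour described above:
MERGE dbo.Target AS t
USING #Source AS s ON t.Id = s.Id
WHEN MATCHED THEN UPDATE SET t.Name = s.Name
WHEN NOT MATCHED BY TARGET THEN INSERT (Id, Name) VALUES (s.Id, s.Name)
WHEN NOT MATCHED BY SOURCE THEN DELETE;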

@ghost see the answer to #75 - it works; I used it to generate a script for 40K rows.

An update on this: @EitanBlumin has very helpfully implemented a new parameter that allows you to split source rows into multiple MERGE statements. To use it, specify @max_rows_per_batch=1000 (or whatever batch size you need) and be sure to also include the @delete_if_not_matched=0 parameter.
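
Putting the two parameters together (again a hedged sketch: both parameter names are quoted from this thread, while the procedure name sp_generate_merge and the table name are assumptions):

EXEC sp_generate_merge
    @table_name = 'MyStaticData',    -- hypothetical table
    @max_rows_per_batch = 1000,      -- emit one MERGE per 1,000 source rows
    @delete_if_not_matched = 0       -- must accompany batching, per the note above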