dnlnln / generate-sql-merge

Generate SQL MERGE statements with Table data


SQL Server row limitations for scripts


This script looks awesome. How would you support creating MERGE scripts for a SQL environment that has row limitations on scripts?

Ex: a table has 5,000 rows, but the SQL Server only allows scripts to operate on 1,000 rows at a time. generate-sql-merge could therefore take a parameter, say 1000, and produce 5 MERGE statements.

I know I could just run the script and split it into batches of 1,000 by hand, so if it's too much trouble, let me know.

Just out of curiosity, what kind of environment might have a limitation like this?

It's a SQL Server on the client's side of things; I don't know much about it.

Ah ok, so something completely non-standard/artificial. Thanks.

I have come across a similar problem as well.
Our system does not have an explicit limit on the maximum, but the system ran out of resources before completing the script.

I found that commenting out the following enables the script to successfully finish.

WHEN NOT MATCHED BY SOURCE THEN
DELETE
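
For context, here is a minimal sketch of where that clause sits in a generated statement (the target table, columns, and source rows here are hypothetical, not actual output from the tool):

MERGE dbo.Target AS t
USING (VALUES (1, N'Alpha'), (2, N'Beta')) AS s (Id, Name)
    ON t.Id = s.Id
WHEN MATCHED THEN UPDATE SET t.Name = s.Name
WHEN NOT MATCHED BY TARGET THEN INSERT (Id, Name) VALUES (s.Id, s.Name)
-- Commenting out the clause below skips the full source-vs-target comparison
-- that deletes rows missing from the source:
-- WHEN NOT MATCHED BY SOURCE THEN
--     DELETE
;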

@dshin198, yes, the script consumes far more resources if you use WHEN NOT MATCHED BY SOURCE. I saw you created an issue mentioning 1.7m records; that will definitely not work. The generated query simply loads everything into an in-memory table. I have a pull request (which nobody seems to care about) that provides a workaround: it inserts the source rows into a temp table first and then performs the merge, removing the heavy resource usage from the MERGE itself. But I have never tried it on a table with millions of records. In your extreme case, I'm afraid MERGE itself won't work well for you; if you google around you can find all sorts of problems with MERGE, performance being one of them.

@simonwangu I see. It would be very nice to have your implementation as the default, or at least as an option available via a flag.
Thanks for your advice about MERGE in general as well. I'll share your information with my team.

@deyshin wrote:

@simonwangu It would be very nice to have your implementation as the default, or at least as an option available via a flag.

The WHEN NOT MATCHED BY SOURCE THEN DELETE clause can be excluded by specifying the following parameter: @delete_if_not_matched = 0.
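
For example (a hedged sketch: @delete_if_not_matched is quoted from this thread, but the procedure name sp_generate_merge and the table name are assumptions):

EXEC sp_generate_merge
    @table_name = 'MyStaticData',    -- hypothetical table
    @delete_if_not_matched = 0       -- omit WHEN NOT MATCHED BY SOURCE THEN DELETE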

Given that many users of the tool use it to generate scripts that synchronise static data as part of an automated deployment process, and therefore rely on it to perfectly match the content stored in source control, I would be hesitant to make this the default.

Perhaps there could be another way to get around the row limitation/"out of resources" issue, though: why not batch load or bulk insert the data into a temporary table first and then merge it from that? I can imagine that this would be a lot more efficient than generating the merge statement with all the data contained within a monolithic VALUES clause. I've covered this idea in more detail here: #19 (comment)
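
To illustrate the idea, here is a rough sketch only (table, columns, and batch contents are all hypothetical):

IF OBJECT_ID('tempdb..#Source') IS NOT NULL DROP TABLE #Source;
CREATE TABLE #Source (Id int PRIMARY KEY, Name nvarchar(100));

-- Load the source rows in multiple small INSERT batches rather than one
-- monolithic VALUES clause embedded in the MERGE itself:
INSERT INTO #Source (Id, Name) VALUES (1, N'Alpha'), (2, N'Beta');
INSERT INTO #Source (Id, Name) VALUES (3, N'Gamma');  -- ...further batches as needed

-- Merge from the temp table; keeping the DELETE clause preserves the
-- exact-match-with-source-control behaviour described above:
MERGE dbo.Target AS t
USING #Source AS s ON t.Id = s.Id
WHEN MATCHED THEN UPDATE SET t.Name = s.Name
WHEN NOT MATCHED BY TARGET THEN INSERT (Id, Name) VALUES (s.Id, s.Name)
WHEN NOT MATCHED BY SOURCE THEN DELETE;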

@ghost see the answer to #75 - it works; I used it to generate a script for 40K rows.

An update on this: @EitanBlumin has very helpfully implemented a new parameter that allows you to split source rows into multiple MERGE statements. To use it, specify @max_rows_per_batch=1000 (or whatever batch size you need) and be sure to also include the @delete_if_not_matched=0 parameter.
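
Putting the two parameters together (again a hedged sketch: both parameter names are quoted from this thread, while the procedure name sp_generate_merge and the table name are assumptions):

EXEC sp_generate_merge
    @table_name = 'MyStaticData',    -- hypothetical table
    @max_rows_per_batch = 1000,      -- emit one MERGE per 1,000 source rows
    @delete_if_not_matched = 0       -- must accompany batching, per the note above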