This is an interesting question. In the publication properties there is the option on the snapshot tab to run extra scripts before and after the snapshot is applied. This is the same for both merge and transactional publications (below).
Many DBAs have scripts which drop all the FKs on the subscriber and re-add them after the snapshot is applied, so that initialization runs smoothly and we don't get the following sort of error:
"Could not drop object 'dbo.tCity' because it is referenced by a FOREIGN KEY constraint".
However, snapshot generation differs between transactional and merge. In transactional, all the FKs are dropped for you and re-added later on; this doesn't happen for merge. There's probably a good reason for it, but I can't see why they should behave differently at present. Anyway, the message is that you don't need to roll your own logic to deal with subscriber FKs in transactional, but you still do in merge!
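As a sketch, the pre- and post-snapshot scripts for merge tend to look something like the following (the table, column, and constraint names here are illustrative, not from a real publication):

```sql
-- PreSnapshot.sql: drop the FK that references the replicated table,
-- so the snapshot can drop and recreate dbo.tCity without error.
IF OBJECT_ID('dbo.FK_tCustomer_tCity', 'F') IS NOT NULL
    ALTER TABLE dbo.tCustomer DROP CONSTRAINT FK_tCustomer_tCity;

-- PostSnapshot.sql: re-add the FK once the snapshot has been applied.
ALTER TABLE dbo.tCustomer ADD CONSTRAINT FK_tCustomer_tCity
    FOREIGN KEY (CityID) REFERENCES dbo.tCity (CityID);
```

These two files are then referenced on the snapshot tab of the publication properties as the scripts to run before and after the snapshot is applied.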
There is a handy system stored procedure, sp_showpendingchanges, that provides an "approximation" of the pending changes that need to be replicated at that database. The proc lists inserts/updates/deletes per article and can also filter by subscriber server. It has been around since SQL 2008 and is a welcome addition. It would be nicer if it were surfaced in Replication Monitor like the equivalent for transactional replication, though.
You can run the procedure without any arguments in which case it provides a summary.
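The summary form is just the bare call:

```sql
exec sp_showpendingchanges
```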
You can also provide all arguments as below to look at a specific subscriber and table and see the rows waiting to be sent.
exec sp_showpendingchanges @destination_server = 'DESKTOP-NSONKMO'
, @publication = 'Testuse_partition_groups'
, @article = 'tCompany'
, @show_rows = 1
However, if the subscriber database name is the same as the publisher one, nothing gets returned. Not great, as this is a common setup!
In this case you’ll need to roll your own script using something like the code I created below:
select tCompany.* from tCompany
inner join MSmerge_contents on tCompany.rowguid = MSmerge_contents.rowguid
inner join MSmerge_genhistory on MSmerge_contents.generation = MSmerge_genhistory.generation
where MSmerge_genhistory.genstatus = 0 and MSmerge_genhistory.generation >
(select recgen from sysmergesubscriptions where subscriber_server = 'DESKTOP-NSONKMO\paulsinst')
In merge replication we often replicate several related tables. Imagine if we were replicating the following 3 tables:
Now it might be that we declare a filter and join filters as follows in the publication:
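In script form, a filter setup of this shape would be declared roughly as below. This is a sketch only: the publication name, column names, and filter names are illustrative, and the details depend on your schema. The key point is that the join filters walk from tStaff up to tDepartment and then tCompany, against the direction of the foreign keys.

```sql
-- Parameterized row filter on the article at the 'many' end (illustrative)
exec sp_addmergearticle @publication = 'TestPub'
    , @article = 'tStaff'
    , @source_object = 'tStaff'
    , @subset_filterclause = 'staffname = HOST_NAME()';

-- Join filter: tDepartment rows follow the tStaff rows they relate to
exec sp_addmergefilter @publication = 'TestPub'
    , @article = 'tDepartment'
    , @filtername = 'tStaff_tDepartment'
    , @join_articlename = 'tStaff'
    , @join_filterclause = 'tStaff.departmentid = tDepartment.departmentid'
    , @join_unique_key = 0;

-- Join filter: tCompany rows follow the tDepartment rows they relate to
exec sp_addmergefilter @publication = 'TestPub'
    , @article = 'tCompany'
    , @filtername = 'tDepartment_tCompany'
    , @join_articlename = 'tDepartment'
    , @join_filterclause = 'tDepartment.companyid = tCompany.companyid'
    , @join_unique_key = 0;
```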
Notice that the relationship of the filters is opposite to the direction of the foreign keys. This makes business sense in some cases (another post on this later). Anyway, if this is the sort of scenario you have, beware that some data changes won't propagate. For example, if on the subscriber I delete 3 related records:
delete tStaff where staffname = 'PaulSmith'
delete tDepartment where departmentname = 'Finance'
delete tCompany where CompanyName = 'IBM'
When synchronizing back to the publisher, the tCompany record will not be deleted. There is no error in Replication Monitor, but there is mention of a "retry". Further syncs don't mention the retry, and still the tCompany record remains. This is a bug/issue that has existed from SQL 2005 through to SQL 2016. What we need to do is reset the "Precompute partitions" option: by default it is set to true, and we reset it to false. Changing it invalidates the snapshot and a new one must be generated, so if you are going to have this type of filter setup, remember to do this at the beginning, before initialization.
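In the GUI this is the "Precompute partitions" setting on the publication properties; in script it corresponds to the use_partition_groups publication property, so the reset looks something like this (publication name as in the earlier examples):

```sql
exec sp_changemergepublication
      @publication = 'Testuse_partition_groups'
    , @property = 'use_partition_groups'
    , @value = 'false'
    , @force_invalidate_snapshot = 1
    , @force_reinit_subscription = 1
```

Note the two @force arguments: changing this property invalidates the existing snapshot, which is exactly why it's best done before initialization.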
Suppose there are some rows at the publisher which are not at the subscriber. Or vice versa. This is after a successful synchronization….
Other posts have dealt with how this can occur, and how we can stop it happening, but suppose it has happened – what to do?
Well, there is a nice stored procedure which comes to our rescue: sp_mergedummyupdate. We just need the name of the table and the rowguid of the missing row.
exec sp_mergedummyupdate @source_object = 'tCity', @rowguid = '724FEE04-F8DB-E611-9BE5-080027A1E9BF'
It is very robust: if we give it the rowguid of a row that isn't missing, an update will simply be propagated and we won't end up with duplicates. Likewise, running it more than once causes no issue.
Just run the proc at the publisher or the subscriber (whichever one has the missing rows) and then synchronize – job done!
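To find the rowguids to feed into sp_mergedummyupdate, one approach (assuming a linked server back to the other side of the topology; SUBSCRIBER is a hypothetical linked server name here) is a simple EXCEPT between the two copies of the table:

```sql
-- rowguids present at the publisher but missing at the subscriber
-- (SUBSCRIBER is an assumed linked server name, not from the original post)
select rowguid from dbo.tCity
except
select rowguid from SUBSCRIBER.testmergedb.dbo.tCity;
```

Run the same query with the two sides swapped to find rows that exist only at the subscriber.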
Suppose we need to know why some data is missing at the subscriber, after synchronization has completed and no errors have been reported. One thing to check is whether someone has run a BULK INSERT statement.
For example, consider the one below. It inserts data into a merge-replicated table and looks innocent enough:
BULK INSERT testmergedb..tCity
FROM 'C:\Snapshots\SSISMerge\Cities\cities.dat' -- data file path illustrative
WITH (FORMATFILE = 'C:\Snapshots\SSISMerge\Cities\cities.fmt');
However if I run the following to see what is waiting to go to the subscriber I see that there are no rows ready!
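The check for waiting rows is the same sp_showpendingchanges call from earlier (server and publication names as in the previous examples):

```sql
exec sp_showpendingchanges @destination_server = 'DESKTOP-NSONKMO'
    , @publication = 'Testuse_partition_groups'
    , @article = 'tCity'
    , @show_rows = 1
```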
By default the BULK INSERT statement doesn't fire triggers. Remember that merge replication adds insert/update/delete triggers to replicated tables in order to log all changes to them, so if the triggers don't fire, merge doesn't know about the change. There is an additional option we need to make sure the developers use: FIRE_TRIGGERS, as below.
BULK INSERT testmergedb..tCity
FROM 'C:\Snapshots\SSISMerge\Cities\cities.dat' -- data file path illustrative
WITH (FORMATFILE = 'C:\Snapshots\SSISMerge\Cities\cities.fmt', FIRE_TRIGGERS);
Now when we check the pending changes we see the rows there, and they'll go to the subscriber.
I’ll do a separate post to explain how to fix this type of issue if it has already happened!
In replication there will always be a delay (post SQL 2000 it never uses a distributed transaction), so the choice for the agent job is between a continuous schedule and a repeating one. I always use a repeating schedule for my jobs; if I need them to run frequently, this could be every 5 minutes. One reason is the problematic accumulation of history records that are never removed.
Have a look below. I'm checking for the oldest record in the MSmerge_history table and it is 1st Jan 2017. However, manually running the cleanup procedure to remove all history older than 2 days shows 0 records deleted!
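The check and the manual cleanup look something like this, assuming the default distribution database name and the standard history clean-up procedure:

```sql
-- oldest merge agent history record still in the distribution database
select min(time) from distribution.dbo.MSmerge_history;

-- manually remove history older than 48 hours
-- (this is the proc the "Agent history clean up" job runs; retention is in hours)
exec distribution.dbo.sp_MShistory_cleanup @history_retention = 48;
```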
So – the history records keep accumulating in the distribution database.
This is because the filter in the cleanup proc looks at the datetime in the session history. As the merge agent runs in one session, it continually updates the session time, so to the cleanup query it appears that all the history records have only just been created. Really this is a logic bug in the cleanup proc, but since it has always been there, either you let the history table get huge or you restart the merge agent. Remembering to do this is a pain, so that's one reason why I use a repeating schedule!
When looking at the subscriber we may find that there are rows missing, even though synchronization works without errors. I mentioned BULK INSERT as one possible cause. Here we look at another: "fast loads".
If the developers are using SSIS to insert data into a merge-replicated table, open their SSIS packages and have a look at the data flow task:
In particular look at the properties of the OLEDB Destination:
The default data access mode is "Table or view - fast load". This works incredibly quickly, but it doesn't fire table triggers, and merge needs the triggers to know what data to replicate. So tell the devs to use "Table or view" and the data will replicate just fine :). I'll do another post explaining what to do if the data is already messed up and you need to repair it.