Scheduling Validations

In merge we want to check that the data is in sync. This is done by setting up a validation:

Checking the rowcounts and the checksums is the only surefire way to get a comparison:

Nothing happens until we run the merge agent and synchronize. At that point if the data is out of sync the job fails and we’ll get an error in the merge agent history.

The best thing to do is to have a separate job which regularly triggers the validation. I have one which runs each Monday morning and has the following type of text in it:

exec sp_validatemergepublication @publication= ‘pubTestMerge’, ¬†@level = 3

As a final point if the merge agent is run again afterwards, there is a successful sync and no mention of the invalid data.

So – perhaps the support team re-run the agent and the important info is lost. This sort of thing is done quite often because the merge agent might fail because of an innocuous network issue. Just something to be aware of.

Understanding Identity Ranges – when do they fill up?

For a table which is merge replicated and has an identity column what happens – how does it all work? Looking at the article properties we can see the range sizes allocated to the publisher and subscriber. Each one will get an allocated range with a different starting point to make sure they don’t overlap. Notice that the publisher range is a lot bigger than the subscriber but we can sort this out before publishing the table.

So – now for a quick question – what happens if I stop the merge agent from synchronizing and insert a single row in a loop at the publisher and do the same at the subscriber. Let’s say I try to insert 50,000 records this way. Remember the declared range sizes above are 10,000 and 1,000.

Do you have an idea in mind? Will it work or fail? If it fails, at what point will it fail?

The answer, which may be surprising:

  • Publisher will insert all 50,000 records
  • Subscriber will insert 2000 records then fail with the following message:

Msg 548, Level 16, State 2, Line 5
The insert failed. It conflicted with an identity range check constraint in database ‘TestRepSub’, replicated table ‘dbo.tCities’, column ‘ID’. If the identity column is automatically managed by replication, update the range as follows: for the Publisher, execute sp_adjustpublisheridentityrange; for the Subscriber, run the Distribution Agent or the Merge Agent.
The statement has been terminated.

This is interesting. First of all we can see that the range on the subscriber is actually twice the size we declared. This is always the case. Secondly, the publisher is able to assign a new range when needed on the fly, unlike the subscriber. For the subscriber we really need to synchronize to get the next range.

The logic for this is all in the merge triggers and check constraints for those interested.

One final point is that this scenario involves single updates. You’ll have a big problem if your code does the following sort of logic to insert a lot of rows:

insert into tcities ([City])
select city from tcities

Such an insert will fail even on the publisher and requires reworking of the logic to reduce the batch size. Make sure you hit the upper limit of the declared identity range in the insert and you’ll get a new range allocated. In reality, the only way to ensure this is by using single row inserts.

How to know that a conflict has happened?

Conflicts will happen in Merge Replication. There is no error on synchronization so we need to know when this occurs, before users complain! You can continually refresh the conflict viewer all day if this is your thing ūüôā or we find something slightly more sophisticated. So,¬†I have used 2 methods:

(1) simply check the table “msmerge_conflicts_info” using a select statement. If there are records in the table send an email to the support team using sp_send_dbmail. This can easily be scripted into a regularly-running job.

(2) create a custom alert. This is not as daunting as it seems. Here is one below so you can see which counters to use…

The response will be an email to the operator defined for the alert – just have an operator for the¬†support team. Now there are drop-downs in this interface there’s no problem creating this alert.

Merge Primary Key Conflicts – how do we fix them?

Suppose we have the publisher adds a record with a PK = ‘1’ and a subscriber also adds a record with a PK = ‘1’ and then they synchronize.

As far as merge is concerned these are different rows because they have different rowguid values.

The synchronization tells us there is an issue. No error recorded but we can see something is wrong:

So next we check the conflict viewer and see more info there;

You have 3 options in the conflict viewer;

Which one would you use to¬†fix the issue? Each one removes the conflict from the conflict monitor. However – the next synchronization will simply recreate the conflict! A PK-PK conflict like this cannot be solved using these tools. You need to either update the PK value on one node or delete one of the rows, and then sync up. If there are loads of FK records hanging off this record this can become a huge pain as you’ll need to tackle these first. In fact the longer you leave it the bigger the problem becomes. So – keep an eye on the conflicts and resolve them ASAP. Alternatively add another column and use a composite PK to avoid the conflict in the first place :).

Partition Realignment – is this what you want?

Just a heads-up on part of the merge replication behavior that you need to be aware of, if you are not already :).

Suppose we have three tables – Region, Company and Sales. These are related by Primary Key-Foreign Key relationships such that Regions have Companies and Companies have Sales.

In the publication we create a filter so that Subscriber “Sub_North” gets the North Region records and subscriber “Sub_South” gets the “South” records.

Suppose we change the ACME¬†company from the North to the South region….what happens? Well first of all subscriber Sub_South will now get the company¬†ACME and all the associated ACME sales. Meanwhile, subscriber sub_North will have the ACME company deleted and all the associated ACME sales will be deleted. It is like all the data is moved from one subscriber to another. If you are new to merge replication this might catch you by surprise and we have to be aware that such a change near the top of a hierarchy can result in the same volume of traffic as a reinitialization.

It is all controlled by a parameter to sp_addmergepublication and sp_changemergepublication¬†“@allow_partition_realignment”.

By default this parameter is set to ‘true’ meaning that¬†deletes are sent to the subscriber when modification of the row on the publisher causes it to change its partition. ‘false’ leaves the data from an old partition on the Subscriber for historical purposes. However this is not any longer possible in practice to set it as ‘false’ as in this case you’ll have to redesign the whole publication and reset all articles¬†to disable uploads.

So – although it isn’t practical to reset we just need to be aware of this behavior in our design :).

 

 

What is a Compensating Change?

When the merge agent synchronizes it might be the case that an insert, update or a delete cannot be applied at the partner. This could be because of a particular constraint or trigger.

What happens in this case. By default there is a conflict reported but no error. At this stage the data is out of sync and until we sort out the conflict it’ll stay like that.

There is an alternative though. The merge agent may send a compensating change back to the source partner to undo the failed change i.e. delete the source record in the case of a failed insert at the partner. This all occurs in the same synchronization session and there is no error.

In¬†sp_addmergearticle there is a special parameter which can enable this behaviour: @compensate_for_errors. By default it is set to ‘false’.

This is not exposed in the SQL Server Management Studio so you’ll need to set it using scripts : either sp_changemergearticle or sp_addmergearticle.

When @compensate_for_errors is set to true, there will be a conflict registered in the conflict viewer. In the case of an insert being compensated by a delete, the conflict viewer will show “A row insert at ‘xxx’ could not be propagated to ‘yyy’. This failure can be caused by a constraint violation. …”.¬†The offending row still exists at the xxx partner. On the next synchronization the row is removed from xxx by a compensating delete. We have data converged at this stage. Slight issue that the conflict still exists in the conflict viewer and needs manually removing but no big deal.

When initializing do I need to drop FKs at the subscriber?

This is an interesting question. In the publication properties there is the option on the snapshot tab to run extra scripts before and after the snapshot is applied. This is the same for both merge and transactional publications (below).

Many DBAs will have scripts which drop all the FKs on the subscriber and readd them after the snapshot is applied so the initialization runs smoothly and we don’t get the following sort of error:

Could not drop object ‘dbo.tCity’ because it is referenced by a FOREIGN KEY constraint“.

However the snapshot generation is different between Transactional and Merge. In Transactional all the FKs are dropped for you and re-added later on. This doesn’t happen for merge. There’s probably a good reason for it but can’t see why they should run differently at present. Anyway, the message is that you don’t need to roll your own logic to deal with subscriber FKs in Transactional but you still do in merge!

How to see all pending merge changes

There is a handy system stored procedure sp_showpendingchanges that will provide an “approximation” of the pending changes that need to be replicated¬†at that database. This proc will list inserts/updates/deletes per article and can also filter by the subscriber server. It has been around since SQL 2008 and is a welcome change. Would be nicer if it was represented in the replication monitor like the one for transactional replication though.

You can run the procedure without any arguments in which case it provides a summary.

You can also provide all arguments as below to look at a specific subscriber and table and see the rows waiting to be sent.

exec sp_showpendingchanges @destination_server = ‘DESKTOP-NSONKMO’
, @publication = ‘Testuse_partition_groups’
, @article = ‘tCompany’
, @show_rows = 1

However, if the subscriber database name is the same as the publisher one nothing will get returned. Not great as this is a common setup!

In this case you’ll need to roll your own script using something like the code I created below:

select tCompany.* from tCompany
inner join MSmerge_contents on tCompany.rowguid = MSmerge_contents.rowguid
inner join MSmerge_genhistory on MSmerge_contents.generation = MSmerge_genhistory.generation
where MSmerge_genhistory.genstatus = 0 and MSmerge_genhistory.generation >
(select recgen from sysmergesubscriptions where subscriber_server = ‘DESKTOP-NSONKMO\paulsinst’)