SQL Data Sync and Conflicts

Hi all. I just wanted to do a few articles around the use of SQL Data Sync as most of my work these days seems to be based on the cloud versions of SQL in one form or another. For those who don’t know, SQL Data Sync is the native bi-directional replication framework which is provided for use in SQL Azure.

Data is replicated between a hub (publisher) and a member (subscriber). Funny how the groups who work on replication, data sync, log shipping and failover groups all choose quite different names for these same logical concepts! For those who haven’t seen SQL Data Sync before is a bit like a cross between merge and peer-to-peer transactional replication but it has a few extra idiosyncrasies. First of all the data changes are logged by insert, update and delete triggers to audit tables – one audit table for each user table. There is no requirement to have a rowguid column but each table must have a primary key. The sync framework which is used is provided as a service in Azure SQL so we have done away with the idea of traditional agents. The synchronization can be triggered manually in the portal or set up with a schedule – also in the portal. With SQL Data Sync the tables are created on the member servers without FK constraints if the corresponding PK table isn’t being included in the Synchronization group. I won’t go over all the details here – that’s just enough so that you can broadly compare this architecture to vanilla replication and see how I did the tests below. I’ll do other articles on setting it all up, making schema changes, logging and other topics later on. For now I want to concentrate on the different data change use cases, the conflict resolution methodology and the behavior we should expect. Remember that conflicts in merge replication and peer-to-peer transactional replication are sometimes resolved in strange ways and attempted conflict resolution can cause agent failures. Let’s see how the SQL Data Sync compares….

In my setup I included 2 simple Northwind tables – Regions and Territories.

For some conflict tests I needed a single table so only one of the two was included. I specified that in any conflict resolution the Hub would win over the Member. These are the scenarios I examined…

Update-Update

Here an update of the territorydescription on the Hub simply overwrites a different territorydescription update made on the Member. The Hub was set to win in conflicts so that is as expected.

Update-Delete

A Territories record was deleted on the Hub and the same record updated on the Member. The delete on the Hub removes the updated record on the Member. If we did the inverse – the update was done on the Hub and the delete at the Member – the updated record is reinserted on the Member. This is interesting and potentially very useful as it means that updates can be used to replace missing records. I’ll do another separate article soon on how records can become “missing” using SQL Data Sync!

Failed Syncs
(a) Failed Insert at Hub

The insert from Member could not be applied at the Hub. For this test I just used the Territories table in the Sync Group, but the Hub had both Territories and Region tables with a FK constraint between them. In this case an insert of a Territory on the Member fails on the Hub because the associated Region record (10) is missing:

Insert Into Territories Values (‘xxx’,’bbb’,10)

The error message below is returned.

Surprisingly, this is not resolved with a delete of the offending record at the Member. From the message we don’t know which is the problem record but remember that these changes are recorded. In this case the insert trigger will have written a row to the tracking table. To find the problem record we can examine the Territories_dss_tracking table and look at the latest records which haven’t been synced.

SELECT *
FROM [DataSync].[Territories_dss_tracking]
where scope_update_peer_key is null
order by last_change_datetime desc

Then we delete the offending record from the Territories table at the Member database, resync and all is well.

 

(b) Failed Delete at Hub

For this test I restricted the Sync Group to only include the Region table. Both tables were still on the Hub with a FK relationship between them, but only the Region table was on the Member. Deleting the “Eastern” region record on the Member syncs back to the Hub as a failed Sync due to a FK violation as there are related Territories. Again, the record is not reinserted at the Member and you’ll have to try to find the details of the offending delete and manually sort it out.

 

SELECT *
FROM [DataSync].[Region_dss_tracking]
where sync_row_is_tombstone = 1
order by last_change_datetime desc

Interestingly I had assumed that I’d have to edit the tracking table data. I’d assumed that a reinsert of the “Eastern” region at the Member would fail because syncing would try to add a duplicate “Eastern” region at the Hub. However that is not the case. You can reinsert at the Member, sync and all is ok. In fact you could add a region called “East” with a PK of 1 on the Member and this works fine and syncs up as an update at the Hub.

Conclusions

Hopefully you can get a sense of how data changes are being propagated. As I mentioned above it is quite distinct from peer-to-peer transactional and merge replication conflict behavior. In some cases this might work out of the box in an “everyone can change everything” sort of topology, but I’d expect not and you’ll need to understand what the logged messages mean and how to fix any data non-convergence afterwards. I’m hoping that this article will help in this sort of case!

Leave a Reply

Your email address will not be published.