SQL Data Sync – Group Changes and Schema Changes

As promised, here is the second article in my series on SQL Data Sync – the replication framework provided as a cloud service in Azure. You can find more background in my article on Conflicts in SQL Data Sync, and I’ll do another article later on the basics of setting it all up. In this article we’ll look at various Sync Group and schema changes – adding and removing tables, and making changes to existing tables – and see how it all works and what we have to be careful with.

Adding a new table

Just select the ellipsis button on the tables icon, then put a tick in the box for the new table. If the table isn’t shown there, you can refresh the schema.

Nothing happens immediately – you’ll need to run a Sync for the additional table to propagate to the Member servers.

If you need to check exactly what is in the Sync Group – tables, columns, triggers and so on – you can run this script at the Member:

SELECT config_data
FROM [DataSync].[scope_config_dss]

In our case this returns an XML string which can be opened in SSMS to show that the new table has been added to the group.
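For readability you can cast the column on the fly – a small convenience query, assuming config_data holds well-formed XML as it did in my setup:

SELECT CAST(config_data AS xml) AS config_xml
FROM [DataSync].[scope_config_dss]

Clicking the result in SSMS then opens the XML directly, and you can search it for the new table name.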

We have to be a little careful here – if the table already exists on the Member, it will not be dropped and recreated, unlike during the default application of a snapshot in traditional replication. There is no option to force a drop and create. SQL Data Sync assumes that if the table is already there, the data is in sync. So if there are data discrepancies they will remain in the Member table and can cause problems later on.

You could fix the data yourself provided you know how to find the data discrepancies. Updating all the rows on the Hub will add in any missing ones at the Member, and update any existing but different rows. Extra rows at the Member will be harder to sort out in a simple fashion.
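One way to touch every row on the Hub so that the update trigger logs a change for each of them is a self-assignment – a sketch using the Region table, on the assumption that a no-op update is enough to fire the trigger and mark the rows for the next Sync:

UPDATE dbo.Region
SET RegionDescription = RegionDescription

This will push all Hub rows out, adding missing rows and overwriting different ones at the Member – but, as noted, it won’t remove extra rows that exist only at the Member.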

So unless you are 100% sure you understand how to find and fix all data differences between the hub and the Member, it is safest to delete the table at the Member and then add it in from scratch by relying on an initial Sync.

As mentioned above, if the table doesn’t already exist on the Member, it will be created from scratch and populated, so all is well.

Removing a Table

Deselecting a table from the Sync Group and starting a Sync will remove the user table triggers and the tracking table at the Member. The user table itself is not removed from the Member.
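You can confirm at the Member that the plumbing has gone – a sketch, assuming the usual Data Sync naming convention with _dss_ in the trigger names:

SELECT name
FROM sys.triggers
WHERE name LIKE '%_dss_%'

After the Sync, the triggers for the deselected table should no longer appear, and its tracking table should be gone from the DataSync schema. The user table itself you drop manually if you no longer want it.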

Adding a Column

The new column must be added via an ALTER TABLE command – and not just on the Hub, but manually on every Member server too. Making SQL Data Sync aware of this schema change is not automatic: there’s no equivalent of @replicate_ddl in SQL Data Sync. You can create a DDL trigger to alert you about schema changes if you are concerned they’ll happen without you being aware of it (details here). Once the extra column is added – in this example I add it to the Region table – you’ll need to let SQL Data Sync know about it in the portal.
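The manual change itself is just an ordinary column addition, run on the Hub and on each Member – this is the kind of statement I mean, the varchar(50) NULL definition being an arbitrary choice for the test:

ALTER TABLE dbo.Region
ADD testextracol varchar(50) NULL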

Once the extra column has been added, in the portal we click on “Refresh Schema” and put a tick in the box for the new column “testextracol”.

After running a Sync, changes will propagate as expected, because the new column is part of the SQL Data Sync schema. We can check that this is the case as before using:

SELECT config_data
FROM [DataSync].[scope_config_dss]

If you add the column on the Hub and then forget to add it to the Sync Group, there is no real issue – the Sync Group simply treats the table like a vertical partition. And remember that the tracking table only really holds the PK values, not all the other columns and data.

Removing a Column

Removing a column in the portal changes the scope_config – the table consulted when the comparisons are made at a Sync. The tracking table holds only the PK values, so the column removal isn’t a big change there. Remember, however, that we are really only changing the metadata of the Sync Group: removing a column in the portal will not remove the actual column from the table at the Member. This is all well and good if the column is NULLable, but if, for example, I remove the RegionDescription column from the Sync Group and then do an insert at the Hub, the insert will fail during the Sync because the NOT NULL column at the Member receives no value. So the simplest solution is that either we replicate the whole table, or the schemas at the Member should match what is defined in the portal.
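To illustrate the failure case, suppose RegionDescription is NOT NULL at the Member but has been removed from the Sync Group. An insert at the Hub like this one – the values are just made up for the example:

INSERT INTO dbo.Region (RegionID, RegionDescription)
VALUES (5, 'Northern')

syncs only the scoped columns, so the insert applied at the Member has no value to put in RegionDescription and fails.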

Changing a Column

Changing datatypes can sometimes be troublesome. Suppose on the Hub I increase the RegionDescription column from varchar(50) to varchar(500), and then edit a RegionDescription value to be 100 characters long. Syncing up will result in a truncation error, because the Member column is still varchar(50).

Although the error text doesn’t spell it out, 8152 is the SQL Server error number for “String or binary data would be truncated”. So we either have to keep the schemas in sync, or the data has to fit the smallest column definition in the topology.
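For reference, the sequence that produces the truncation error is simply this, run on the Hub only – REPLICATE just builds a 100-character string, and RegionID 1 is an example key:

-- widen the column on the Hub but not the Member
ALTER TABLE dbo.Region
ALTER COLUMN RegionDescription varchar(500)

-- now make a value too long for the Member's varchar(50)
UPDATE dbo.Region
SET RegionDescription = REPLICATE('x', 100)
WHERE RegionID = 1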

Well – that’s all for now. I’ll add in any other schema change idiosyncrasies as I come across them.

SQL Data Sync and Conflicts

Hi all. I just wanted to do a few articles around the use of SQL Data Sync, as most of my work these days seems to be based on the cloud versions of SQL Server in one form or another. For those who don’t know, SQL Data Sync is the native bi-directional replication framework provided for use in Azure SQL.

Data is replicated between a hub (publisher) and a member (subscriber). Funny how the groups who work on replication, data sync, log shipping and failover groups all chose quite different names for the same logical concepts! For those who haven’t seen it before, SQL Data Sync is a bit like a cross between merge and peer-to-peer transactional replication, but it has a few extra idiosyncrasies.

First of all, data changes are logged by insert, update and delete triggers into tracking tables – one tracking table for each user table. There is no requirement for a rowguid column, but each table must have a primary key. The sync framework itself is provided as a service in Azure SQL, so the idea of traditional agents has gone. Synchronization can be triggered manually in the portal or set up on a schedule, also in the portal. Tables are created on the Member servers without FK constraints if the corresponding PK table isn’t included in the Sync Group.

I won’t go over all the details here – that’s just enough so that you can broadly compare this architecture to vanilla replication and see how I did the tests below. I’ll do other articles on setting it all up, making schema changes, logging and other topics later on. For now I want to concentrate on the different data change use cases, the conflict resolution methodology and the behavior we should expect. Remember that conflicts in merge replication and peer-to-peer transactional replication are sometimes resolved in strange ways, and attempted conflict resolution can cause agent failures. Let’s see how SQL Data Sync compares…
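If you want to poke around at the objects SQL Data Sync provisions, a quick metadata query does it – a sketch, assuming the objects sit in the DataSync schema as they did in my setup:

SELECT o.type_desc, o.name
FROM sys.objects o
JOIN sys.schemas s ON s.schema_id = o.schema_id
WHERE s.name = 'DataSync'
ORDER BY o.type_desc, o.name

You’ll see the tracking tables, the scope configuration tables and assorted stored procedures.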

In my setup I included two simple Northwind tables – Region and Territories.

For some conflict tests I needed a single table so only one of the two was included. I specified that in any conflict resolution the Hub would win over the Member. These are the scenarios I examined…

Update-Update

Here an update of the TerritoryDescription on the Hub simply overwrites a different TerritoryDescription update made on the Member. The Hub was set to win conflicts, so that is as expected.

Update-Delete

A Territories record was deleted on the Hub and the same record updated on the Member. The delete on the Hub removes the updated record on the Member. If we do the inverse – the update on the Hub and the delete at the Member – the updated record is reinserted on the Member. This is interesting, and potentially very useful, as it means that updates can be used to replace missing records. I’ll do a separate article soon on how records can become “missing” in SQL Data Sync!
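For reference, the second case was just this pair of statements before a Sync – TerritoryID '1581' being an example key:

-- at the Hub
UPDATE Territories
SET TerritoryDescription = 'changed at hub'
WHERE TerritoryID = '1581'

-- at the Member, the same row
DELETE FROM Territories
WHERE TerritoryID = '1581'

After the Sync the row is back at the Member, carrying the Hub’s update.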

Failed Syncs
(a) Failed Insert at Hub

The insert from the Member could not be applied at the Hub. For this test I included just the Territories table in the Sync Group, but the Hub had both the Territories and Region tables with a FK constraint between them. In this case an insert of a Territory on the Member fails at the Hub because the associated Region record (10) is missing:

Insert Into Territories Values ('xxx','bbb',10)

An error message reporting the FK violation is returned.

Surprisingly, SQL Data Sync does not resolve this by simply deleting the offending record at the Member. From the message we don’t know which record is the problem, but remember that these changes are recorded: the insert trigger will have written a row to the tracking table. To find the problem record we can examine the Territories_dss_tracking table and look at the latest records which haven’t been synced.

SELECT *
FROM [DataSync].[Territories_dss_tracking]
WHERE scope_update_peer_key IS NULL
ORDER BY last_change_datetime DESC

Then we delete the offending record from the Territories table at the Member database, resync and all is well.
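Having found the problem row in the tracking table, the fix is just to remove it from the user table – using the key from the failed insert above:

DELETE FROM Territories
WHERE TerritoryID = 'xxx'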


(b) Failed Delete at Hub

For this test I restricted the Sync Group to only the Region table. Both tables were still on the Hub with a FK relationship between them, but only the Region table was on the Member. Deleting the “Eastern” region record on the Member results in a failed Sync back at the Hub due to a FK violation, as there are related Territories. Again, the record is not reinserted at the Member, and you’ll have to find the details of the offending delete and sort it out manually.


SELECT *
FROM [DataSync].[Region_dss_tracking]
WHERE sync_row_is_tombstone = 1
ORDER BY last_change_datetime DESC

Interestingly, I had assumed I’d have to edit the tracking table data – that a reinsert of the “Eastern” region at the Member would fail because syncing would try to add a duplicate “Eastern” region at the Hub. However, that is not the case: you can reinsert at the Member, sync, and all is OK. In fact, you could even insert a region called “East” with a PK of 1 on the Member, and this works fine and syncs up as an update at the Hub.
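Both variants of the reinsert, for reference – PK 1 is the “Eastern” region’s key in my copy of Northwind:

-- straight reinsert at the Member: syncs back up cleanly
INSERT INTO Region VALUES (1, 'Eastern')

-- different data under the same PK: syncs up as an update at the Hub
INSERT INTO Region VALUES (1, 'East')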

Conclusions

Hopefully this gives you a sense of how data changes are propagated. As I mentioned above, the behavior is quite distinct from peer-to-peer transactional and merge replication conflict behavior. In some cases this might work out of the box in an “everyone can change everything” sort of topology, but more often you’ll need to understand what the logged messages mean and how to fix any data non-convergence afterwards. I’m hoping this article will help in those cases!