As promised, here is the 2nd article in my series based on SQL Data Sync. SQL Data Sync is the replication framework provided as a cloud service in Azure. You can find more background details in my article on Conflicts in SQL Data Sync and I’ll do another article on the basics of setting it all up later on. In this article we’ll look at various Sync Group and schema changes – adding and removing tables and making changes to existing tables – and see how this works and what we have to be careful with.
Adding a new table
Just select the ellipsis button on the tables icon then put a tick in the box of the new table article. If it isn’t shown there you can refresh the schema.
Nothing happens immediately – you’ll need to Sync up for the additional table to propagate to the Member servers.
If you need to check exactly what is in the Sync group – tables, columns, triggers etc – you can run this script at the Member :
SELECT config_data
FROM [DataSync].[scope_config_dss]
In our case this will return an xml string which can be opened in SSMS to show that the new table has been added to the group.
We have to be a little careful here – if the table already exists on the Member, it will not be dropped and recreated, unlike during the default application of a snapshot in traditional replication. There is no option to force a drop and create. SQL Data Sync assumes that if the table is already there, the data is in sync. So if there are data discrepancies they will remain in the Member table and can cause problems later on.
You could fix the data yourself provided you know how to find the data discrepancies. Updating all the rows on the Hub will add in any missing ones at the Member, and update any existing but different rows. Extra rows at the Member will be harder to sort out in a simple fashion.
So unless you are 100% sure you understand how to find and fix all data differences between the hub and the Member, it is safest to delete the table at the Member and then add it in from scratch by relying on an initial Sync.
As mentioned above, if the table doesn’t already exist on the Member, it will be created from scratch and populated, so all is well.
Removing a Table
Deselecting a table from the Sync Group and starting a Sync will remove the user table triggers and the tracking table at the Member. The user table itself is not removed from the Member.
Adding a Column
The new column must be added on the Hub via an Alter Table command. Making SQL Data Sync aware of this schema change is not automatic – there’s no equivalent of @replicate_ddl with SQL Data Sync. You can create a DDL trigger to alert you about schema changes if you are concerned it’ll happen without you being aware of it (details here). Once the extra column is added – in this example I add it to the Region table – you’ll need to let SQL Data Sync know it has been added. But we must first add the extra column to the Member servers as well as the Hub – all manually.
Once the extra column has been added, in the portal we click on “Refresh Schema” and put a tick in the box for the new column “testextracol”.
After running a Sync, changes will propagate as expected, because the new column is part of the SQL Data Sync schema. We can check that this is the case as before using:
SELECT config_data
FROM [DataSync].[scope_config_dss]
If you add the column on the Hub and then forget to add it to the Sync Group there is no real issue. The Sync Group treats it like a vertical partition. And remember that the tracking table only really holds the PK values – not all the other columns and data.
Removing a Column
Removing a column in the portal changes the scope_config – the table used when the comparisons are made at a Sync. The tracking table holds the PK values only, so the column removal doesn’t make a big change. Remember however that we are really changing the metadata of the Sync Group. Removing a column in the portal will not remove the actual column in the table at the Member. This is all well if the column is NULLable, but if eg I remove the RegionDescription column at the Hub and then do an insert, the insert will fail during the Sync as mentioned below. So the simplest solution is that either we replicate the whole table, or the schemas at the Member should match what is defined on the portal.
Changing a Column
Changing datatypes can be troublesome sometimes. Suppose on the Hub I increase the RegionDescription column from varchar(50) to varchar(500). Then I edit a RegionDescription value to be 100 characters long. Syncing up will result in a truncation error.
Although not mentioned explicitly 8152 means “String or binary data would be truncated”. So – we have to either keep the schemas in sync, or the data has to match the lowest column specifications.
Well – that’s all for now. I’ll add in any other schema change idiosyncrasies as I come across them.