Managing Peer-to-Peer Identity Ranges

If you’re replicating using the peer-to-peer transactional setup, at some stage you’ll want to replicate a table which has identity columns. This is not prohibited, but it comes with a warning from Microsoft: “We do not recommend the use of identity columns. When using identities, you must manually manage the ranges assigned to the tables at each participating database.” In other words, out of the box there is no support for managing identity columns, so you’ll have to be careful to avoid any issues. Here are three approaches:

(1) Reseed the identity column at each peer

We can reseed the identity range manually at each peer using DBCC CHECKIDENT. With a little imagination you can assign seeds so the ranges never overlap, or use more sophisticated schemes based on the seed and increment, e.g. odd and even allocations, or even positive and negative allocations (a sketch follows the list below). While this works well, there are some obvious issues:

  1. The solution requires a lot of thought if there are many subscribers, in order to assign mutually-exclusive ranges.
  2. There is a lack of transparency. Which peers have which ranges? Where is it all being managed from – a spreadsheet somewhere?
  3. Setting up new peers always involves manually reseeding the range.
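
As a minimal sketch of the manual approach (the table name and the ranges here are just for illustration), you could give each peer a non-overlapping starting point, or stagger the increments:

-- Peer A: identity values start at 1
DBCC CHECKIDENT ('dbo.tOrder', RESEED, 1);

-- Peer B: identity values start at 10,000,000
DBCC CHECKIDENT ('dbo.tOrder', RESEED, 10000000);

-- Alternatively use the increment: odd values at one peer, even at the other,
-- e.g. IDENTITY(1,2) at peer A and IDENTITY(2,2) at peer B when the table is created.
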
(2) Use a composite key

The PK could be composed of two columns: the identity column plus a LocationID column. This means the identity column is allowed to have overlapping values at different peers and no manual reseeding is required – the default seed and increment of (1,1) is fine. The combination of the identity column and the LocationID is always unique across peers, because each peer has a different LocationID. The added benefit is that the record is immediately identifiable in terms of its originator – we simply refer to the value of the LocationID.
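
For example (the table and column names are hypothetical), the table at every peer could look like this, with only the LocationID default differing per peer:

CREATE TABLE dbo.tOrder
(
    OrderID     int IDENTITY(1,1) NOT NULL,    -- same (1,1) seed/increment everywhere
    LocationID  int NOT NULL DEFAULT (10000),  -- a different constant at each peer
    OrderDate   datetime NOT NULL,
    CONSTRAINT PK_tOrder PRIMARY KEY (LocationID, OrderID)
);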

(3) Use a persisted computed column

This is similar to the solution above, and I came across it via the excellent RBI Dev DBA Group. In this case we now have three columns: the Identity column, the LocationID column and a third persisted computed column that combines the two:

LocationID   ID   PK_ID
10000        1    10000 + 1 = 10001
10000        2    10000 + 2 = 10002

The PK_ID is the persisted computed column and is set as the primary key. Now we have a narrow primary key and also a guarantee of uniqueness across peers. To be honest I haven’t used persisted computed columns much, and at first I thought it wouldn’t work – either because the replication-generated insert stored procedure wouldn’t cater for the automatically-computed PK value, or because updates to LocationID would get “confused” on synchronization, with the PK itself having changed. Anyway, I tested it and it all works fine. The persisted column formula was simply LocationID + Identity, with a large LocationID as a seed to ensure no overlap.
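
A minimal sketch of the same hypothetical tOrder table, restructured for this approach (column names follow the little table above):

CREATE TABLE dbo.tOrder
(
    ID          int IDENTITY(1,1) NOT NULL,
    LocationID  int NOT NULL,                           -- per-peer value, default added separately (see below)
    PK_ID       AS (LocationID + ID) PERSISTED NOT NULL, -- use a LocationID large enough that ranges never overlap
    OrderDate   datetime NOT NULL,
    CONSTRAINT PK_tOrder PRIMARY KEY (PK_ID)
);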

Conceptually this is very similar to (1), with the LocationID performing a similar function to the seed of the DBCC CHECKIDENT. However, IMO it is far preferable as it is completely transparent. If you choose to implement this solution you’ll just need to allocate a different default LocationID to each peer and ensure they don’t overlap. This can be done via an additional mapping table maintained in the publication which maps HostName to LocationID, with a scalar function used as the default for the LocationID column to perform the lookup. This means that a backup from one server can be used as the starting point for another peer with no intervention at all, assuming the mapping table is already populated.
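
A sketch of that lookup (the mapping table, function and constraint names are my own invention; I’ve keyed on @@SERVERNAME here, so adjust if you map on client host names instead):

-- Replicated mapping table: one row per peer
CREATE TABLE dbo.tLocationMap
(
    HostName    sysname NOT NULL PRIMARY KEY,
    LocationID  int NOT NULL
);
GO

CREATE FUNCTION dbo.fn_GetLocationID ()
RETURNS int
AS
BEGIN
    -- Look up this peer's LocationID; @@SERVERNAME used as the key here
    RETURN (SELECT LocationID FROM dbo.tLocationMap WHERE HostName = @@SERVERNAME);
END;
GO

-- Use the function as the default for the LocationID column
ALTER TABLE dbo.tOrder
    ADD CONSTRAINT DF_tOrder_LocationID DEFAULT (dbo.fn_GetLocationID()) FOR LocationID;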

How fast is my Transactional replication throughout the day?

If you want to test the speed of your Transactional replication there is a neat piece of functionality called Tracer Tokens. This will tell you how quickly a record gets from the publisher to the distributor, and from there to the subscriber. In the old days you’d have to create a dummy table to do all this, but since SQL 2005 it’s out of the box :).

To add a tracer token use the replication monitor and edit the Transactional publication as below:

Basically, a token is written to the transaction log of the publisher. This is treated as a normal replicated command, and passes from the transaction log to the distribution database and then is ‘run’ by the distribution agent.

You’ll see the various latency times recorded for each subscription. If you have a distribution agent set to work on a schedule (i.e. not running continuously) you’ll see the following sort of scenario until the distribution agent runs.

To add one programmatically for reporting purposes, just roll the following type of script:

DECLARE @myTokenID AS int;
EXEC sp_posttracertoken
    @publication = 'YourPubName',
    @tracer_token_id = @myTokenID OUTPUT;
SELECT @myTokenID;

Trapping the TokenID is actually optional but is very useful if you want to check on the resulting latencies for this particular token. So,

EXEC sp_helptracertokenhistory
    @publication = 'YourPubName',
    @tracer_id = -2147483641;

where -2147483641 is the ID of the tracer token.
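
If you didn’t capture the ID at the time, sp_helptracertokens will list the tokens that have been posted for the publication along with their IDs:

EXEC sp_helptracertokens @publication = 'YourPubName';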


Or just roll your own report based on:

select * from distribution..MStracer_tokens

This way you can create a custom alert so the support team knows when there is an issue. There are SQL Agent alerts for the Log Reader and Distribution Agents which can be configured, but the method above is a little more under our control and can be used for reporting purposes. This way you can add a tracer token every 10 minutes and plot performance over the course of the day.
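
For the report itself, something along these lines works (a rough sketch – the MStracer_history column names are from memory, so verify them against your distribution database):

-- Overall latency per token, publisher -> subscriber
SELECT  t.tracer_id,
        t.publisher_commit,
        h.distributor_commit,
        h.subscriber_commit,
        DATEDIFF(second, t.publisher_commit, h.subscriber_commit) AS total_latency_secs
FROM    distribution..MStracer_tokens  AS t
JOIN    distribution..MStracer_history AS h
        ON h.parent_tracer_id = t.tracer_id
ORDER BY t.publisher_commit;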

How do we know that a conflict has happened?

Conflicts will happen in Merge Replication. There is no error on synchronization, so we need to know when one occurs before users complain! You can continually refresh the conflict viewer all day if that’s your thing 🙂 or we can find something slightly more sophisticated. I have used two methods:

(1) Simply check the table “MSmerge_conflicts_info” using a select statement. If there are records in the table, send an email to the support team using sp_send_dbmail. This can easily be scripted into a regularly-running job.
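
A rough sketch of such a job step (the mail profile and recipient are placeholders):

-- Runs in the publication (or subscription) database on a schedule
IF EXISTS (SELECT 1 FROM dbo.MSmerge_conflicts_info)
BEGIN
    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = 'DBA Mail Profile',         -- placeholder
        @recipients   = 'support@yourcompany.com',  -- placeholder
        @subject      = 'Merge replication conflicts detected',
        @body         = 'Rows exist in MSmerge_conflicts_info - please check the conflict viewer.';
END;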

(2) Create a custom alert. This is not as daunting as it seems. Here is one below so you can see which counters to use…

The response will be an email to the operator defined for the alert – just set up an operator for the support team. Now that there are drop-downs in this interface, there’s no problem creating this alert.

Merge Primary Key Conflicts – how do we fix them?

Suppose the publisher adds a record with PK = ‘1’, a subscriber also adds a record with PK = ‘1’, and then they synchronize.

As far as merge is concerned these are different rows because they have different rowguid values.

The synchronization tells us there is an issue. No error is recorded, but we can see something is wrong:

So next we check the conflict viewer and see more info there:

You have 3 options in the conflict viewer:

Which one would you use to fix the issue? Each one removes the conflict from the conflict monitor. However, the next synchronization will simply recreate the conflict! A PK-PK conflict like this cannot be solved using these tools. You need to either update the PK value on one node or delete one of the rows, and then sync up. If there are loads of FK records hanging off the record, this can become a huge pain as you’ll need to tackle those first. In fact, the longer you leave it, the bigger the problem becomes. So – keep an eye on the conflicts and resolve them ASAP. Alternatively, add another column and use a composite PK to avoid the conflict in the first place :).

Partition Realignment – is this what you want?

Just a heads-up on part of the merge replication behavior that you need to be aware of, if you are not already :).

Suppose we have three tables – Region, Company and Sales. These are related by Primary Key-Foreign Key relationships such that Regions have Companies and Companies have Sales.

In the publication we create a filter so that subscriber “Sub_North” gets the North region records and subscriber “Sub_South” gets the South region records.

Suppose we change the ACME company from the North to the South region… what happens? Well, first of all, subscriber Sub_South will now get the ACME company and all the associated ACME sales. Meanwhile, subscriber Sub_North will have the ACME company deleted, along with all the associated ACME sales. It is as though all the data is moved from one subscriber to another. If you are new to merge replication this might catch you by surprise, and we have to be aware that such a change near the top of a hierarchy can result in the same volume of traffic as a reinitialization.

It is all controlled by the “@allow_partition_realignment” parameter of sp_addmergepublication and sp_changemergepublication.

By default this parameter is set to ‘true’, meaning that deletes are sent to the subscriber when a modification of a row on the publisher causes it to change its partition. Setting it to ‘false’ leaves the data from the old partition on the subscriber for historical purposes. In practice, however, setting it to ‘false’ is rarely feasible, as you’d have to redesign the whole publication and reset all the articles to disable uploads.
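
For reference, the property is changed like this (a hedged example – as noted above, ‘false’ also requires the articles to have uploads disabled, and the change typically forces a reinitialization, hence the last parameter):

EXEC sp_changemergepublication
    @publication = 'YourMergePub',
    @property    = 'allow_partition_realignment',
    @value       = 'false',
    @force_reinit_subscription = 1;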

So – although it isn’t practical to reset it, we just need to be aware of this behavior in our design :).


What is a Compensating Change?

When the merge agent synchronizes it might be the case that an insert, update or a delete cannot be applied at the partner. This could be because of a particular constraint or trigger.

What happens in this case? By default a conflict is reported but no error. At this stage the data is out of sync, and until we sort out the conflict it’ll stay like that.

There is an alternative though. The merge agent may send a compensating change back to the source partner to undo the failed change i.e. delete the source record in the case of a failed insert at the partner. This all occurs in the same synchronization session and there is no error.

In sp_addmergearticle there is a special parameter which can enable this behaviour: @compensate_for_errors. By default it is set to ‘false’.

This is not exposed in SQL Server Management Studio, so you’ll need to set it using scripts: either sp_changemergearticle or sp_addmergearticle.
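
For an existing article, the change looks something like this (the publication and article names are placeholders):

EXEC sp_changemergearticle
    @publication = 'YourMergePub',
    @article     = 'YourArticle',
    @property    = 'compensate_for_errors',
    @value       = 'true',
    @force_invalidate_snapshot = 0,
    @force_reinit_subscription = 0;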

When @compensate_for_errors is set to true, a conflict is still registered in the conflict viewer. In the case of an insert being compensated by a delete, the conflict viewer will show “A row insert at ‘xxx’ could not be propagated to ‘yyy’. This failure can be caused by a constraint violation. …”. The offending row still exists at the xxx partner. On the next synchronization the row is removed from xxx by a compensating delete, and the data is converged at that stage. The slight issue is that the conflict still exists in the conflict viewer and needs removing manually, but that’s no big deal.

“Keep existing table unchanged” – why would I ever want to use this?

There is an option on the article properties to “keep existing object unchanged”.

This seems a little odd at first. Surely we always drop all subscriber objects when we reinitialize? Well, for a particular table we could have something different in mind. It’s a real edge case, but there is a scenario this is designed for: multiple publishers publishing to a single subscriber. The table is created on the subscriber by the first publisher during initialization, and hence the replication objects are created at the subscriber. Subsequent publishers will send their commands and records there but leave the table in place. This can be used for centralised reporting from several municipal offices to head office. Any DDL changes, such as adding a column, are fine – we do them on the first publisher (the one which did the initial drop and recreate at the subscriber) so the DDL change goes to the subscriber.
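
As far as I know, this GUI option corresponds to the article’s @pre_creation_cmd being ‘none’ when scripting, e.g. for a transactional article (the publication and table names are placeholders):

EXEC sp_addarticle
    @publication      = 'YourPubName',
    @article          = 'tSales',
    @source_object    = 'tSales',
    @source_owner     = 'dbo',
    @pre_creation_cmd = 'none';  -- keep the existing subscriber table unchanged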

When initializing do I need to drop FKs at the subscriber?

This is an interesting question. In the publication properties there is the option on the snapshot tab to run extra scripts before and after the snapshot is applied. This is the same for both merge and transactional publications (below).

Many DBAs will have scripts which drop all the FKs on the subscriber and re-add them after the snapshot is applied, so the initialization runs smoothly and we don’t get the following sort of error:

“Could not drop object ‘dbo.tCity’ because it is referenced by a FOREIGN KEY constraint”.

However, snapshot generation is different between Transactional and Merge. In Transactional, all the FKs are dropped for you and re-added later on. This doesn’t happen for Merge. There’s probably a good reason for it, but I can’t see why they should behave differently at present. Anyway, the message is that you don’t need to roll your own logic to deal with subscriber FKs in Transactional, but you still do in Merge!
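
If you do roll your own FK scripts for merge, they can be attached to the publication so they run around snapshot delivery. A sketch, assuming the properties are exposed through sp_changemergepublication (they can also be supplied to sp_addmergepublication as @pre_snapshot_script and @post_snapshot_script); the UNC paths are placeholders:

EXEC sp_changemergepublication
    @publication = 'YourMergePub',
    @property    = 'pre_snapshot_script',
    @value       = '\\FileServer\ReplScripts\DropSubscriberFKs.sql';

EXEC sp_changemergepublication
    @publication = 'YourMergePub',
    @property    = 'post_snapshot_script',
    @value       = '\\FileServer\ReplScripts\RecreateSubscriberFKs.sql';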