If you’re replicating using the peer-to-peer transactional setup, at some stage you’ll want to replicate a table which has identity columns. This is not prohibited but it comes with a warning from Microsoft: “We do not recommend the use of identity columns. When using identities, you must manually manage the ranges assigned to the tables at each participating database.”. This means that out of the box there is no support for identity columns so you’ll have to be careful to avoid any issues……
(1) Reseed the identity column at each peer
We can reseed the identity range manually using DBCC CHECKIDENT. With a little imagination you can assign ranges using the seeds which will never overlap, or use more sophisticated methods based on the seed and increment like e.g. odd and even allocations or even positive and negative allocations. While this works well, there are some obvious issues:
- the solution requires a lot of thought if there are many subscribers, in order to assign mutually-exclusive ranges
- there is a lack of transparency. Which peers have which ranges? Where is it all being managed from – a spreadsheet somewhere?
- Setting up new peers always involves manually reseeding the range.
(2) Use a composite key
The PK could be composed of 2 columns: the identity column and additionally a LocationID column. This means that the identity column is allowed to have overlapping values from different peers and there is no manual reseeding required – the default seed and increment of (1,1) is fine. The combination of the Identity column and the LocationID is always unique across peers, because each peer has a different LocationID. The added benefit is that the record is immediately identifiable in terms of the originator as we simply refer to the value of the LocationID.
(3) Use a persisted computed column
This is similar to the solution above and I came across this from the excellent RBI Dev DBA Group. In this case we now have three columns: the Identity column, the LocationID column and a third persisted computed column that combines the two:
LocationID | ID | PK_ID |
10000 | 1 | 10000+1=10001 |
10000 | 2 | 10000+2=10002 |
The PK_ID is the persisted computed column and is set as the primary key. Now we have a narrow primary key and also a guarantee of uniqueness across peers. To be honest I haven’t used persisted computed columns much and at first I thought that it wouldn’t work because the replication-generated insert stored procedure wouldn’t cater for the automatically-computed PK value, or that the updates to LocationID would be “confused” on synchronization with the PK itself having changed. Anyway, I tested it and it all works fine. The persisted column formula was simply the LocationID + Identity – with a large LocationID as a seed to ensure no overlap.
Conceptually this is very similar to (1) with the LocationID performing a similar function to the seed of the DBCC CHECKIDENT. However IMO it is far more preferable as it is completely transparent. If you chose to implement this solution you’ll just need to allocate a different default LocationID to each peer and ensure they don’t overlap. This can be done via an additional mapping table maintained in the publication which maps HostName to LocationID, and using a scalar function as the default for the LocationID column which performs the lookup. This means that a backup from one server can be used as a starting point for another peer with no intervention at all assuming the mapping table is already populated.