Average 28000 requests/month












 

Replication Republishing

Introduction

Republishing in a replication scenario occurs when a subscriber to articles from a publisher is itself a publisher of these same articles. In transactional replication, changes made at the publisher will propagate in a double-hop to the final subscriber. In merge replication, changes may be made by any link in the chain and will in turn propagate to all other links. This sounds like an unnecessarily complicated setup, so why would you want to arrange this topology? The most obvious case is when you have a particularly slow network connection. By using republishing, you ensure that data is only sent once over the slow link, and faster distribution occurs at the other side of the link:

The shipping industry is the main area where I have come across the need for this technology. The satellite links to a ship are slow and often unreliable (large waves obscure the receiver and consequently break the connection), so rather than replicating from head office to several databases on each ship, we replicate once to the ship and then republish. 
This article will examine how to set up republishing on SQL Server 2005 although the basic setup is much the same for SQL Server 2000.

Setting it up

For the most part, the setup is straightforward, provided you pay attention to a few key areas. Let's look at the initial data migration then the required 'tweaks' to transactional and merge replication separately:

The Initialization Process

Provided the topology is set up from left to right in the above diagram, there are not really any difficulties, and normal initializations can be used. Of course this is the easiest method. In the case of P1 being set up later than P2, we could use a backup and restore method, however this becomes very complicated and is IMO best left alone. For a start, P2 must be treated as RO or else there will be non-convergence of data. Next, there will be issues of identity columns because in the correct setup only tables on P1 should have the identity attribute, and not P2, S3 or S4.

Merge

(a) type of subscription

There are 2 types of merge subscription in SQL Server 2005: "Client" and "Server". Client subscribers assume the priority of the publisher during conflict resolution with other subscribers, while Server subscribers are each assigned priority numbers, with the highest values winning. In SQL Server 2000 these are referred to as "Local" and "Global" (Local=Client, Global=Server). Only Server subscribers  (those having priority numbers) can republish data to other subscribers. So, in the diagram above, P2(S) should be set up as a subscriber to P1 with a priority number assigned before it is allowed to republish. Failure to do so will result in the following message (at the bottom). This error message still uses the terminology of SQL Server 2000, although the meaning is clear.

So, the subscriber P2(S) must have a priority number assigned, as must S3 and S4. The particular values must also be chosen carefully: P1 > P2(S) > S3/4. If you've partitioned the data to ensure no conflicts, this is probably a bit academic, however it is still mandatory.

(b) Identity Ranges

Identity ranges are managed by the publisher and propagated to subscribers by the merge agent. As we have a republishing hierarchy, the ranges are managed both by the root publisher and the republisher. As per usual, the identity values are assigned from a pool at the publisher, but the subscribers S3 and S4 must be assigned their range from the P2(S) values. If the subscriber PS(2) has a range of 100 assigned to it from the Publisher P1, this means that PS(2), S3 and S4 (and any other outer subscribers) must together amount to 100 (controlled by the article properties of the republisher). This means that the size of the range itself limits the number of outer subscribers - something worth noting if you are planning for a lot of subscribers. Fortunately the default size of the ranges has increased vastly in SQL Server 2005 to 1000 per subscriber by default so forgetting to check this is less likely to be a problem in cases with a few republishing subscribers. When S3 runs out of values it requests them from P2 and when all of P2(S)s have run out or when S3 requests another range and there aren't sufficient values available, a new range is requested by PS(2) from P1. Whether this is from the merge agent or via a call to sp_adjustpublisheridentityrange doesn't really matter - it follows the same rules as for a normal merge subscription.

Transactional

Transactional replication is limited to the standard vanilla setup ie you are not able to use updating Subscribers. Also worth noting is the fact that in transactional replication, tracer tokens are not forwarded by republishing Subscribers. In this case we could add a tracer token to P1 and add one to P2(S) independently.

Schema Changes, Retention Periods and SQL Server 2000 Considerations

Schema changes should be made only at the publisher, not at the republishing subscriber. Changes propagated to a republishing  subscriber are by default propagated to its subscribers.  In SQL Server 2000 we would use sp_repladdcolumn and sp_repldropcolumn but these are NOT supported in a republishing tolology: in the above diagram, if we added a column to P1 it would be added to P1 and P2(S) but NOT S3 or S4, and so result in a need to reinitialize. In SQL Server 2005 the ALTER TABLE statement can be used provided @replicate_ddl is set to 1, and new columns added to P1 will successfully be added to P2(S) and subsequently to S3 and S4. The retention period for any republisher must be set to a value equal to or less than the retention period set at the original publisher and using different values may lead to non-convergence.

 

 
 

May 2006