Home  |  Articles  |  Scripts  |  Consultancy  |  FAQs  |  Links

 
Designing Accurate Filters and Joins in Merge Replication

article by Paul Ibison

 

Introduction

This article tries to explain the use of filters and joins in merge replication. We'll look at a particular schema and see what the choices are and how they behave in practice. 

A while back I had to do a couple of days merge replication consultancy at a local firm. They had a schema which included some tables much like the ones below. The schema was already being used in production and the client wanted to add the Customer and Sales articles to an existing merge publication in the best way possible. Filters were already being used in the publication based on BranchID where a subscriber essentially equates to a branch - the Branch table was already an existing published article. For security, each subscriber should only see its own branch's details. Whether the filters were dynamic or static doesn't really matter for this article (they were in fact dynamic). All seems ok so far? So, a branch has customers, and a customer has sales. A customer in this particular model could belong to only one single branch. However a customer could at some stage change allegiance to another branch. This is a rare process but is indeed possible.

The database designers reading this description will immediately see flaws in the design, and I have added to the appendix what I believe would be a more appropriate design for the database. However in this particular case the design had already been made live and the application coded against it, and a certain amount of application logic kind of made up for the flaws in the database design. Anyway, this article is not about database design! It really looks at what options are available using merge replication filters and joins to make this work, and what are the ramifications of each design. To this extent the article is a generic one. We have to be very careful in making the correct choices with merge filters and joins as once the publication design has been chosen it is often difficult to change afterwards without resorting to reinitialization and in some cases an incorrect design will mean that the data will simply not flow properly.

Publication Properties - "Filter Rows"

So, having added the Customer and Sale articles above to the publication, we have to design the filters. This is the "Filter Rows" section in the publication Properties dialog box. I've proposed 4 options below to make sure the correct branch data gets to the subscriber and we'll look at each one in turn and try to understand what each one means in terms of the publication behaviour.

Option A

The first option uses no joins. It takes advantage of the fact that the BranchID appears in each table so we can simply filter directly on this column for each table e.g. we could have BranchID = 1 as a filter clause on each article.

Why is this an "advantage"? Remember that the rows which are to be replicated are determined by the merge agent using views. If we take a look at the execution plan difference between a join and a direct filter we'll see something like the plans below. The first plan refers to the join between Customer and Branch to obtain the Customers to be replicated. The join has to get the rows from the Branch clustered index and then loop through them to find the related Customers. The second plan is a direct filter on the Customer table. This second plan is half the cost of the join filter (Query cost is 33%). We have established that direct filters are faster. The more the joins are nested, the slower they are in comparison to a direct filter.

How does the data flow though? Initially all will be ok - a subscriber will see the relevant Branch, Customers and Sales. However, if a customer changes to another branch, what happens to his Sales? As we filter the Sales on BranchID, they remain with the Branch details at the subscriber - generally good as each Branch/shop should see the sales they have made, and not suddenly have a drop in historical revenues. The problem is that when the Customer changes to another Branch, the merge agent will send a delete to the original subscriber of that Customer record. After such a delete, the Sales will still exist at the subscriber but will then belong to a non-existent Customer!

Option B

This option uses joins which mimic the referential integrity established on the publisher. Generally this is our first reflex and it seems like an obvious choice.

Like Option A, all will be well to start off with and the subscriber (Branch) will get all the correct Customers and Sales. Once a Customer changes Branches though you get an aspect of merge replication which is often incredibly useful but in this case disastrous. All the Sales belonging to the Customer will move with him to the other subscriber (Branch) and deletes will be sent to the original subscriber to remove the particular Customer and all their Sales. Imagine if this was a core customer, the Branch could run a sales report and find a huge chunk of revenues have disappeared.

Option C

This option is essentially the same as Option B but the Sale article appears twice. It has a join to the Customer table and also appears again with a filter on BranchID.

This highlights an interesting aspect of merge joins and filters. When they are combined, there is effectively a UNION of records, rather than the separate clauses combining to further restrict the records returned.

This is summarised in the entry for sp_addmergefilter in BOL: "When both a parameterized row filter and a join filter are used on a table article, replication determines whether a row belongs in a Subscriber's partition. It does so by evaluating either the filtering function or the join filter (using the OR operator), rather than evaluating the intersection of the two conditions (using the AND operator)."

Consequently, the Sales belonging to a Customer are sent to a Branch, along with the Sale records belonging to the Branch. Initially this amounts to the same behaviour as above. However once a Customer moves to another Branch, he will take his Sales with him, but they will not be deleted from the initial Branch due to the filter. In practice there is a bug here because the Sales will be removed, although logically they should not.

If we run the following TSQL:

update sale set branchid = branchid

This will readd the missing Sale records to the original subscriber.

This sounds more interesting - the Branch keeps hold of his individual Sale records. Even so, we still will have 2 issues:

  • the Sale records belong to a now non-existent Customer at the original Branch

  • the Sale records will now be visible at the new Branch even though they belong to another Branch.

Option D

This option reverses the direction of the referential integrity. Remember that a Branch has Customers and Customers have Sales. We now declare the join filters in the opposite sense - a Branch has Sales and the Sales have Customers. Customers not yet having made Sales are also required, so we also define a separate join between Branch and Customer.

On initialization, Customers belonging to a Branch are replicated to the subscriber. Sales belonging to a Branch are also replicated. Finally, Customers who 'own' a Sale from a Branch are replicated, even if they now belong to another Branch. The only direct filter is on Branch and the rest are join filters. When we change a Customer to another Branch, the original Branch will still see this Customer if he has made some Sales, and the Sales belonging to the Branch will also still be visible! We could add a join filter between Customer and Sale but this would be redundant in this case.

Other Considerations

Clearly Option D will suit the needs of this client. However are there any other considerations when selecting the correct filter/join combination? Well in SQL Server 2005/8 we are now able to drop merge articles. In SQL Server 2000 we would have to drop the entire set of subscriptions and then remove an article from the publication - effectively reinitialize. Now in SQL 2005/8 we can use sp_dropmergearticle to remove an article - we still have to create a snapshot to reset the metadata - but we're spared from reinitializing. Is this always the case? - No! BOL explains it like this "If an article is a parent article in a join filter or logical record relationship, the relationships must be dropped first, which requires reinitialization". In the case of Option D we see that if we were to drop the Sale article, this would force a reinitialization. Just something to be aware of if the choice is possible between joins and filters.

Conclusions

I hope this helps clarify some of the basic ramifications of selecting between the different options when defining the filters and joins. I'm sure this isn't inclusive and I'll add more details to this article over time.

Appendix

I'm no great database designer, but I'd expect the schema should be something like this to start off with. A Sale belongs to a BranchCustomer. A Customer can now have several Branches, but only one live one as defined by the "Current" flag in BranchCustomer. Merge replication would define the join filters in a more straightforward sense for this, although they would still not mimic directly the direction of the DRI as Customer would be defiled as a child of BranchCustomer.