Basic Comparison of Replication Times between Merge and Transactional ReplicationIntroductionThis is a simple comparison of the times taken to process different types of replication. Sometimes there is a business requirement that obviously seems suitable for replication and there is only one way it can be implemented. However, it is occasionally the case that there are potential alternatives. The two constraints of latency and autonomy are usually used to determine the most appropriate replication type, but another potentially important variable is processing time - how does replication effect my production database, and when it reaches the subscriber, how long will it take to be processed there.Designing the TestThe table used in this research was a sample table having
fixed width columns: set nocount on declare @count smallint set @count = 6500 select getdate() while @count <= 6600 begin INSERT INTO tTestReplicationSpeed(int1, int2, int3, char1, char2, char3) VALUES (@count, @count, @count, 'xxx', 'xxx', 'xxx') set @count = @count + 1 end select getdate() ResultsThe results are shown below:
(a) Time taken to perform the inserts (b) Time taken to perform the synchronize Analysis of GraphsFor the most part the trends are quite straightforward. So, to state the obvious :) - for the inserts, merge takes significantly longer than transactional and snapshot. For small numbers of inserts transactional and snapshot are more or less the same. For larger numbers of rows, transactional begins to take longer than snapshot. For the processing graphs for small numbers of rows there’s not a lot of difference, but as the number of rows increases, transactional and snapshot stay very similar, while merge becomes much larger.Explanation of resultsThe above results themselves might help decide between replication types, all other things being equal. If you’re interested in why the results are the way they are, then I’ll explain here, taking for granted a reasonable understanding of replication internals. Snapshot replication doesn’t affect the publisher’s schema, so the table is unaltered. An insert into the snapshot table is much the same as any other insert in a non-replicated environment. For transactional replication, the schema is also not altered. However, each row that is inserted must be read from the transaction log and subsequently written to both the MSrepl_commands and MSrepl_transactions tables as each insert is an explicit transaction. So, one insert becomes at least 3 writes (not just three as history tables make this slightly more complicated). At lower throughputs this is not an issue, but if the distribution database and publication database are on the same disk and there are higher transaction volumes, you’ll see the disk queue length increasing as a result, hence the processing time difference to snapshot is only noticeable as throughput increases. Merge replication does alter the schema by adding several triggers. In our case it is relevant that there is an insert trigger placed on the tTestReplicationSpeed table which adds the additional overhead. This trigger inserts a record into MSmerge_contents, and that is the only additional data change but the insert trigger’s queries require several joins. For synchronization the same trends are noticed, but merge is much more expensive. We can see this if we look at a profiler for a single insert where there are 2664 entries listed! That is not 2664 data modifications as these are processed statements. Nevertheless, there were 18 data modifications at the subscriber involving disk writes; 2 of these were associated with the table itself (heap and index) and the remaining 16 connected with keeping the merge replication metadata up to date. |
Paul Ibison, Copyright © 2010 |