| |
Replication Processing Times Compared
Introduction
This is a simple comparison of the times taken to process different types of replication. Sometimes there is a business requirement that obviously seems suitable for replication and there is only one way it can be implemented. However, it is occasionally the case that there are potential alternatives. The two constraints of latency and autonomy are usually used to determine the most appropriate replication type, but another potentially important variable is processing time - how does replication effect my production database, and when it reaches the subscriber, how long will it take to be processed there.
The test
The table used in this research was a sample table having fixed width columns:
The fixed length datatypes were chosen because this table was also used to calibrate network bandwidth issues which are not the subject of this article. Essentially this is a simple, straightforward table.
The questions I wanted to examine were:
How does insert time vary with replication type?
Once inserted, how does propagation time vary with replication type?
The time taken to move data across the network was removed as a variable by doing the replication on one box - I was concerned with processing only. All other interfering services were suspended. The publisher, distributor and subscriber databases files and log files were expanded to avoid any autogrowth issues, and were in full recovery move to avoid any auto-truncation. The processing was repeated until nearly constant values were obtained, to rule out caching issues. The cleanup agent was disabled and only run manually after each test. I fully acknowledge the raw nature of this type of 'experiment', and the fact that several important criteria were not examined, including the fact that no attempt was made to optimize the agents by creating custom profiles, and the effect of row width was not looked at. This was not an attempt to do a huge comprehensive analysis of the variables, more of a simple comparison was aimed at.
To perform the inserts, the following script was used where each record inserted comprises a separate transaction.
set nocount on
declare @count smallint
set @count = 6500
select getdate()
while @count <= 6600
begin
INSERT INTO tTestReplicationSpeed(int1, int2, int3, char1, char2, char3)
VALUES (@count, @count, @count, 'xxx', 'xxx', 'xxx')
set @count = @count + 1
end
select getdate()
Results
The results are shown below:
| # rows |
Snapshot Inserts |
Sync Time |
Transactional Inserts |
Sync Time |
Merge Inserts |
Sync Time |
| 100 |
0 |
4 |
0 |
0 |
1 |
2 |
| 500 |
1 |
4
|
1 |
1 |
3 |
6 |
| 1000 |
2 |
5 |
2 |
1 |
5 |
12 |
| 5000 |
6 |
7 |
7 |
2 |
21 |
42 |
| 10000 |
13 |
12 |
26 |
5 |
42 |
96 |
(a) Time taken to perform the inserts

(b) Time taken to perform the synchronize

Analysis of Graphs
For the most part the trends are quite straightforward. So, to state the obvious :) - for the inserts, merge takes significantly longer than transactional and snapshot. For small numbers of inserts transactional and snapshot are more or less the same. For larger numbers of rows, transactional begins to take longer than snapshot. For the processing graphs for small numbers of rows there’s not a lot of difference, but as the number of rows increases, transactional and snapshot stay very similar, while merge becomes much larger.
Explanation of results
The above results themselves might help decide between replication types, all other things being equal. If you’re interested in why the results are the way they are, then I’ll explain here, taking for granted a reasonable understanding of replication internals.
Snapshot replication doesn’t affect the publisher’s schema, so the table is unaltered. An insert into the snapshot table is much the same as any other insert in a non-replicated environment. For transactional replication, the schema is also not altered. However, each row that is inserted must be read from the transaction log and subsequently written to both the MSrepl_commands and MSrepl_transactions tables as each insert is an explicit transaction. So, one insert becomes at least 3 writes (not just three as history tables make this slightly more complicated). At lower throughputs this is not an issue, but if the distribution database and publication database are on the same disk and there are higher transaction volumes, you’ll see the disk queue length increasing as a result, hence the processing time difference to snapshot is only noticeable as throughput increases. Merge replication does alter the schema by adding several triggers. In our case it is relevant that there is an insert trigger placed on the tTestReplicationSpeed table which adds the additional overhead. This trigger inserts a record into MSmerge_contents, and that is the only additional data change but the insert trigger’s queries require several joins. For synchronization the same trends are noticed, but merge is much more expensive. We can see this if we look at a profiler for a single insert where there are 2664 entries listed! That is not 2664 data modifications as these are processed statements. Nevertheless, there were 18 data modifications at the subscriber involving disk writes; 2 of these were associated with the table itself (heap and index) and the remaining 16 connected with keeping the merge replication metadata up to date.
|
|