Standby Servers: Log-Shipping vs
Replication
Introduction
Many previous articles on log-shipping and replication
have concentrated on solving set-up and configuration issues. This article
outlines the differences between them, for the purposes of maintaining a standby
server. To put this in context,l og-shipping and replication are two methods
which are often compared to clustering, however clustering is really a
technology created to solve a different business requirement - that of automatic
fail-over. In solving this requirement it has some intrinsic restrictions: the
distance between nodes is extremely limited meaning the entire system must
be in the same geographic location, and the data itself physically resides in
one place. This latter point sounds obvious, but consequently there is no
possibility of using the topology to distribute query load e.g. to have a
reporting server. On the more prosaic side it also has a reputation as being
relatively difficult to set up and maintain and the licenses required are not
cheap. So, many DBAs take advantage of their existing skill set to implement a
methodology which maintains a standby server, accept that fail-over will be a
manual process but hope to gain the bonus of having standalone reporting
capability. Commonly the choice is between log-shipping and replication, but one
needs a clear understanding of which methodology is the more appropriate. The
table below outlines some of the key differences, which are then explained
further.
Key Differences
| |
Log-Shipping |
Replication |
| What is the latency? |
>1min |
Potentially as low as a few seconds |
| Is the schema altered at the
publisher? |
No |
Snapshot - no
Transactional - no
Updating subscribers or merge - yes |
| Is the schema altered at the
subscriber? |
No |
Possibly (see text) |
| Are there schema requirements? |
None |
Primary keys required for transactional
table articles |
| Can we select individual articles? |
No |
Yes |
| Is the subscriber/standby database
'protected'? |
Yes |
No |
| Is system data transferred? |
Mostly |
No |
| Is the subscriber/standby server
useable as reporting server? |
Unlikely (see text) |
Yes |
What is the latency?
Log-shipping can backup once every minute and the copy and
load frequency can also be every minute. If you use transactional replication or
merge replication, the latency can be as low as a few seconds, assuming the
relevant -POLLINGINTERVAL agent parameters are minimized. Snapshot replication
will have a much higher latency as it requires an exclusive lock on the
publisher's replicated tables - consequently it is often considered an offline
solution.
Is the schema altered at the publisher?
Log-shipping and snapshot replication do not alter the
publisher's schema. Updating subscribers (transactional and snapshot) and merge
replication will add a guid column if there isn't one there already with the
rowguid property. This may cause some queries on the publisher to fail, e.g. if
you had TSQL inside a stored procedure which did the following:
INSERT INTO ExistingTable
SELECT * FROM ReplicatedTable
Is the schema altered at the subscriber?
Log-shipping makes no alterations to the schema. Snapshot
and transactional replication may make subtle schema changes; standard
transactional and snapshot will not transfer identity attributes - they become
normal numerical columns (int, smallint, numeric...) on the subscriber. Some
DBAs try to get round this by doing a nosync initialization and ensuring the
table on the subscriber has the Identity 'Yes (Not for Replication)' attribute,
which essentially allows the replication process to do identity inserts.
However, on use of the fail-over server this methodology fails as the internal
identity counter has not been incremented, and use of DBCC CHECKIDENT to reseed
will not always work on columns created with this attribute (see BOL). This
problem is not apparent if merge or queued updating subscribers are selected.
Are there schema requirements?
There are no requirements for log-shipping whereas all
forms of transactional replication require primary keys on table articles.
Can we select individual articles?
Log-shipping takes the whole set of tables, stored
procedures, triggers etc existing in the published database, while replication
has the granularity to select articles individually.
Is the Standby database 'protected'?
Log-shipping restores the logs to the standby server with
NoRecovery or with Standby - both options disallow edits to be made to the data.
Replication doesn't enforce such a restriction so explicit permissions would be
required to prevent changes to subscriber data.
Is system data transferred?
A front-end application usually requires the presence of
some system data. What I am referring to here is data in the system catalog or
the database catalog: logins, messages, users, permissions etc. In the case of
log-shipping, the database catalog is transferred but not the system catalog,
while replication transfers neither.
Is the subscriber/standby server useable as reporting
server?
Log-shipping can't usually allow the standby server to be
used for reporting as it requires an exclusive lock on the database to restore
the log i.e. during the restore users will be forcibly removed, or the
log-shipping job will fail. Replication can allow reporting queries to be run,
and as typically concurrent replication is not a big resource drain, such
queries usually run quickly.
Conclusions
Hopefully this article provides enough info to make a more
informative choice. Although this is a little simplistic, generally I recommend
log-shipping for fail-over, followed by transactional replication with queued
updating subscribers. The order is reversed if there is a possibility of needing
to go back to the production server once it is repaired (in which case one can
simply run the queue reader agent) or if there is a requirement for a standby
server to function additionally as a reporting server.
|