SQL 2017 Distributor Changes

 

There’s not much new in SQL 2017 for the replication administrator. One thing I did notice, though, is that the distributor properties have changed. You can see the 2 new options below:

Using sp_helpdistributor (or SELECT * FROM msdb.dbo.MSdistributiondbs) you can see more clearly that there are 2 new options to control deletions.
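These settings can also be changed with sp_changedistributiondb. A minimal sketch, assuming the property names match the new column names (the values here are purely illustrative):

-- Batch size for deleting expired transactions (assumed property name)
EXEC sp_changedistributiondb
    @database = N'distribution',
    @property = N'deletebatchsize_xact',
    @value = N'10000';

-- Batch size for deleting expired commands (assumed property name)
EXEC sp_changedistributiondb
    @database = N'distribution',
    @property = N'deletebatchsize_cmd',
    @value = N'100000';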

Remember that we retain the commands and transactions in the distribution database, so we’d expect these deletion settings to relate to those records. There’s no mention of these parameters in the Books Online help guide, so I did a little digging to confirm where they are used.

The distribution cleanup job executes dbo.sp_MSdistribution_cleanup.
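For reference, the job step of the “Distribution clean up: distribution” job runs something like the following (the retention values shown are the usual defaults – check your own job for the actual values):

EXEC dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 72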

This proc, through nested procs, calls:

(1) sp_MSdelete_publisherdb_trans

DELETE TOP(@deletebatchsize_commands) MSrepl_commands WITH (PAGLOCK) FROM MSrepl_commands

(2) sp_MSdelete_dodelete

DELETE TOP(@deletebatchsize_transactions) MSrepl_transactions WITH (PAGLOCK) FROM MSrepl_transactions

So – previously we only had a retention cutoff point to use when deleting old transactions. When there are millions of commands in these tables, a single large delete is time-consuming and prone to blocking issues. Now we have a neat way of doing batched deletes. I’ve recently been working with some tables of a billion rows, where batching and decent indexing are essential, so I can see exactly why this has been introduced.
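If you want the same pattern on your own large tables, a minimal sketch of a batched delete looks like this (the table name, predicate and batch size are all illustrative):

-- Delete in small batches to keep locks short and log growth manageable
DECLARE @batchsize int = 5000;
WHILE 1 = 1
BEGIN
    DELETE TOP (@batchsize) FROM dbo.BigHistoryTable
    WHERE CreatedDate < DATEADD(DAY, -30, GETDATE());

    IF @@ROWCOUNT < @batchsize BREAK;  -- last batch was partial, so nothing more to delete
END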

Reducing merge traffic using sp_addmergearticle

OK – there are a few ways we might try to reduce the amount of merge traffic. The most common in my experience is by using filters. I’ll do a whole other post on static and dynamic filters and join filters later on. I used them a lot in the seismic industry to ensure only data relevant to each ship is sent down. Here I introduce 2 other interesting methods controlled in sp_addmergearticle.

(1) @subscriber_upload_options

This is the easiest. We are going to look at stopping some of the traffic going from the subscriber to the publisher. In fact, we’ll look at how it can be stopped from being recorded altogether.

The option is exposed in the SSMS GUI when editing the publication articles. Slightly confusingly, it is represented in 2 ways. You can select the checkbox below:

Or you can set the “synchronization direction”, where you’ll see all the available options. I’ll follow the numbering in sp_addmergearticle to be consistent. Option “0” is bidirectional and is the default we are used to. Option “2” is identical to the checkbox above. Option “1” – download-only but allowing subscriber changes – is the final one. So we have 3 options, and only options 1 and 2 differ from default merge behavior.
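In T-SQL this maps straight onto the @subscriber_upload_options parameter of sp_addmergearticle. A sketch, with illustrative publication and article names:

EXEC sp_addmergearticle
    @publication = N'MergePub',
    @article = N'CitiesDownloadOnlyProhibit',
    @source_object = N'CitiesDownloadOnlyProhibit',
    @subscriber_upload_options = 2;  -- 0 = bidirectional, 1 = subscriber changes allowed but not uploaded, 2 = subscriber changes prohibited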

If you use the checkbox or options 1 or 2, you’ll notice that the table icons are grayed out, showing that these articles differ from normal merge articles – which can be quite handy for a quick check.

Remember that merge is all about triggers, so triggers are again used to enforce this at the subscriber.

These 2 new behaviors are set up as follows. For option 1 there are simply no triggers – we allow all changes at the subscriber but don’t record them in the merge metadata tables:

For option 2, if we prohibit subscriber changes, we have no need to record them because there aren’t any. In this case there will be a new trigger at the subscriber, in the format below, which rolls back any attempted change.

CREATE trigger [dbo].[MSmerge_downloadonly_0584B82B40C543E8A8BF02D2AD6F3A11] on [dbo].[CitiesDownloadOnlyProhibit] for update, insert, delete
not for replication
as
set nocount on

if @@trancount > 0 rollback tran
raiserror (20063, 16, -1, '[dbo].[CitiesDownloadOnlyProhibit]')

return

There’s a slight gotcha with all this. You might set it up and find that it all behaves like regular bidirectional merge. You recheck the article properties, script it out and so on, and convince yourself that it really is set up as option 1 or 2. It’s not a bug – in this case you have set up the subscription incorrectly. Download-only articles only work if the subscription is a “Client” one, and the “Server” subscription type is the default, so be aware of this.
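If you script the subscription yourself, “Client” corresponds to @subscriber_type = 'local' in sp_addmergesubscription (a “Server” subscription is 'global'). A sketch with illustrative server and database names:

EXEC sp_addmergesubscription
    @publication = N'MergePub',
    @subscriber = N'SUBSCRIBERSERVER',
    @subscriber_db = N'SubDB',
    @subscription_type = N'push',
    @subscriber_type = N'local';  -- 'local' = Client subscription, needed for download-only articles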

(2) @delete_tracking

Unlike @subscriber_upload_options, this is one of those properties we have to set as a parameter to sp_addmergearticle, as it is not exposed in the SSMS publication properties GUI. I guess it is seen as a rarely used option.

So, when delete_tracking is set to false, rows deleted at the Subscriber are not removed at the Publisher, and rows deleted at the Publisher are not removed at the Subscriber.

This is implemented slightly differently. There is still a delete trigger – it just doesn’t add a record to MSmerge_tombstone – so on sync there is no record of the delete.
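So, as with the upload options, it is just a parameter to sp_addmergearticle. A sketch with illustrative names:

EXEC sp_addmergearticle
    @publication = N'MergePub',
    @article = N'Cities',
    @source_object = N'Cities',
    @delete_tracking = N'false';  -- deletes are applied locally but never synchronized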

NB!!!

At this point it’s worth mentioning that these 2 options run counter to what we normally try to ensure – here we are creating non-convergence on purpose, and when it comes to reinitialization we’ll lose these differences. This is quite unlike filtering, which results in different data at the publisher and subscriber but reinitializes the data without any losses. So we have to be aware that this is really for non-persistent data only.

Dropping Replicated Tables

I recently upgraded some of our SQL 2014 servers to SP2. When going through the list of hotfixes I came across this gem: “This article describes an improvement that allows a table that’s included as an article in transactional replication publication to be dropped from the database and the publication(s).”

You might have tried at some time to drop a replicated table. You’ll see this type of message…

Remember that in SQL 2000 we had a similar situation when altering columns on replicated tables, unless we used sp_repladdcolumn and sp_repldropcolumn. Then along came the @replicate_ddl argument of sp_addpublication, which made our lives easier and let us make changes seamlessly. Well, there is now a similar possibility for dropping replicated tables:

exec sp_changepublication @publication = '<Publication Name>', @property = 'allow_drop', @value = 'true'

There are some caveats to be aware of:

  • The publisher needs to be running SQL Server 2016 SP1 or 2014 SP2.
  • If the same article exists in more than one transactional publication, you have to enable this feature on all of them or you’ll still get the error that a table is in replication and can’t be dropped.
So, once you have set it up, you can run the following type of statement with no errors:

drop table [dbo].[City]

Even if it is the last article in the publication and you have 100 subscribers, this is still no problem.

To see whether it has been enabled you can run:

exec sp_helppublication 'mainpublication'

There is a new column returned, “allow_drop”, which will tell you it’s there.

Unfortunately, if you need to check this programmatically there isn’t a nice column in syspublications which shows the value – it is hidden in the options bitmap. I suppose this was the easiest place to put it; hopefully it’ll be explicit in SQL 2017. The value to look for is 0x80 (Books Online is not yet updated with this value or the “allow_drop” column in sp_helppublication, as of 13th Sep 2017).
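In the meantime, a sketch of the bitmap check, run in the publication database:

SELECT name,
       options,
       CASE WHEN options & 0x80 = 0x80 THEN 'enabled' ELSE 'disabled' END AS allow_drop
FROM dbo.syspublications;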

Dropping Transactional Articles

In MSDN, in the article “Add Articles to and Drop Articles from Existing Publications”, there is a section which says: “For snapshot or transactional publications, articles can be dropped with no special considerations prior to subscriptions being created. If an article is dropped after one or more subscriptions is created, the subscriptions must be dropped, recreated, and synchronized.” This is pretty scary and sounds almost like a reinitialization is needed!

Let’s see what happens when we use SSMS. If you drop an article from the publication you’ll see the following dialogue:

“This will not impact existing subscriptions” sounds reassuring, but is slightly ambiguous – perhaps it is only referring to the invalidation of the snapshot. However there is clearly no mention that “the subscriptions must be dropped, recreated, and synchronized”. If you click on “Yes” you’ll find that the next time you start the distribution agent all runs fine. There will be some orphaned objects left on the subscriber – stored procedures related to the article and the like – but nothing to worry about. So perhaps the MSDN article is incorrect?

Not really! The SSMS interface is clever enough to do the hard work for you. If you try to drop the article yourself using this type of script:

EXEC sp_droparticle
    @publication = 'pubtestarticle',
    @article = 'tcity3',
    @force_invalidate_snapshot = 1;
GO

You’ll receive an error:

Msg 14046, Level 16, State 1, Procedure sp_MSrepl_droparticle, Line 286 [Batch Start Line 0]
Could not drop article. A subscription exists on it.

What you need to do is manually drop its subscriptions; then you can drop the article.

EXEC sp_dropsubscription 'pubtestarticle', 'tcity3', 'desktop-nsonkmo', 'transub'

This is what SSMS does behind the scenes, and this is all the MSDN article is really telling us – there is no need to reinitialize; just drop the relevant subscriptions, then the article, and then synchronize.

Dropping Replicated Tables… Impossible?

Did you ever try to drop a replicated table? Of course this is disallowed! It was always that way, right?

If you do try it what you’ll see is a variation on the following message:

Well actually this has now changed and you can merrily drop the table.

It’s not enabled by default but if you run the following you’ll be permitted to do it.

EXEC sp_changepublication
    @publication = 'yourpublicationname',
    @property = N'allow_drop',
    @value = 'true'
GO

If, when you run the above, you get a nasty error message like the following:

Msg 21183, Level 16, State 1, Procedure sp_MSrepl_changepublication, Line 274 [Batch Start Line 0]
Invalid property name ‘allow_drop’.

…that means you have other problems: your SQL Server is not up to date! Naughty DBA! This facility came in with SQL 2014 Service Pack 2 and SQL 2016 Service Pack 1. Sorry, but nothing doing for SQL 2012/2008 R2/2008!

Once it is set, you can happily drop tables in the publication database and the replication agents won’t error. The log reader cleans up the metadata related to the article, ignores any remaining unprocessed commands for the table, and removes the article from the publication. The distribution agent processes the article records already sent to it by the log reader, and once it reaches the cleanup command it has completed its processing for this article.

Note that the distribution agent doesn’t report that the drop has been processed during a synchronization, so we won’t really be aware this has happened. Also, the distribution agent doesn’t send the DROP statement down to the subscriber, so the table is left there intact. This is quite unlike other DDL changes, which keep the publisher and subscriber in sync. It’s safer this way, but it’s a pity there isn’t an additional option to propagate the drop.

Finally, once it is set, you’ll not see it easily in syspublications. Another explicit column for this setting would have been nice, but instead it is hidden in the logic of the options bitmap. However, you can run the following to check whether it is enabled:

EXEC sp_helppublication @publication = 'yourpublicationname'

And then you’ll have a clear report of the flag setting….

 

Replication Agents: Profiles and Parameters

OK – a few pointers on agent profiles, job parameters and such…

We usually find that our replication agents function quite nicely out of the box but at some stage we’ll need to tweak them. Perhaps they need to continue when an error is found, log more troubleshooting details to a file, have a longer query timeout and so on. This can be done by using one of the inbuilt profiles or by creating a custom one and setting up the values there. Either way we then assign the profile to the agent.

When we go down this route and compare the agent profile parameters to the list of agent executable parameters here, it seems there is a big difference – many are missing from our profile! The first thing to do is untick the checkbox at the bottom of the profile (below) and hey presto, many more appear. Still not the full list of agent parameters, but 90% are there now. I’ll mention the missing 10% in a minute.

So, we create our profile, untick the checkbox, set the values and set the agent to use it. Now maybe we have hundreds of agents and many different profiles. It is quite possible that local subscriptions are optimised to use a different profile from remote subscriptions with flaky connections; some agents should continue on data consistency errors while others should fail and alert us; and so on. This one-to-many relationship between profiles and agents is not represented in the Replication Monitor – we have to go into each agent to see which profile it is using. You can use the query below to see what is what:

SELECT
    a.[publication] AS Publication,
    c.srvname AS SubscriberName,
    b.profile_name AS Profile,
    a.[name] AS DistributionJobName
FROM [distribution].[dbo].[MSdistribution_agents] a
INNER JOIN msdb.[dbo].[MSagent_profiles] b
    ON a.profile_id = b.profile_id
INNER JOIN master..sysservers c
    ON a.subscriber_id = c.srvid
ORDER BY b.profile_name

This returns a recordset which lists the agents and their profiles:

Remember that above I said 90% of the parameters are available once we untick the checkbox on the profile? There are some others, such as -Output, -Hostname, -AltSnapshotFolder and -UseInprocLoader, which have to be added to the job step directly as they don’t exist in the profile.
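For example, to capture verbose output to a file you edit the Run Agent job step and append the parameters to the end of the command text. A sketch of a distribution agent job step with the extra parameters added on the final line (server names, path and verbosity level are all illustrative):

-Publisher [MYPUBLISHER] -PublisherDB [Northwind] -Distributor [MYDISTRIBUTOR] -Subscriber [MYSUBSCRIBER] -SubscriberDB [NorthwindSub]
-Output C:\temp\distrib_agent.log -OutputVerboseLevel 2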

Now, with this level of complexity – agents, different profiles, job step parameters and so on – we have a tricky situation. When an agent ran, which profile and parameters did it use, and what job values were applied? We can use the query above to help, but perhaps the job parameter values have been changed since the job last ran, the agent might have been reset to use a different profile since then, or the profile itself might have been changed. You can take a look in the distribution agent history, but it only gives basic info and is not detailed enough to help us. To find out what our agent really used when it ran, we need to check the job history.

You can see below that the job history separates the output into 2 logical sets of parameters – “user-supplied” and “from agent profile”. The profile name itself is not mentioned here (pity), but the particular values it used are, and that is the main level of detail we need.

One other point to note: in the above example I’ve highlighted in blue the QueryTimeout parameter, which has a value of 7000 seconds. Why? As an experiment I had added it to the agent job step while it was also in the profile with a different value of 3600 seconds. So we can see that when both are entered, the job step parameter takes priority and overrides the profile value. Really we should use one method or the other and not mix the two; generally I’d advise using the profile, as it is more transparent.

I look at this output to confirm that the values being used are the ones I expected to see. There are a few reasons for this: I might have many agents using a particular profile and just want to change one agent on an ad-hoc basis (removing the parameter afterwards!); I might want to investigate why my custom profile value is not being applied; or the parameter might be in the 10% not available in the profile and I just want to check my input value is being used.

Replicating to Memory-Optimised Tables

In SQL 2016 you might have wondered about the strange new setting when adding a new subscription:

This is the option for “Memory Optimised” tables, and it applies to transactional and snapshot subscriptions only. Memory-optimised tables are a newish (SQL 2014 and 2016) design feature which can be really useful for reporting purposes, and this fits in nicely with the way many transactional replication systems are used, so it’s definitely worth taking a look at. As an additional bonus, in SQL 2016 it’s available in any edition.

Just a quick clarification that we are talking about replication subscribers here, not publishers. We can’t replicate memory optimised tables from the publisher – we can only create them at the subscriber.

First of all, I’ll mention that selecting the option above when adding a subscription does nothing on its own! You’ll just get a normal table on the subscriber; we need to do a little more to get this to work.

You need to take a look at each table in the publication and set the article properties there. The 2 options you’ll need are shown below.

If you set “Enable Memory Optimisation” to true and leave the second option as false (the default), this will most likely fail. The reason is that primary keys are usually created with a clustered index, but memory-optimised tables don’t play happily with clustered indexes. If your primary keys have a nonclustered index and there are no other clustered indexes on the table you’ll be OK, but this is a pretty unlikely scenario. Anyway, the wizard will check for you, and you’ll get the following error if you’ve omitted the second option when it was required.

Really there’s no reason I can think of to leave the second option set to false.
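If you want to see up front which tables would hit this, a quick sketch run in the publication database lists the tables whose primary key is enforced by a clustered index:

SELECT OBJECT_NAME(i.object_id) AS TableName,
       i.name AS IndexName
FROM sys.indexes AS i
WHERE i.is_primary_key = 1
  AND i.type_desc = 'CLUSTERED';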

So, having set the article properties and the subscription properties we are now almost ready to roll.

One final check – is the subscriber database set up to accommodate memory-optimised tables? You can take a look in SSMS, or run the following query, replacing “Northwind” with your database name:

select * from [Northwind].sys.filegroups where type = 'FX'

If nothing is returned, adapt and run the following replacing “Northwind” with your database name:

ALTER DATABASE northwind ADD FILEGROUP northwind_mod CONTAINS MEMORY_OPTIMIZED_DATA
ALTER DATABASE northwind ADD FILE (name = 'northwind_mod2', filename = 'c:\data\northwind_mod2') TO FILEGROUP northwind_mod
ALTER DATABASE northwind SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON

At this stage we are good to go. Create the subscription and after initialization run the following at the subscriber to check for any memory optimised tables.

SELECT OBJECT_NAME(object_id) as TableName, * FROM sys.dm_db_xtp_table_memory_stats;  

All being well you’ll get a row returned like the following and we’re done :).

 

Credentials and Proxies and why bother using them?

When we are setting up a subscriber there is a screen where we decide on the agent security.

The ellipsis (…) button above takes us to a detailed setup form. Most often, I think, DBAs select the highlighted option in the security form below.

Certainly the installations I’ve seen imply that this has always been chosen. It was the only possible setup in SQL 2000, and despite the warning that “this is not a recommended security best practice” it is still definitely the easiest option.

In this scenario distrib.exe or replmerg.exe is run by the Windows account that SQL Server Agent is set to use, and this account is impersonated to give access to the distributor and the subscriber. Often the SQL Agent account is a sysadmin on the publisher and a local Windows administrator. Sometimes this exact same account is also a sysadmin and local admin on the subscriber, and in some extreme cases it is a domain admin. I’m not recommending any of these – just pointing out what seems commonplace. This is the crux of the issue, as Microsoft always recommends using a least-privilege user account for services.

But why bother with this least-privilege philosophy for replication? Well, things can be run through the replication engine – nasty things. The most obvious example is sp_addscriptexec: you have to be db_owner to run it, but essentially a db_owner can become sysadmin through this means. Other examples would be all those text files in the snapshot directory, or the DDL change scripts. What this really amounts to is the possibility of implementing SQL injection via replication.
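To make the risk concrete, this is the general shape of an sp_addscriptexec call (the publication name and path are illustrative). The script file is executed at each subscriber under the agent’s security context, which is exactly why the privileges of that account matter:

EXEC sp_addscriptexec
    @publication = N'MyPublication',
    @scriptfile = N'\\fileserver\repl\post_script.sql',  -- runs at every subscriber
    @skiperror = 0;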

So, if you really want to lock it all down and ensure that the replication engine cannot do too much harm in the hands of a malicious user, we need to consider using credentials and proxies and get away from impersonating the SQL Agent account. A credential holds the details of a Windows user and password – one with reduced privileges compared to the SQL Agent service account. If the credential account and the SQL Agent account have the same level of access to SQL and to the OS, then you are going down the wrong street!

A proxy is a reference to the credential to be used for a particular subsystem – “Replication Distributor” above. You might want the same credential to be used for both merge and transactional agents – in that case, one credential and 2 proxies. Both credentials and proxies are easy to set up in SSMS (above); you just need to create the credential first and then the proxy.
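If you prefer to script the setup, it is roughly as follows (the account names and password are illustrative):

-- 1. A credential holding the low-privilege Windows account
CREATE CREDENTIAL ReplAgentCredential
    WITH IDENTITY = N'MYDOMAIN\repl_agent_lowpriv',
    SECRET = N'StrongPasswordHere';

-- 2. A SQL Agent proxy over that credential
EXEC msdb.dbo.sp_add_proxy
    @proxy_name = N'DistributorProxy',
    @credential_name = N'ReplAgentCredential';

-- 3. Allow the proxy to run Replication Distributor job steps
EXEC msdb.dbo.sp_grant_proxy_to_subsystem
    @proxy_name = N'DistributorProxy',
    @subsystem_name = N'Distribution';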

Once this is done, the job step that runs the particular subscriber agent will have another entry in the drop-down – the name of the proxy account you added. In the case below I select the second job step of the distribution job to use the “DistributorProxy” proxy account (sorry about my naming convention!).

You might wonder how this all relates to the option below in the subscription wizard, which lets you state an alternative Windows account. Seems similar?

Well, this does all the proxy/credential/job configuration for you in the background! The only problem is that if you leave it to the wizard you’ll end up with a mess of duplicates and horrible proxy and credential names like the one below, so it’s definitely worth designing it yourself.