
Columnstore Index Performance: Column Elimination


Data in a columnstore index is stored as columns, and each column is stored and accessed independently of other columns, unlike rowstore where all columns in a table are stored together. This allows SQL Server to fetch only the columns referenced in the query. For example, if a FACT table has 50 columns and the query only accesses 5 of them, only those 5 columns need to be fetched. Assuming all columns are of equal length, clearly a simplifying assumption, accessing data through the columnstore index will reduce IO by 90%, in addition to the significant data compression achieved with columnstore index. Since data is read compressed into SQL Server memory, you get similar savings for SQL Server memory.

Let us consider a simple example to illustrate these points. I have created the following two tables, one (CCITEST) with a clustered columnstore index and the other (CITEST) with a regular clustered index, as shown in the picture below.

column-elimination-schema
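A minimal sketch of how two such tables might be created (the four integer columns are illustrative; the actual schema is the one in the picture):

CREATE TABLE CCITEST (c1 INT, c2 INT, c3 INT, c4 INT);
CREATE CLUSTERED COLUMNSTORE INDEX ccitest_cci ON CCITEST;

CREATE TABLE CITEST (c1 INT, c2 INT, c3 INT, c4 INT);
CREATE CLUSTERED INDEX citest_ci ON CITEST (c1);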

I then inserted an identical 11 million rows into each of these tables and ran the same set of queries against both: one that aggregates all the columns and one that aggregates only one column.
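The two query shapes might look like the following, run against each table in turn (column names are illustrative; SET STATISTICS IO ON surfaces the logical IOs discussed below):

SET STATISTICS IO ON;

-- Query 1: aggregates every column
SELECT SUM(c1), SUM(c2), SUM(c3), SUM(c4) FROM CCITEST;

-- Query 2: aggregates a single column
SELECT SUM(c1) FROM CCITEST;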

The picture below shows the logical IOs done on the rowstore table. As expected, the number of logical IOs is the same irrespective of the number of columns referenced in the query.

column-elimination-rowstore

Now, let us run the same two queries on the table with the clustered columnstore index as shown in the picture below. Note that the logical IOs for the LOB data are reduced by 3/4 for the second query, as only one column needs to be fetched. You may wonder why LOB? With columnstore index, the data in each column is compressed and then stored as a LOB. Another point to note is that the query with columnstore index runs much faster: 25x for the first query and 4x for the second query when compared with the same queries running on rowstore.

column-elimination-cci

Column elimination speeds up analytics by reducing IO and memory consumption for common schema patterns. For example, in the Star Schema pattern, the FACT table is typically very wide, containing a large number of columns. With columnstore index, only the referenced columns need to be fetched.

Thanks

Sunil


Columnstore Index Performance: Rowgroup Elimination


As described in Column Elimination, when querying a columnstore index, only the referenced columns are fetched. This can significantly reduce the IO/memory footprint of analytics queries and speed up query performance. The other challenge with columnstore index is how to limit the number of rows read to process the query. As described in the blog why no key columns, the columnstore index has no key columns, as it would be prohibitively expensive to maintain the key order. This can hurt analytics query performance significantly if a full scan of a columnstore index containing billions of rows is needed to apply range predicates. For example, if a FACT table stores sales data for the last 10 years and you are interested in sales analytics for the current quarter, it is more efficient if SQL Server scans the data only for the last quarter instead of scanning the full table, a reduction of 97.5% (1 out of 40 quarters) both in IO and query processing. This is easy with rowstore, where you can just create a clustered btree index on SalesDate and leverage it to scan only the rows for the current quarter, but what about columnstore index? One way to get around this is to partition the table by quarter, week or day, which can reduce the number of rows to be scanned significantly. While this works, what happens if you need to filter the data by region in a large partition? Scanning the full partition can be slow. With rowstore, you could partition the table by quarter and keep the data within each partition sorted by creating a clustered index on region. This is just one example, but you get the idea that unordered data within a columnstore index may cause scanning a larger number of rows than necessary. In SQL Server 2016, you can potentially address this using a nonclustered btree index (NCI), but only if the number of qualifying rows is small.

Columnstore index solves this issue using rowgroup elimination. So what exactly is a rowgroup? The picture below shows how data is physically organized for both clustered and nonclustered columnstore indexes. A rowgroup represents a set of rows, typically 1 million, that are compressed as a unit. Each column within a rowgroup is compressed independently and is referred to as a segment. SQL Server stores the min/max value for each segment as part of the metadata and uses this to eliminate any rowgroups that don’t meet the filter criteria.

columnstore-structure
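You can inspect this min/max metadata yourself through the segment catalog view; a minimal sketch (note that min_data_id/max_data_id hold encoded values, not raw column values):

SELECT s.column_id, s.segment_id, s.row_count, s.min_data_id, s.max_data_id
FROM sys.column_store_segments s
JOIN sys.partitions p ON s.hobt_id = p.hobt_id
WHERE p.object_id = OBJECT_ID('CCITEST');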

In the context of rowgroup elimination, let us revisit the previous example with sales data

  • You may not even need partitioning to filter the rows for the current quarter, because rows are inserted in SalesDate order, allowing SQL Server to pick only the rowgroups that contain rows for the requested date range.
  • If you need to filter the data for a specific region within a quarter, you can partition the columnstore index at the quarterly boundary and then load the data into each partition after sorting it on region. If the incoming data is not sorted on region, you can follow these steps: (a) switch the partition out into a staging table T1 (b) drop the clustered columnstore index (CCI) on T1 and create a clustered btree index on T1 on column ‘region’ to order the data (c) re-create the CCI while dropping the existing clustered index. A general recommendation is to create the CCI with DOP=1 to keep the perfect ordering, as sketched below.
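A hedged T-SQL sketch of steps (a) through (c), with hypothetical table, partition and index names:

-- (a) switch the partition out into a staging table T1 (same schema, including the CCI)
ALTER TABLE FactSales SWITCH PARTITION 5 TO T1;

-- (b) drop the CCI, then order the data with a clustered btree index on region
DROP INDEX cci_t1 ON T1;
CREATE CLUSTERED INDEX ci_region ON T1 (region);

-- (c) re-create the CCI over the ordered data; MAXDOP = 1 preserves the ordering
CREATE CLUSTERED COLUMNSTORE INDEX ci_region ON T1 WITH (DROP_EXISTING = ON, MAXDOP = 1);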

SQL Server provides information on the number of rowgroups eliminated as part of query execution. Let us illustrate this using an example with two tables, ‘CCITEST’ and ‘CCITEST_ORDERED’, where the second table is sorted on one of the columns using the following command:
create clustered columnstore index ccitest_ordered_cci on ccitest_ordered WITH (DROP_EXISTING = ON, MAXDOP = 1)

The following picture shows how the data is ordered on column 3. You can see that data for column_id=3 is perfectly ordered in ‘ccitest_ordered’.

rowgroup-elimination-1

Now, we run a query that uses the column with column_id=3 as a range predicate, as shown below. For the CCITEST table, where the data was not sorted on the column OrganizationKey, no rowgroups were skipped, but for the table CCITEST_ORDERED, 10 rowgroups were skipped as SQL Server used the min/max range to identify the rowgroups that qualify.

rowgroup-elimination-2
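A minimal sketch of such a range-predicate query (the exact range values are an assumption); with SET STATISTICS IO ON, the Messages pane reports the ‘segment reads N, segment skipped M’ counters shown in the picture:

SET STATISTICS IO ON;

SELECT COUNT(*)
FROM CCITEST_ORDERED
WHERE OrganizationKey BETWEEN 1 AND 100;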

You may wonder why it says ‘segments’ skipped and not ‘rowgroups’ skipped. Unfortunately, this is a carryover from SQL Server 2012 with some mix-up of terms. When running analytics queries on large tables, if you find that no or only a small percentage of rowgroups were skipped, you should look into why and address it if possible.

Thanks

Sunil

SQL Server Performance Dashboard Reports unleashed for Enterprise Monitoring !!!


SQL Server 2012 Performance Dashboard Reports is one of the most popular SQL Server monitoring solutions for customers and the SQL community, leveraging dynamic management views (DMVs) for monitoring and reporting, and available at no cost to consumers. SQL Server Performance Dashboard Reports are available as a set of custom reports in SQL Server Management Studio (SSMS) which run against the instance connected in Object Explorer. When monitoring large enterprise deployments of SQL Server, hosting SQL Server Performance Dashboard Reports on a central reporting server can provide additional benefits, making life easier for enterprise DBAs monitoring and troubleshooting SQL Server issues. To support hosting SQL performance dashboard reports on a central SQL Server Reporting Services instance, we have customized the SQL Server 2012 Performance Dashboard Reports, added new reports and uploaded them to the Tiger toolbox github repository for customers and the SQL community. The reports are tested to run against SQL Server 2012, SQL Server 2014 and SQL Server 2016 versions of the target SQL Server instance and can be deployed on a SQL Server 2012, SQL Server 2014 or SQL Server 2016 Reporting Services instance.

Following are some of the benefits of hosting SQL performance dashboard reports on a central SSRS reporting server.

  • Monitoring reports accessible anytime, anywhere using a browser – This removes the dependency on a thick client like SQL Server Management Studio (SSMS) being present on the workstation, allowing DBAs and DevOps audiences to check the health of SQL Server and its resource consumption using a web browser from any workstation with access to the server.
  • Scheduled automatic report delivery – SSRS allows scheduled email or file share delivery of reports. This allows DBAs, application owners and database stakeholders to choose a push model whereby performance health reports can be scheduled to run against specified SQL Server instances at a specified time and be delivered to their mailbox, to proactively monitor the overall health of a SQL Server instance and detect any anomaly.
  • Performance baselining using report snapshots – SSRS allows you to capture scheduled point-in-time report snapshots at a specified time interval, allowing DBAs to establish performance baselines using historical snapshots for the target SQL Server instances.
  • Linked reports for application owners and other stakeholders – In an enterprise environment, most application teams and stakeholders are interested in seeing the performance, resource consumption, blocking information and overall health of their SQL Server instance on demand. In such scenarios, DBAs can create linked reports for the target SQL Server instances on the central SSRS server and delegate permissions to view reports for the target SQL Server instance of interest. This allows application teams and developers to be self-sufficient in checking the overall health of their SQL Server instances, creating some bandwidth for DBAs, who need to be contacted only if an anomaly or problem is detected.

Architecture

The following diagram shows the high-level architecture when deploying SQL Performance Dashboard Reports on a central monitoring SSRS server instance for monitoring all the target SQL Server instances in enterprise or mid-size deployments of SQL Server.

Setting Up and Configuring SQL Server Dashboard Reports for Monitoring

The following section provides the steps for setting up and configuring SQL Server Dashboard Reports for monitoring.

  1. Install and configure SQL Server Reporting Services (any version from SQL Server 2012 onwards, with the latest SP and CU) on a server identified as the central monitoring server. The central monitoring server should be part of the same domain and network as the target SQL Server instances.
  2. Download the SQL Performance Dashboard Reporting Solution from the Tiger toolbox github repository.
  3. Download SSDT-BI for Visual Studio 2012 or SSDT-BI for Visual Studio 2013 and install the BI designer on the workstation where the github solution is downloaded or copied.
  4. Open the PerfDashboard solution using Visual Studio 2012 or 2013 on the workstation and deploy it against the SQL Server Reporting Services instance by providing the TargetServerUrl as shown below

  5. Make sure the report deployment is successful and browse the Report Manager URL to see the reports deployed under the SQL Server Performance Dashboard folder.

  6. Run the setup.sql script from the Tiger toolbox github repository against all the target SQL Server instances. It creates a schema named MS_PerfDashboard in the msdb database of each SQL Server instance. All the relevant objects required for the SQL performance dashboard reports are contained in the MS_PerfDashboard schema.
  7. You should always start with the performance_dashboard_main report as a landing page and navigate to other reports from there. If you have deployed the reports against a SQL Server 2016 Reporting Services instance, you can set the performance_dashboard_main report as a favorite for easier navigation as shown below.
  8. When you browse the performance_dashboard_main report, it will ask you for the target SQL Server instance against which you wish to see the report. If setup.sql has been run against the target SQL Server instance, you will see the data populated in the report.

  9. You can click on the hyperlinks to navigate to a specific report for further drill-through as shown below.

     

 
All the reports use Windows authentication to connect to the target SQL Server instance, so if the browsing user is part of a different domain or does not have a login or VIEW SERVER STATE permission, the reports will generate an error. Further, this solution relies on Kerberos authentication as it involves a double hop (client -> SSRS server -> target SQL instance), so it is important that the target SQL Server instances have SPNs registered. The alternative to Kerberos authentication is to use stored credentials in the report, which helps bypass the double hop but is considered less secure.
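For example, a minimal sketch of granting the required permission on a target instance, assuming a hypothetical Windows group:

-- run on each target SQL Server instance; DOMAIN\ReportUsers is hypothetical
CREATE LOGIN [DOMAIN\ReportUsers] FROM WINDOWS;
GRANT VIEW SERVER STATE TO [DOMAIN\ReportUsers];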

If you have also deployed the SQL Performance Baselining solution and the System Health Session Reports from the Tiger toolbox github repository, you can use the same central SSRS server for hosting all the reports and running them against the target SQL Server instances as shown below. The SQL Performance Baselining solution can be useful to identify historical resource consumption and usage and for capacity planning, while the SQL performance dashboard reports and System Health Session reports can be used for monitoring and point-in-time troubleshooting.

Parikshit Savjani
Senior Program Manager (@talktosavjani)

Columnstore Index Performance: BatchMode Execution


In the blog Industry leading analytics query performance, we looked into how SQL Server delivers superior analytics query performance with columnstore index. Besides a significant reduction in IO, analytics queries get an order of magnitude better performance boost with BatchMode processing, a unique value proposition in SQL Server. The basic idea of batch mode processing is to process multiple values, hence the term ‘batch’, together instead of one value at a time. Batch mode processing is perfectly suited for analytics where a large number of rows need to be processed, for example, to compute aggregates or apply filter predicates. While the performance gains will depend upon the operator, data and the schema, we have measured speedups of up to 300x in some of our internal tests.

Though there are cases where BatchMode processing can boost query performance in transactional (OLTP) workloads, at this time BatchMode processing is only supported in queries that reference one or more objects with a columnstore index. Itzik Ben-Gan has listed some clever tricks that you can use to force BatchMode execution on rowstore to get pretty amazing query performance.

Before diving deeper into BatchMode processing, let us first look into how query processing is done in RowMode, using a very simple query that outputs the top 1200 rows satisfying a simple predicate on the FactResellerSalesXL table in the AdventureWorks2016CTP3 database; a hedged sketch of such a query appears after the list below. Key points (circled in red) from the following picture are

  1. The query was executed in ROW mode. The predicate was applied to 3580 rows, one row at a time, to get 1200 qualifying rows
  2. The estimated CPU cost is 12.368 units. The actual CPU cost was 15 ms (I got this from the Messages pane in SSMS): SQL Server Execution Times: CPU time = 15 ms, elapsed time = 312 ms

batchmode-1
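A hedged sketch of such a query (the actual predicate is only visible in the picture, so this one is an assumption):

SELECT TOP 1200 *
FROM FactResellerSalesXL
WHERE SalesAmount > 1000;  -- hypothetical predicate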

Now, let us execute the same query on a table with clustered columnstore index as shown in the following picture. The key points (circled in red) are

  1. The query (specifically the SCAN operator) was executed in BatchMode. Two batches, each containing 900 rows, were executed, so the predicate was applied to 900 rows at a time.
  2. The estimated CPU cost is 1.2338 units, estimated to be 10x faster. The actual execution time was < 1 ms (the minimum time that can be measured), so the rowstore execution took 15x more CPU compared to BatchMode execution.

 SQL Server Execution Times: CPU time = 0 ms, elapsed time = 196 ms

batchmode-2

You may be wondering about the magic number of 900 rows within a batch. When executing a query in BatchMode, SQL Server allocates a 64 KB structure to group the rows. The number of rows in this structure can vary between 64 and 900 depending upon the number of columns selected. For the example above, there are two columns that are referenced, and X marks the rows that qualified in the BatchMode structure. If the SCAN is part of a bigger query execution tree, the pointer to this structure is passed to the next operator for further processing. Not all operators can be executed in BatchMode. Please refer to Industry leading analytics query performance for details on BatchMode operators.

 

  batchmode-3

SQL Server automatically chooses to execute supported operators in BatchMode. A query execution plan can have a mix of BatchMode/RowMode operators with automatic conversion between the modes. I highly recommend that you look at the actual execution plans to validate that the supported operators are indeed getting executed in BatchMode and, if not, take corrective action. For example, the Nested Loop and Merge Join operators are not executed in BatchMode. If, in a rare case, the query optimizer has chosen these, you can get around this by hinting Hash Match execution.
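For example, a minimal sketch of hinting hash joins at the query level (the join shape assumes the AdventureWorks-style schema used above):

SELECT d.EnglishProductName, SUM(f.SalesAmount) AS TotalSales
FROM FactResellerSalesXL_CCI f
JOIN DimProduct d ON f.ProductKey = d.ProductKey
GROUP BY d.EnglishProductName
OPTION (HASH JOIN);  -- steers the optimizer away from Nested Loop/Merge Join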

Thanks,

Sunil Agarwal

SQL Server Tiger Team

Twitter | LinkedIn

Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam

MSSQLTIGER and January 2017 PASS Virtual Conferences


In 2016, the Tiger team presented multiple sessions at the PASS Virtual Chapters on new features, In-Memory technologies, query tuning, high availability, disaster recovery, monitoring, security, PowerShell and SQL performance baselining solutions, enabling and empowering DBAs, developers, DevOps, architects and the SQL community to be successful in their roles. In 2017, the Tiger team is committed to continuing the trend and contributing to the SQL Server community by presenting in popular forums and community events. The Tiger team will be delivering another set of virtual chapter sessions for the DBA Fundamentals and In-Memory Virtual Chapters in January, which are mentioned below. Stay tuned to our blog to find out about upcoming sessions.

How SQL Server 2016 SP1 Changes the Game

Monday, January 9th, 2017 at 6:30PM PST
RSVP: https://attendee.gotowebinar.com/register/5242400352309938690

The recent release of SQL Server 2016 SP1, providing a consistent programming surface area, has generated quite a buzz in the SQL Server community. SQL Server 2016 SP1 allows businesses of all sizes to leverage the full feature set, such as In-Memory technologies, on all editions of SQL Server to get enterprise grade performance. This talk focuses on the new improvements, the new limits on the lower editions, differentiating factors and key scenarios enabled by SQL Server 2016 SP1 which make it an obvious choice for customers. Come and attend this session to learn about these exciting new improvements announced with SQL Server 2016 SP1 to ensure you are leveraging them to maximize the performance and throughput of your SQL Server environment.

SQL Server In-Memory OLTP: What Every SQL Professional Should Know

Wednesday, January 18th, 2017 at 9:00am PST
RSVP: https://attendee.gotowebinar.com/register/3096088783652157444

Perhaps you have heard the term “In-Memory” but are not sure what it means. If you are a SQL Server professional, then you will want to know. Even if you are new to SQL Server, you will want to learn more about this topic. Come learn the basics of how In-Memory OLTP technology in SQL Server 2016 and Azure SQL Database can boost your OLTP application by 30x. We will compare how In-Memory OLTP works vs “normal” disk-based tables. We will discuss what is required to migrate your existing data into memory optimized tables or how to build a new set of data and applications to take advantage of this technology. This presentation will cover the fundamentals of what, how, and why this technology is something every SQL Server professional should know.

Parikshit Savjani
Senior Program Manager (@talktosavjani)

Columnstore Index Performance: SQL Server 2016 – Multiple Aggregates


The SQL product team has made significant improvements in columnstore index functionality, supportability and performance in SQL Server 2016 based on feedback from customers. This blog series focuses on the performance improvements done as part of SQL Server 2016. Customers will get these benefits automatically, with no changes to the application, when they upgrade to SQL Server 2016.

The examples here are based on the AdventureWorksDW2016CTP3 database that you can download from here. For each example, I will run the query in SQL Server 2014 and then contrast that with SQL Server 2016 to reinforce the point that you get performance improvements without requiring any changes to your query or workload.

SQL Server 2014

Let us consider the following query. All three operators, the SCAN on the columnstore index and the two HASH MATCH operators, are automatically run in BATCH mode. The picture below shows the execution details of one of the HASH MATCH operators.

double-aggregate-1-2014

Now, let us change the query a little bit by adding one more aggregate, as circled in the red box below. When this query was run, the execution plan got a lot more complicated and the HASH MATCH operators were executed in ROW mode as shown below. However, the SCAN of the columnstore index was still executed in BATCH mode.

Though not shown in the picture here, the query took 15x longer to execute. The reason for this slowdown was that SQL Server 2014 did not process multiple aggregates optimally.

double-aggregate-2-2014
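The actual queries are in the pictures; a hedged sketch of the single- versus multiple-aggregate pattern, assuming distinct-count aggregates (one pattern known to trigger this fallback in SQL Server 2014):

-- single aggregate: HASH MATCH runs in BATCH mode on SQL Server 2014
SELECT COUNT(DISTINCT ProductKey)
FROM FactResellerSalesXL_CCI;

-- second aggregate added: HASH MATCH falls back to ROW mode on SQL Server 2014
SELECT COUNT(DISTINCT ProductKey), COUNT(DISTINCT OrderDateKey)
FROM FactResellerSalesXL_CCI;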

SQL Server 2016

This issue has been addressed in SQL Server 2016. I ran the query with double aggregates ‘as is’ on SQL Server 2016 as shown in the picture below. Note that the query plan is essentially unchanged irrespective of the number of aggregates computed, with the HASH MATCH operator executing in BATCH mode.

double-aggregate-2-2016

Though not shown in the picture above, this query ran 16x faster when compared with SQL Server 2014. This improvement in query performance is only available with database compatibility mode 130. Compatibility mode 130 uses the new cardinality estimator (CE) by default, and we have seen a few customer cases where the new CE caused query performance regressions. You can work around that by forcing SQL Server to use the old CE via Trace Flag 9481 or by providing ‘use hints’ as described in SQL Server 2016 SP1.
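A minimal sketch of both workarounds (the column name is illustrative; USE HINT requires SQL Server 2016 SP1):

SELECT SUM(SalesAmount)
FROM FactResellerSalesXL_CCI
OPTION (QUERYTRACEON 9481);  -- trace flag approach

SELECT SUM(SalesAmount)
FROM FactResellerSalesXL_CCI
OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));  -- SP1 'use hint' approach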

Thanks,

Sunil Agarwal

SQL Server Tiger Team

Twitter | LinkedIn

Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam

Columnstore Index Performance: SQL Server 2016 – No Performance Cliff


The SQL product team has made significant improvements in columnstore index functionality, supportability and performance in SQL Server 2016 based on feedback from customers. This blog series focuses on the performance improvements done as part of SQL Server 2016. Customers will get these benefits automatically, with no changes to the application, when they upgrade to SQL Server 2016.

The examples in this blog series are based on the AdventureWorksDW2016CTP3 database that you can download from here. For each example, I will run the query in SQL Server 2014 and then contrast that with SQL Server 2016 to reinforce the point that you get performance improvements without requiring any changes to your query or workload.

SQL Server 2014

In SQL Server 2014, Batch Mode execution was only supported when the query was executed with DOP > 1. The basic premise was that customers running big analytics queries will run them on multi-core machines with a high degree of parallelism. While this is all good, imagine you are running your analytics workload on a server with 16 cores and everything is running smoothly. Let us say the concurrent workload spikes; SQL Server can choose to decrease the DOP automatically. In the extreme case, it is possible that the queries get executed with DOP=1. As you can expect, running a query single threaded will increase the query execution time proportionally. However, the impact on queries using columnstore index is a lot more severe because SQL Server 2014 will also switch to Row Mode execution. It’s like double jeopardy – your query slowed down both because of single threaded execution and because of Row Mode execution. This is what I refer to as a Performance Cliff: a sudden, significant drop in query performance.

In the example below, we run a simple aggregate query on a SQL Server where MAXDOP has been configured to 0, which allows SQL Server to choose the available DOP. Notice that this query was run in BatchMode with 4 threads. The query execution time was

CPU time = 626 ms, elapsed time = 354 ms.

perfcliff-1

Now, let me run the same query but specify DOP=1 explicitly to force single threaded execution. As shown in the picture below, the query runs single threaded in RowMode. For this case, the query execution time increased over 7x, from 354 ms to 2617 ms.

SQL Server Execution Times:

   CPU time = 2390 ms, elapsed time = 2617 ms.

perfcliff-2
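The DOP=1 run above can be forced with a query-level hint; a minimal sketch, assuming a simple aggregate query like the one in the pictures:

SELECT SUM(SalesAmount)
FROM FactResellerSalesXL_CCI
OPTION (MAXDOP 1);  -- single threaded: RowMode on SQL Server 2014, BatchMode on SQL Server 2016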

SQL Server 2016

Now, let us run the same query ‘as is’ in SQL Server 2016 on the same machine. The picture below shows the query run single threaded by explicitly forcing DOP=1. Note that the query was executed in BatchMode as shown, so no more Performance Cliff!

The query execution time on SQL Server 2016 was much faster with single threaded execution than what we saw on SQL Server 2014 even with multi-threaded execution. The key reason for this was another performance optimization, aggregate pushdown, in SQL Server 2016.

SQL Server Execution Times:

CPU time = 110 ms, elapsed time = 315 ms.

perfcliff-3

 

As you can see, with SQL Server 2016 there is no more performance cliff and your queries will continue to leverage BatchMode execution irrespective of the degree of parallelism. You will need DBCompat 130 (i.e. the SQL Server 2016 default compatibility level, https://msdn.microsoft.com/en-us/library/bb510680.aspx) for this optimization.

Thanks,

Sunil Agarwal

SQL Server Tiger Team

Twitter | LinkedIn

Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam

Columnstore Index Performance: SQL Server 2016 – Aggregate Pushdown


The SQL product team has made significant improvements in columnstore index functionality, supportability and performance in SQL Server 2016 based on feedback from customers. This blog series focuses on the performance improvements done as part of SQL Server 2016. Please refer to the previous blog No performance cliff in this series for details.

Aggregates are a very common construct in analytics queries. For example, you may want to aggregate sales per quarter for each product you sell. With columnstore index, SQL Server processes aggregates in BatchMode, thereby delivering an order of magnitude better performance when compared to rowstore. SQL Server 2016 takes aggregate performance to the next level by pushing aggregate computations down to the SCAN node. Yes, this improvement is on top of BatchMode execution.

The picture below shows an aggregate query that processes 10 million rows and computes a single aggregate, the SUM of Quantity sold, from the SALES table:
SELECT SUM (Quantity)
FROM SALES

SQL Server 2014 scans these 10 million rows in batches (e.g. 900 rows each) and sends these batches to the Aggregate operator to compute the aggregate. The picture below shows 10 million rows moving from the SCAN node to the Aggregate node.

aggregate-1

In SQL Server 2016, the aggregate operator itself is pushed down to the SCAN node (i.e. closer to the source of the data) and the aggregate is computed there for compressed rowgroups. The picture below shows that 0 rows moved from the SCAN node to the AGGREGATE node. This is because the aggregate was computed at the SCAN node. The dotted line shows that the computed aggregate, 1 row in this case, was sent internally to the output of the AGGREGATE node. A couple of important points to note:

  • The query plan structure is identical between SQL Server 2014 and SQL Server 2016; the main difference is how the rows are processed. For this reason, the aggregate pushdown optimization is available across all database compatibility levels.
  • Aggregate pushdown is only done for the rows in compressed rowgroups. The rows in the delta store will flow from the SCAN node to the AGGREGATE node as before.

aggregate-2

Let us now look at a concrete example that contrasts the aggregate processing between SQL Server 2014 and SQL Server 2016.

SQL Server 2014

The picture below shows an aggregate query and its execution plan. Note that all 11 million rows flow from the SCAN node to the AGGREGATE node.

The query execution time for this query was as follows

 SQL Server Execution Times:   CPU time = 547 ms, elapsed time = 389 ms.

aggregate-3

SQL Server 2016

The same query was run ‘as is’ on SQL Server 2016, and as the picture below shows, the aggregate computation was indeed done at the SCAN node. The query execution time for this case was 2x lower than what we saw with SQL Server 2014. A more interesting number is the CPU time, which was 3x lower than SQL Server 2014. In actual production workloads, we have seen much more dramatic performance gains.

SQL Server Execution Times: CPU time = 171 ms, elapsed time = 167 ms.

aggregate-4

Another interesting thing to notice is that the SCAN node has a property showing the number of rows that were aggregated locally. As expected for the example above, and as shown in the picture below, all the rows were aggregated locally.

aggregate-41

To show that the aggregate pushdown optimization is not available for rows in a delta rowgroup, I copied 50k rows into a new table temp_cci with a clustered columnstore index and then ran the same aggregate query, as shown in the picture below. Note that all 50k rows flow from the SCAN node to the AGGREGATE node.

aggregate-5
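A minimal sketch of this experiment, using an illustrative single-column schema (the real SALES table is wider):

CREATE TABLE temp_cci (Quantity INT);
CREATE CLUSTERED COLUMNSTORE INDEX idx_temp_cci ON temp_cci;

-- 50,000 rows is below the 102,400-row threshold, so they land in a delta rowgroup
INSERT INTO temp_cci (Quantity)
SELECT TOP (50000) Quantity FROM SALES;

-- no aggregate pushdown here: the delta rows flow to the AGGREGATE node
SELECT SUM(Quantity) FROM temp_cci;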

In summary, aggregate pushdown will give you a performance boost automatically when you upgrade your application to SQL Server 2016, requiring no changes to your query. There are some restrictions, as described below:

  • Supported aggregate operators are MIN, MAX, SUM, COUNT and AVG
  • Any datatype <= 64 bits is supported. For example, bigint is supported as its size is 8 bytes, but decimal(38,6) is not because its size is 17 bytes. Also, no string types are supported (see the sketch after this list)
  • The aggregate operator must be on top of a SCAN node or a SCAN node with group by
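For example, under these restrictions (both column names are hypothetical):

SELECT SUM(QuantityBig) FROM SALES;            -- eligible: bigint is 8 bytes
SELECT SUM(AmountHighPrecision) FROM SALES;    -- not eligible: decimal(38,6) is 17 bytes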

 

Thanks,

Sunil Agarwal
SQL Server Tiger Team
Twitter | LinkedIn
Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam


Columnstore Index Performance: SQL Server 2016 – String Predicate Pushdown


The SQL product team has made significant improvements in columnstore index functionality, supportability and performance in SQL Server 2016 based on feedback from customers. This blog series focuses on the performance improvements done as part of SQL Server 2016. Please refer to one of the previous blogs, No performance cliff, for details.

The topic of this blog is how the SQL Server columnstore index processes string data types. When designing a data warehouse, a general recommendation is to use non-string columns to filter rows or to apply join predicates. However, in many customer workloads, we found that string based columns are often used for filtering the data. As you can imagine, applying predicates on string values is a lot more expensive than filtering on an integer column and can slow down analytics queries significantly, especially on large DWs. The columnstore index in SQL Server 2016 allows string predicates to be pushed down to the SCAN node, resulting in significant improvement in query performance. Like before, this speedup is automatic when you upgrade to SQL Server 2016, requiring no changes to your query or workload.

Methodology:

String predicate pushdown leverages the ‘dictionaries’ that are created per column within each compressed rowgroup. Please refer to columnstore index overview and impact of dictionary on rowgroup size for some context. There are two kinds of dictionaries, global and local, but for this discussion, we will not differentiate between the two. The important point to note is that dictionary entries store the full column value and each column segment contains references to the dictionary entries. If the same value is repeated multiple times, it is stored in the dictionary once but referenced multiple times. SQL Server 2016 utilizes dictionary entries to speed up string predicates.

Now, let us first look into how string predicates are processed in SQL Server 2014. The query in the picture below counts the number of orders that have the string ‘tool’ in their name; a hedged sketch of such a query appears after the picture. The ‘FILTER’ node applies the string predicate to each of the rows in BatchMode. Since 10 million rows are scanned, there will be 10 million string comparisons. Note that the query processing still benefits from BatchMode processing.

string-2
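A hedged sketch of such a query (table and column names are assumptions):

SELECT COUNT(*)
FROM SALES
WHERE ProductName LIKE '%tool%';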

SQL Server 2016 leverages the strings stored in the dictionary to minimize string comparisons by pushing the predicate to the SCAN node. If you assume that each item is repeated 100 times on average, then there are approximately 100K distinct values. The reason I say ‘approximately’ is that, assuming 10 compressed rowgroups, these values will be distributed randomly across them. SQL Server compares the strings stored in the dictionary and returns the rows that qualify. For example, if a dictionary entry matches ‘%tool%’, then all referencing rows in the rowgroup are returned. So instead of comparing each value separately, we compare only one. For the example above, this allows us to reduce the string comparisons by approximately 100x, thereby speeding up query performance as shown in the picture below. I have made the picture a bit more complicated to show that 100K rows were in a delta rowgroup, where there is no dictionary, so this optimization can’t be used; but for the other 9.9 million rows, the string predicate was applied to dictionary entries.

string-3

Now, let us take a real example and show its execution on both versions of SQL Server.

SQL Server 2014

As shown in the picture below, the columnstore index was indeed executed in BatchMode and 11+ million rows were returned, so the string predicate was applied to all the rows at the ‘Filter’ node. The query execution time was as follows:

SQL Server Execution Times:

CPU time = 3984 ms, elapsed time = 1185 ms

string-4

SQL Server 2016

SQL Server 2016 pushes the string predicate down to the SCAN node, and only 12 rows are returned, unlike the 11+ million rows in SQL Server 2014. Also, there is no explicit ‘Filter’ node because there were no delta rowgroups. The query execution time improved as well:

SQL Server Execution Times:

CPU time = 2671 ms, elapsed time = 987 ms

string-5

Like before, you get the query performance improvement automatically when you upgrade to SQL Server 2016. This optimization is available across all database compatibility modes.

A couple of important points to note about string predicate pushdown:

  • It works only on compressed rowgroups
  • We only allow up to 64K entries in the bitmap, so if the number of entries in the dictionary is larger than that, this optimization is not available. I expect this to be a rare case

Happy upgrading to SQL Server 2016!

 

Thanks,

Sunil Agarwal

SQL Server Tiger Team

Twitter | LinkedIn
Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam

Columnstore Index Performance: SQL Server 2016 – Window Aggregates in BatchMode


The SQL product team has made significant improvements in columnstore index functionality, supportability and performance in SQL Server 2016 based on feedback from customers. This blog series focuses on the performance improvements done as part of SQL Server 2016. Please refer to one of the blogs, No performance cliff, for details on the database used for the examples here.

SQL Server 2016 introduces a BatchMode execution model for aggregates computed over a set of rows defined by an OVER clause. A set of rows so defined is referred to as a ‘window’ and the aggregates computed over this set of rows are referred to as window aggregates. Please refer to an excellent introduction on window functions/aggregates by one of their most passionate promoters, Itzik Ben-Gan, SQL Server MVP, author and an excellent teacher!

Let us showcase this using the following example query:

SELECT Productkey, OrderQuantity as curqty,
Sum (OrderQuantity) OVER (ORDER BY ProductKey) AS TotalQuantity
FROM FactResellerSalesXL_CCI
WHERE orderdatekey in (20060301,20060401)

SQL Server 2014

The picture below shows the actual execution plan. You will note that the aggregate computations were done in RowMode using a stream aggregate. As expected, the SCAN of the columnstore index was in BatchMode. The execution time of the query was:

SQL Server Execution Times:

CPU time = 139 ms, elapsed time = 396 ms

winaggr-1

SQL Server 2016

The actual execution plan shown below has a new ‘Window Aggregate’ operator that executes in BatchMode. Also, the execution plan is a lot simpler. The execution time with these changes is as follows. Note that the CPU time is much lower with this execution.

SQL Server Execution Times:

CPU time = 62 ms, elapsed time = 228 ms

winaggr-2

While these results may not appear as dramatic on my laptop, the picture below shows the performance gains with window aggregates on a server class machine with a large DW database. The orange bar represents the query speedup we got with the Window Aggregate operator in BatchMode. The highest speedup we saw was 289x!

winaggr-3

Like before, you get this performance boost by just upgrading to SQL Server 2016! This performance improvement is available with DBCOMPAT 130.

 
Thanks,
Sunil Agarwal
SQL Server Tiger Team
Twitter | LinkedIn
Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam

Change Tracking Cleanup – Part 2


In the first part of my Change Tracking Cleanup post, I talked about how automatic and manual cleanup happens in SQL Server. In this post, we will explore in more depth how the cleanup actually works, with the help of some metadata from an actual Change Tracking implementation.

As mentioned in my last post, Change Tracking auto cleanup is a background thread which wakes up at a fixed frequency and purges expired records (records beyond the retention period) from the change tracking side tables. There are two cleanup versions that this thread maintains over the course of the cleanup action: the invalid cleanup version and the hardened cleanup version. When the thread wakes up, it determines the invalid cleanup version. The invalid cleanup version is the change tracking version up to which the auto cleanup task will perform the cleanup for the side tables. The auto cleanup thread traverses the tables that are enabled for change tracking and calls an internal stored procedure in a while loop, deleting 5000 rows in a single call within the loop. The loop terminates only when all the expired records in the side table are removed. This delete query uses the syscommittab table (an in-memory rowstore) to identify the transaction IDs that have a commit timestamp less than the invalid cleanup version. This process is repeated until the cleanup is done for all change tracking side tables in that particular database. Once it is done with the final change tracking side table, it updates the hardened cleanup version to the invalid cleanup version.
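For context, a minimal sketch of enabling change tracking with auto cleanup and a retention period (database and table names are hypothetical):

ALTER DATABASE SalesDB
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.t2
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);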

Every time a checkpoint is run, an internal procedure is called that uses the hardened cleanup version and deletes a minimum of 10k records from the sys.syscommittab table after they are flushed to the disk-based side tables. As you can see, both cleanups (in-memory rowstore and disk based side tables) are interdependent, and an issue with one of them might affect the other, eventually leading to unnecessary records in sys.syscommittab and delays in CHANGETABLE functions. See the screenshot below of an extended event session tracing the checkpoint of a database, which shows operations on the sys.syscommittab table.

clip_image001

Below is the output of calling the manual cleanup stored procedure, sp_flush_CT_internal_table_on_demand. I had inserted 50K rows with random updates into three tables t2, t3 and t4. The change data was cleaned up by the automatic cleanup after the retention period. Post the cleanup, I inserted another 50K rows into the table t2. After that, I executed the manual cleanup procedure, which did not have any cleanup to perform as the change data was within the retention period.


Cleanup Watermark = 103016
Internal Change Tracking table name : change_tracking_885578193
Total rows deleted: 0.

-- Query to fetch the cleanup version for a change tracking table
select
    object_name (object_id) as table_name,
    is_track_columns_updated_on,
    min_valid_version,
    begin_version,
    cleanup_version
from sys.change_tracking_tables

 

The screenshot of the SSMS output grid that you see below is from the query above.

clip_image002

A new extended event, change_tracking_cleanup, was added to track change tracking automatic cleanup activity. The T-SQL script used to fetch the information below can be found in our tigertoolbox github repository.
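A minimal sketch of capturing this event (the full script in the repository adds the relevant fields and actions):

CREATE EVENT SESSION [ct_cleanup] ON SERVER
ADD EVENT sqlserver.change_tracking_cleanup
ADD TARGET package0.event_file (SET filename = N'ct_cleanup');

ALTER EVENT SESSION [ct_cleanup] ON SERVER STATE = START;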

As you can see from the screenshot below, the cleanup task shows you when the cleanup started and completed. Additionally, you get granular details like when the retention timestamp was updated, which is an easy way of correlating the invalid cleanup version to a timestamp value (see the UpdateRetention and UpdateInvalidCleanup steps below). The side table object IDs shown below have line items reflecting the number of rows cleaned up and the start and end of the change tracking cleanup. One aspect to keep in mind is that the update retention timestamp is reported in UTC, and you will need to do the necessary conversion to align the time with the server’s local timezone.

clip_image003

To summarize, we suggest the following steps when troubleshooting change tracking cleanup issues:

1. Ensure that auto cleanup is working properly using the Extended Event “change_tracking_cleanup”

2. If automatic cleanup is running slowly, then you can execute the stored procedure sp_flush_CT_internal_table_on_demand. In SQL Server 2014 Service Pack 2 and above, we provide this stored procedure to assist with Change Tracking cleanup. KB3173157 has more details.
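A minimal usage sketch (see KB3173157 for the exact signature and any optional parameters):

-- flush expired change tracking data on demand
EXEC sys.sp_flush_CT_internal_table_on_demand;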

Amit Banerjee (@banerjeeamit)

Columnstore Index: SQL Server 2016 – Improved DMV performance


The SQL product team has made significant improvements in columnstore index functionality, supportability and performance in SQL Server 2016 based on customer feedback. This blog post focuses on the performance improvements to a DMV done as part of our latest servicing releases of SQL Server 2016: SQL Server 2016 SP1 CU1 and SQL Server 2016 RTM CU4. For a detailed list of other columnstore performance enhancements in SQL Server 2016, please refer to the blog post series from Sunil.

In the latest servicing release for SQL Server 2016, we have modified the DMV sys.dm_db_column_store_row_group_physical_stats to remove some internal inefficiencies, which results in improved query performance and a reduced memory grant requirement for the DMV. The reduced memory grant requirement minimizes the interference of querying this DMV with the concurrent user workload running on the server. This DMV is commonly used by DBAs, for example, to identify index fragmentation or the reason why some of the compressed rowgroups have fewer than 1 million rows.
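For example, a query along these lines is a common way to find compressed rowgroups that were trimmed below the 1 million row maximum:

SELECT object_name(object_id) AS table_name,
       row_group_id,
       state_desc,
       total_rows,
       trim_reason_desc
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE state_desc = 'COMPRESSED'
  AND total_rows < 1048576;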

These changes give us around a 30% speedup in executing this DMV on a large clustered columnstore index with 100 thousand rowgroups and 10 million column segments.

Select * from sys.dm_db_column_store_row_group_physical_stats

                      Elapsed time (seconds)   Memory grant (MB)
SQL 2016 SP1          46                       1235
SQL 2016 SP1 CU1      33                       25

 



If we use the Showplan Comparison tool integrated into SSMS to compare the plan from SQL 2016 SP1 with the one from SQL 2016 SP1 CU1, you will see the following difference in memory grant requirement below.

 


We highly recommend that you plan, test and apply the latest servicing release of SQL Server 2016 to your production workload to leverage this improvement.

Parikshit Savjani
Senior PM, SQL Server Tiger Team
Twitter | LinkedIn
Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam

Unicode strings and implicit conversions

Recently we had an interesting customer question about a seemingly strange (and perhaps not widely known) behavior with implicit conversions to Unicode.

Imagine you declare a non-Unicode string variable, and when concatenating strings that seem to fit the variable declaration, you get a result that is trimmed, although the sum of all string sizes does not go over the variable’s data type limit. Maybe it’s best to use an example with two strings: ‘String’ and 4,000 a’s.

DECLARE @var VARCHAR(8000)
SELECT @var = 'String' + N'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
SELECT DATALENGTH(@var), LEN(@var)

 

image

So we have a limit of 8,000 characters in the variable, but after concatenating a 6 character non-Unicode string with a 4,000 character Unicode expression, the output was trimmed to 4,000 characters, not the expected 4,006.

However, observe that if we had more than 4,000 a’s (still keeping the N prefix), then we would get the expected number of characters in the concatenated string.

Here’s an example adding another a:

DECLARE @var VARCHAR(8000)
SELECT @var = 'String' + N'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
SELECT DATALENGTH(@var), LEN(@var)

 

image

The reason behind this behavior is that when prefixing a string constant with the letter N, the implicit conversion will result in a Unicode string if the constant does not exceed the max length for a Unicode string data type (4,000 characters). Otherwise, the implicit conversion will result in a Unicode large-value type, NVARCHAR(max).

In other words, what happens in the first case is:

  1. The right-hand side expression is implicitly converted to a Unicode string, NVARCHAR(4000).
  2. Concatenation follows the rules of data type precedence, so the entire concatenation is bound by the Unicode string data limit (and is therefore trimmed to 4,000 characters).
  3. The expression is assigned and converted to the variable’s data type, VARCHAR(8000).

But in the second example, when concatenating ‘String’ with a larger than 4,000 character Unicode string, the implicit conversion for the a’s is to NVARCHAR(max), and so the concatenation includes all expected characters.
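One way to get all 4,006 characters, therefore, is to convert the Unicode expression to VARCHAR yourself before the concatenation (REPLICATE is used here for brevity):

DECLARE @var VARCHAR(8000)
SELECT @var = 'String' + CONVERT(VARCHAR(8000), REPLICATE(N'a', 4000))
SELECT DATALENGTH(@var), LEN(@var)  -- 4006, 4006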

Pedro Lopes (@sqlpto) – Senior Program Manager

Introducing VDC_Complete for Backup and Restore applications using SQLVDI


In addition to its built-in functionality for backup and restore, SQL Server is supported by a large number of third-party backup solutions. SQL Server provides application programming interfaces (APIs) that enable independent software vendors to integrate SQL Server backup and restore operations into their products. These APIs are engineered to provide maximum reliability and performance, and support the full range of SQL Server backup and restore functionality, including the full range of hot and snapshot backup capabilities. In the current implementation of the SQL Server Virtual Backup Device Interface (VDI) protocol, the last message sent from SQL Server to the VDI client is a VDC_Flush command. To prevent data loss, the VDI client must finish the backup before responding to the VDC_Flush command. There are certain situations, such as backups of filestream-enabled databases, where a VDC_Flush command can be sent more than once during a backup operation. For certain backup applications, processing more than one VDC_Flush can be a challenge. If the VDI client responds to a VDC_Flush without ensuring the backup is hardened when more data is coming after the VDC_Flush, SQL Server may truncate the transaction log. However, if the backup eventually fails on the VDI client and the transaction log was truncated, data loss might occur. If you don’t test your log backups at regular intervals, you won’t discover that you have a broken transaction log chain until the time you actually need to execute disaster recovery.

If you want to simulate a backup for your SQL Server instance, you can use the SQL Server Backup Simulator, which is available in our tigertoolbox GitHub repository. The updated SQLVDI header files required to use VDC_Complete are available in the Microsoft SQL Server Samples GitHub repository.

Improvement

A change was introduced in SQL Server 2012, SQL Server 2014 and SQL Server 2016 to let backup and restore applications know when SQL Server has completed sending data to the client (VDI), so that they can perform the necessary end-of-backup tasks. KB3188454 has details about the change. This update adds a new VDI command, VDC_Complete, that indicates SQL Server has completed sending data to the VDI client. The VDI client is therefore able to finish the backup before it sends a response to SQL Server. This functionality allows the VDI client to fail the backup in case something goes wrong, and also prevents the transaction log from being truncated before the client application has hardened the log backup.

The improvement was designed keeping backward compatibility in mind since backup applications can target multiple releases and versions of SQL Server at the same time. There can be four different scenarios which are outlined in the table below.

SQL Server Instance (VDI Server) Backup Application (VDI Client) Behavior
Supports VDC_Complete Supports VDC_Complete Client has to request VDF_RequestComplete while fetching the configuration to let the server know that it understands VDC_Complete. Once the server sends back a confirmation using the VDI configuration that it supports VDC_Complete, the client needs to execute the appropriate code path to handle VDC_Complete
Supports VDC_Complete Does not support VDC_Complete Since client does not request VDF_RequestComplete while fetching the configuration, server proceeds using previous behavior to maintain backward compatibility
Does not support VDC_Complete Supports VDC_Complete Server will return a NULL response because it does not support VDC_Complete for the requested feature VDF_RequestComplete
Does not support VDC_Complete Does not support VDC_Complete Behaves with legacy behavior of using only VDC_Flush

VDC_Complete is available for both backup and restore scenarios. If you want to use VDC_Complete for a database restore, that is possible as well; you will need to negotiate (as shown in the sample below) the use of VDC_Complete before the restore while fetching the VDI configuration.

Sample Code

Let us now look at the code changes required in the client-side backup application to make it work with VDC_Complete.

I am going to use references from the sample simple.cpp file available in the “SQL Server Virtual Backup Device Interface (VDI) Specification”. The download location is listed in the references at the end of this post.

A handshake was implemented for the server and client to negotiate whether VDC_Complete is supported. The client requests the VDF_RequestComplete feature while fetching the configuration; when the server receives this feature request, it knows that the client understands VDC_Complete and responds accordingly, indicating that it supports VDC_Complete.

    // Set up the VDI configuration we want to use.
    memset (&config, 0, sizeof(config));
    config.deviceCount = 1;

    // Request the VDC_Complete feature from the server
    config.features = VDF_RequestComplete;

Once the client receives the configuration, it needs to check the features available (see below) by determining if VDF_CompleteEnabled is set. Once the client determines that the server supports VDC_Complete, it can execute the code path which does the appropriate processing (end-of-backup bookkeeping, closing the backup, etc.) after it receives the VDC_Complete message.

    hr = vds->GetConfiguration (10000, &config);

    if (!SUCCEEDED (hr))
    {
        printf_s ("\nError: VDS::Getconfig fails: 0x%X\n", hr);
        if (hr == VD_E_TIMEOUT)
        {
            printf_s ("\nError: Failed to retrieve VDI configuration due to timeout value (10,000 ms).\n");
        }
        goto shutdown;
    }

    // Determine if the server supports VDC_Complete based on configuration parameters returned
    if (!(config.features & VDF_CompleteEnabled))
    {
        printf_s ("\nServer does not support VDC_Complete.");
    }
    else
    {
        printf_s ("\nServer supports VDC_Complete.");
    }

When the backup application receives a VDC_Complete, it needs to harden the backup and complete its bookkeeping tasks before it acknowledges success for the VDC_Complete message (see below). This ensures that SQL Server does not advance the LSN before the client application has hardened the backup, which could otherwise lead to data loss.

    case VDC_Complete:
        // Ensure that bookkeeping is completed.
        printf_s ("\n\nSQL Server has signaled the end of the operation.");

        // Harden the backup and close the file
        completionCode = ERROR_SUCCESS;
        break;

References

How It Works: SQL Server Backup Buffer Exchange (a VDI Focus)

SQL Server Virtual Backup Device Interface (VDI) Specification

SQL Server Backup Simulator

Updated SQLVDI Header files required for VDC_Complete

SQL Server Mysteries: The Case of TDE and Permanent Tempdb Encryption


I’m a huge Sherlock Holmes fan (I’ve read all the books, watch Elementary on CBS every week, and loved the most recent Season Four of Sherlock), so when I recently got a question about some unexplained behavior in SQL Server, I had the idea of posting these as I get and solve them, in the form of a blog series titled SQL Server Mysteries (#sqlmystery). My goal is to resolve mysteries about SQL Server I encounter, but to do so without going straight to the source code first. Rather, I’ll use our source code and colleagues of mine at Microsoft only to validate the answer.

The first case that helped me start this journey was asked by a MVP within the SQL Server community (his first name starts with Joey<g>).

The question went something like this…

“Quick question about a behavior we’re seeing at a customer and in testing. After enabling TDE, we are still seeing TempDB show up as encrypted, as expected, however, after disabling TDE, TempDB still shows as encrypted”.

First, let’s explain the mystery in more detail.

Transparent Data Encryption (TDE) is a feature that was introduced in SQL Server 2008 (and is also available for Azure SQL Database, Azure SQL Data Warehouse, and Parallel Data Warehouse) with the purpose of encrypting your data at rest, that is, ensuring your database is encrypted at the file level. This means that if someone was able to grab your SQL Server database and/or transaction log file, they could not see its contents simply by opening the file (for example, your laptop is stolen and the thief yanks out the hard drive and tries to inspect the files outside of SQL Server).

The process to enable this for a database is described in our documentation. You effectively “turn on” encryption by using ALTER DATABASE SET ENCRYPTION ON. And we state the following in the documentation regarding tempdb:

Transparent Data Encryption and the tempdb System Database

The tempdb system database will be encrypted if any other database on the instance of SQL Server is encrypted by using TDE. This might have a performance effect for unencrypted databases on the same instance of SQL Server. For more information about the tempdb system database, see tempdb Database

What the documentation doesn’t say is what happens when you decrypt all user databases. I’ve seen many in the community say that once you encrypt a user database, tempdb will be permanently encrypted.

At first glance in SQL Server 2016, you can see if a database is encrypted for TDE by looking at sys.databases.is_encrypted. If I enable encryption for a user database using ALTER DATABASE, you will see results like this (note: sys.databases.is_encrypted only shows 1 for tempdb starting in SQL Server 2016):

[image: sys.databases output showing is_encrypted = 1 for the user database and tempdb]
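For reference, the check itself is just a query along these lines against sys.databases:

select name, is_encrypted from sys.databases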

And you can use this DMV to see more details about encryption:

select * from sys.dm_database_encryption_keys

[image: sys.dm_database_encryption_keys output showing rows for the user database and tempdb]

The encryption_state column is documented as:

[image: encryption_state values as documented for sys.dm_database_encryption_keys]

So you can see that both the user database and tempdb are encrypted. What does encrypted mean at this point? Well, for a user database, when you run ALTER DATABASE SET ENCRYPTION ON, we actually create new SQL background tasks to encrypt all pages of the database file and the current contents of the log file. Any future I/O for pages or log will be encrypted “as we write”. For tempdb, we don’t encrypt the current contents of the database, but any future I/O will be encrypted. Common I/O paths for tempdb are sort spills and workfile I/O for hash joins (for gory details about tempdb internals see this PASS talk from 2011).
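As an aside, while that background encryption scan runs on a user database, you can watch its progress with a query like this (percent_complete is only populated while an encryption state change is in progress):

select db_name(database_id) as database_name, encryption_state, percent_complete
from sys.dm_database_encryption_keys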

I’ve seen comments like this indicating that when all user databases have encryption turned off, tempdb remains encrypted until a server restart. This Connect item was marked as By Design because, until the server is restarted, some of your data that was encrypted from a user database could still exist in the tempdb files. But what about after a server restart, and the comments about tempdb encryption being permanent? Knowing that tempdb is recreated after a server restart, my working theory is that it is not truly permanently encrypted. But how do I prove this?

We simply must find a way to trace the encryption of tempdb when it is enabled. Then we will use that same technique to observe that it doesn’t happen when we believe it is disabled, despite what sys.databases indicates.

The #1 tool at my disposal for tracing these days is Extended Events. I searched through the available Extended Events that might have something to do with TDE and/or encryption and found these:

select * from sys.dm_xe_objects where (name like '%tde%' or name like '%encrypt%') and object_type = 'event'

[image: list of TDE/encryption-related Extended Events from sys.dm_xe_objects]

The only one that looked usable is database_tde_encryption_scan_duration. But when I used it, I found that it only fires for the scan of an existing user database when you enable encryption (e.g. scanning the existing data to encrypt it). It is not fired for tempdb (even when encryption for tempdb is considered enabled).

Vowing not to go to the source yet, I decided to use the Windows Debugger and public symbols (so you too can do this at home). Since we are trying to trace encryption, there must be Windows APIs SQL Server uses to encrypt data. Turns out there are a few options, namely BCryptEncrypt() and CryptEncrypt(). According to the MSDN docs, BCryptEncrypt is the “nextgen” API, so I’ll choose that one to try and “trace”. Next, I need a scenario that requires a page write for tempdb. The easiest candidate for this is a sort spill. These show up when you see a Sort Warning (XEvents also has a sort warning event). The basic concept is that a sort is executed as part of a query plan but cannot be done completely in memory, so we must spill part of the sort to disk to complete the sort operation. How do I create one of these sort spills? First, you must build a scenario that requires a sort, which is easy with an ORDER BY on a column that is not indexed. Second, just make the sort large but limit the memory of SQL Server with ‘max server memory’.

Here is the repro. First set it up by running this from SQL Server Management Studio.

use master
go
drop database yourdb
go
create database yourdb
go
use yourdb
go
drop table yourtable
go
create table yourtable (col1 int, col2 char(7000) not null)
go
set nocount on
go
begin tran
declare @x int
set @x = 0
while (@x < 100000)
begin
    insert into yourtable values (@x, 'your row')
    set @x = @x + 1
end
commit tran
go
sp_configure 'show advanced', 1
go
reconfigure
go
sp_configure 'max server memory', 8192
go
reconfigure
go

Now let’s run the query and use STATISTICS XML to see if a spill occurs:

set statistics xml on
go
select * from yourtable
order by col2
go

Note: Be sure to set your ‘max server memory’ back to its default (2147483647) when you are done with this fun exercise.

The resulting execution plan should contain something like this, with the Sort operator:

 

[image: execution plan with the Sort operator]

Now that we have our repro, let’s use the debugger to see whether BCryptEncrypt() is called for these sort spills. We can do this by setting up encryption and encrypting the user database we created called yourdb with these steps:

USE master
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Your Strong Password>'
go
CREATE CERTIFICATE MyServerCert WITH SUBJECT = 'My DEK Certificate'
go
USE yourdb
GO
CREATE DATABASE ENCRYPTION KEY
WITH ALGORITHM = AES_128
ENCRYPTION BY SERVER CERTIFICATE MyServerCert;
GO
ALTER DATABASE yourdb
SET ENCRYPTION ON
go

I now have a spill repro and TDE enabled for a user database. I queried select * from sys.dm_database_encryption_keys to make sure the user db and tempdb are set up for TDE. I will now run the Windows Debugger, attach it to the running SQL Server process using public symbols, and set a breakpoint on BCryptEncrypt().

From my powershell command prompt I run:

windbg -y srv*c:\public_symbols*https://msdl.microsoft.com/download/symbols -pn sqlservr.exe

The –y parameter is the symbol path. c:\public_symbols is a symbol cache and a folder I created on my laptop. This means when symbols are loaded from the http location (the public symbol server) they will be downloaded into that local folder so the next time loading symbols is fast.

Up comes the Debugger and I use the bp command to set a breakpoint on bcrypt!BCryptEncrypt (the BCryptEncrypt API documentation says this function is implemented in bcrypt.dll)

[image: WinDbg window with the bp command setting a breakpoint on bcrypt!BCryptEncrypt]
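In text form, the debugger command sequence is simply (bp sets the breakpoint, g resumes execution):

bp bcrypt!BCryptEncrypt
g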

After typing in g to “go”, I go back and run the query again that caused the sort spill.

In my debugger, it “broke in” by hitting the breakpoint. I used the k command to dump out the call stack to see where in the code this API call was hit (not the full stack, but enough to tell the story):

[image: call stack from the k command with BCryptEncrypt at the top]

The “top” of the stack is our breakpoint for BCryptEncrypt. Notice what is below it. You can see from function calls like QScanSortNew… that this is for a sort. The functions in between are for allocating memory for sorts (the get_bob_buf() call; no relation to the author, bob stands for big output buffer). The Bob::IssueWrite is effectively our “sort spill”, and Page::Encrypt() is the call to “encrypt the page” (before it is written to disk in tempdb).

OK. Now we need to disable TDE for our user database, restart SQL Server, then go back and do essentially the same thing. Make sure you get a sort spill from the query, then use the debugger and breakpoint to see if you hit BCryptEncrypt() again. The steps here are to disable TDE for the user database with ALTER DATABASE, end the debugger session with the .detach command, and then q to exit the debugger. Then restart the SQL Server service. Now repeat the steps above with the debugger to set the breakpoint, ‘g’, and run the sort spill query.
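The disable step is a minimal sketch like this, mirroring the enable script above:

USE master
GO
ALTER DATABASE yourdb
SET ENCRYPTION OFF
go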

In my test I did not hit the breakpoint. And furthermore, you will notice that when you query sys.dm_database_encryption_keys, there is no row for tempdb at all. So our debugger breakpoint proves that tempdb is not permanently encrypted. Instead, if ALL user databases have TDE disabled and you restart SQL Server, tempdb is no longer encrypted. So instead of using sys.databases, use sys.dm_database_encryption_keys to tell which databases are truly enabled for encryption. I then verified my findings in the source code. Basically, we only enable encryption for tempdb if 1) ALTER DATABASE enables TDE for any user database, or 2) at startup, a user database has encryption enabled. I also verified the behavior with my colleagues in the Tiger Team (thank you Ravinder Vuppula). We will look at fixing the issue with sys.databases in the future (ironically, as I said earlier, it was never enabled for tempdb before SQL Server 2016).

I hope you enjoyed not only the details of the mystery, but also the techniques involved in solving it. Look for more SQL Mysteries and their solutions on this blog.

 

Bob Ward


Data Migration Assistant (DMA) v3.0 is now available


Data Migration Assistant (DMA) enables you to upgrade to a modern data platform by detecting compatibility issues that can impact database functionality on your new version of SQL Server. It recommends performance and reliability improvements for your target environment. It also allows you to move not only your schema and data, but also uncontained objects from your source server to your target server.

DMA replaces all previous versions of SQL Server Upgrade Advisor and should be used for upgrades for most SQL Server versions.

In v3.0, DMA enables assessment of your on-premises SQL Server instance before migrating to Azure SQL Database. The assessment workflow helps you detect the following issues that can affect your Azure SQL Database migration:

  • Issues blocking migration
  • Partially supported or unsupported features and functions

The new Azure SQL Database assessment also provides comprehensive recommendations that help fix the reported issues.

For more information, see https://blogs.msdn.microsoft.com/datamigration/2017/01/25/data-migration-assistant-dma-v3-0/

 

Ajay Jagannathan (@ajaymsft)

Principal Program Manager

Upgrading a Replication Topology to SQL Server 2016


SQL Server Replication provides multi-faceted data movement capabilities across SQL Server releases and has been used by customers across the globe for many years. When moving from one major release of SQL Server to another, upgrading the replication topology has been a constant topic of lengthy discussions. In this post, we outline some of the challenges of upgrading SQL Server replication environments to SQL Server 2016. Upgrading a replication topology needs to abide by the following guidelines:

  • A Distributor can be any version as long as it is greater than or equal to the Publisher version (in many cases the Distributor is the same instance as the Publisher).
  • A Publisher can be any version as long as it is less than or equal to the Distributor version.
  • Subscriber version depends on the type of publication:
    • A Subscriber to a transactional publication can be any version within two versions (n-2) of the Publisher version. For example: a SQL Server 2012 Publisher can have SQL Server 2014 and SQL Server 2016 Subscribers; and a SQL Server 2016 Publisher can have SQL Server 2014 and SQL Server 2012 Subscribers.
    • A Subscriber to a merge publication can be any version less than or equal to the Publisher version.
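
Before planning the upgrade order, it helps to inventory the exact version of every instance in the topology. A quick check like this on each instance gives you the inputs for the matrices below:

SELECT SERVERPROPERTY('ServerName') AS server_name,
       SERVERPROPERTY('ProductVersion') AS product_version,
       SERVERPROPERTY('ProductLevel') AS product_level,
       SERVERPROPERTY('Edition') AS edition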

If you had to draw a support matrix for the major release versions for transactional and merge replication, the output would be the two tables shown below.

Transactional Replication Matrix

  • Publisher: SQL Server 2016
    • Distributor: SQL Server 2016
    • Subscriber: SQL Server 2016, SQL Server 2014, SQL Server 2012
  • Publisher: SQL Server 2014
    • Distributor: SQL Server 2016, SQL Server 2014
    • Subscriber: SQL Server 2016, SQL Server 2014, SQL Server 2012, SQL Server 2008 R2, SQL Server 2008
  • Publisher: SQL Server 2012
    • Distributor: SQL Server 2016, SQL Server 2014, SQL Server 2012
    • Subscriber: SQL Server 2016, SQL Server 2014, SQL Server 2012, SQL Server 2008 R2, SQL Server 2008
  • Publisher: SQL Server 2008 R2, SQL Server 2008
    • Distributor: SQL Server 2016, SQL Server 2014, SQL Server 2012, SQL Server 2008 R2, SQL Server 2008
    • Subscriber: SQL Server 2014, SQL Server 2012, SQL Server 2008 R2, SQL Server 2008, SQL Server 2005, SQL Server 2000

Merge Replication Support Matrix

  • Publisher: SQL Server 2016
    • Distributor: SQL Server 2016
    • Subscriber: SQL Server 2016, SQL Server 2014, SQL Server 2012
  • Publisher: SQL Server 2014
    • Distributor: SQL Server 2016, SQL Server 2014
    • Subscriber: SQL Server 2014, SQL Server 2012, SQL Server 2008 R2, SQL Server 2008
  • Publisher: SQL Server 2012
    • Distributor: SQL Server 2012
    • Subscriber: SQL Server 2012, SQL Server 2008 R2, SQL Server 2008
  • Publisher: SQL Server 2008 R2, SQL Server 2008
    • Distributor: SQL Server 2008 R2, SQL Server 2008
    • Subscriber: SQL Server 2008 R2, SQL Server 2008, SQL Server 2005, SQL Server 2000

If you look at the line items for SQL Server 2016, you will see that a number of older versions cannot participate in the topology when SQL Server 2016 is the publisher. Replication topologies have three common deployment patterns, as shown in the Visio diagram below: the distributor can be on the publisher, on the subscriber, or a remote distributor. We also come across deployments where the publisher and subscriber are a mix of standalone instances, SQL Server failover cluster instances, or Always On Availability Group replica instances.

[image: Visio diagram of the three common replication deployment patterns]

Depending on the deployment pattern, the upgrade path to SQL Server 2016 would be different. Let us explore the different possibilities. SQL Server offers two upgrade paths in general:

  • Side-by-side: This approach involves setting up a new parallel environment and moving the databases along with the associated instance level objects like logins, jobs etc. to the new environment.
  • In-place upgrade: With this approach, the SQL Server setup program upgrades the existing SQL Server installation by replacing the existing SQL Server bits with the SQL Server 2016 bits and then upgrades each of the system and user databases. For environments running SQL Server failover cluster instances or Always On Availability Groups, an in-place upgrade is combined with a rolling upgrade to minimize downtime.

The scenarios below apply to Transactional Replication (without P2P Replication, Queued Updating Subscription and Immediate Updating Subscription) and Merge Replication. The options below outline how a phased approach can be adopted for your replication topology upgrade so that you don’t have to upgrade all the SQL Server instances in one big upgrade operation.

A common approach that has been adopted for side-by-side upgrades of replication topologies is to move publisher-subscriber pairs in parts to the new side-by-side environment as opposed to a movement of the entire topology. This phased approach helps to control downtime and minimize the impact to a certain extent for the businesses dependent on replication.

Upgrading a Replication Topology with a Remote Distributor

Transactional Replication

  • Upgrading from SQL Server 2014 / SQL Server 2012:
    • Step 1 (Distributor): In-place upgrade (possible due to n-2 support), OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs in the replication topology*.
    • Step 2 (Publisher/Subscriber): In-place upgrade (possible due to n-2 support), OR side-by-side upgrade. A side-by-side upgrade of the subscriber requires reinitialization of the subscriber; a side-by-side upgrade of the publisher requires reconfiguring all the publisher-subscriber pairs. The publisher and subscriber can be upgraded in any order.
  • Upgrading from SQL Server 2008 R2 / SQL Server 2008:
    • Step 1 (Distributor): In-place upgrade, OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs in the replication topology*.
    • Step 2 (Publisher/Subscriber): An in-place upgrade would need to occur for both publisher and subscriber at the same time, as publisher and subscriber need to be within two major releases of each other (a SQL Server 2008/R2 publisher/subscriber cannot have a SQL Server 2016 publisher/subscriber). OR an intermediate in-place upgrade of the publisher or subscriber to SQL Server 2012/2014, after which the other server in the publisher/subscriber pair can be upgraded to SQL Server 2016. OR a side-by-side upgrade, which requires the upgrade of publisher and subscriber to happen together and a re-setup of the publisher/subscriber pairs.

*See “Side-by-side Upgrade of the Distributor without re-initialization” below

Merge Replication

  • Upgrading from SQL Server 2014 / SQL Server 2016:
    • Step 1 (Distributor): In-place upgrade, OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs in the replication topology*.
    • Step 2 (Publisher): In-place upgrade, OR side-by-side upgrade of the publisher, which requires reconfiguring all the publisher-subscriber pairs.
    • Step 3 (Subscriber): In-place upgrade, OR side-by-side upgrade of the subscriber, which requires reinitialization of the subscriber.
  • Upgrading from SQL Server 2008 R2 / SQL Server 2008:
    • Step 1 (Distributor): In-place upgrade, OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs in the replication topology*.
    • Step 2 (Publisher): In-place upgrade, OR side-by-side upgrade of the publisher, which requires reconfiguring all the publisher-subscriber pairs; needs to happen with the upgrade of the subscriber.
    • Step 3 (Subscriber): In-place upgrade, OR side-by-side upgrade of the subscriber, which requires reinitialization of the subscriber; needs to happen with the upgrade of the publisher.

*See “Side-by-side Upgrade of the Distributor without re-initialization” below

Upgrading a Replication Topology with Publisher acting as the Distributor

Transactional Replication

  • Upgrading from SQL Server 2014 / SQL Server 2012:
    • Step 1 (Publisher/Distributor): In-place upgrade (possible due to n-2 support), OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs in the replication topology*.
    • Step 2 (Subscriber): In-place upgrade (possible due to n-2 support), OR side-by-side upgrade of the subscriber, which requires reinitialization of the subscriber.
  • Upgrading from SQL Server 2008 R2 / SQL Server 2008:
    • Step 1 (Publisher/Distributor): In-place upgrade (requires the subscriber to be upgraded as well, because publisher and subscriber need to be within two major releases; a SQL Server 2016 publisher cannot have a SQL Server 2008/R2 subscriber). OR an intermediate in-place upgrade to SQL Server 2012/2014 for the publisher, which is also acting as the distributor; the subscriber can be upgraded to SQL Server 2016 after this intermediate publisher upgrade. OR a side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs served by this distributor in the replication topology; needs to happen with the upgrade of the subscriber.
    • Step 2 (Subscriber): An in-place upgrade would need to occur for both publisher and subscriber at the same time, as publisher and subscriber need to be within two major releases of each other (a SQL Server 2008/R2 publisher/subscriber cannot have a SQL Server 2016 publisher/subscriber). OR an intermediate in-place upgrade of the publisher to SQL Server 2012/2014, after which the subscriber can be upgraded to SQL Server 2016. OR a side-by-side upgrade, which requires the upgrade of the subscriber to happen together with the publisher and a re-initialization of the subscribers.

*See “Side-by-side Upgrade of the Distributor without re-initialization” below

Merge Replication

  • Upgrading from SQL Server 2014 / SQL Server 2016:
    • Step 1 (Publisher/Distributor): In-place upgrade, OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs served by this distributor in the replication topology*.
    • Step 2 (Subscriber): In-place upgrade, OR side-by-side upgrade of the subscriber, which requires reinitialization of the subscriber.
  • Upgrading from SQL Server 2008 R2 / SQL Server 2008:
    • Step 1 (Publisher/Distributor): In-place upgrade, OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs served by this distributor in the replication topology; needs to happen with the upgrade of the subscriber.
    • Step 2 (Subscriber): In-place upgrade, OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs in the replication topology; needs to happen with the upgrade of the publisher.

*See “Side-by-side Upgrade of the Distributor without re-initialization” below

Upgrading a Replication Topology with Subscriber acting as the Distributor

Transactional Replication

  • Upgrading from SQL Server 2014 / SQL Server 2012:
    • Step 1 (Distributor/Subscriber): In-place upgrade (possible due to n-2 support), OR side-by-side upgrade, which requires re-setup of the publisher/subscriber pairs in the replication topology served by this distributor*.
    • Step 2 (Publisher): In-place upgrade (possible due to n-2 support), OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs.
  • Upgrading from SQL Server 2008 R2 / SQL Server 2008:
    • Step 1 (Distributor/Subscriber): In-place upgrade (requires the publisher to be upgraded as well, because subscriber and publisher need to be within two major releases; a SQL Server 2008/R2 publisher cannot have a SQL Server 2016 subscriber). OR an intermediate in-place upgrade to SQL Server 2012/2014 for the subscriber, which is also acting as the distributor; the publisher can then be upgraded to SQL Server 2016 after this intermediate distributor upgrade. OR a side-by-side upgrade of the distributor/subscriber, which requires re-setup of all the publisher/subscriber pairs served by this distributor; needs to happen with the upgrade of the publisher.
    • Step 2 (Publisher): An in-place upgrade would need to occur for both publisher and subscriber at the same time, as publisher and subscriber need to be within two major releases of each other (a SQL Server 2008/R2 publisher/subscriber cannot have a SQL Server 2016 publisher/subscriber). OR the publisher can be upgraded to SQL Server 2016 after an intermediate in-place upgrade of the subscriber to SQL Server 2012/2014. OR a side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs in the replication topology; needs to happen with the upgrade of the distributor/subscriber.

*See “Side-by-side Upgrade of the Distributor without re-initialization” below

Merge Replication

  • Upgrading from SQL Server 2014 / SQL Server 2016:
    • Step 1 (Distributor/Subscriber): In-place upgrade, OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs in the replication topology served by this distributor*; this also requires an upgrade of the publisher, as the publisher version has to be higher than the subscriber’s.
    • Step 2 (Publisher): In-place upgrade, OR side-by-side upgrade, which requires reinitialization of all publisher/subscriber pairs and a simultaneous upgrade of the subscriber, because it is acting as the distributor.
  • Upgrading from SQL Server 2008 R2 / SQL Server 2008:
    • Step 1 (Distributor/Subscriber): In-place upgrade (requires a simultaneous upgrade of the publisher), OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs in the replication topology and a simultaneous upgrade of the publisher.
    • Step 2 (Publisher): In-place upgrade (requires a simultaneous upgrade of the subscriber acting as the distributor), OR side-by-side upgrade, which requires re-setup of all the publisher/subscriber pairs in the replication topology and a simultaneous upgrade of the subscriber, because it is acting as the distributor.

*See “Side-by-side Upgrade of the Distributor without re-initialization” below

Side-by-side Upgrade of the Distributor without re-initialization

If your SQL Server instance to be upgraded is running on Windows Server 2008 or Windows Server 2008 R2, you will first need to perform a side-by-side upgrade of the distributor to Windows Server 2012 R2 or Windows Server 2016 before upgrading to SQL Server 2016. The reason for this intermediate OS upgrade is that SQL Server 2016 cannot be installed on a Windows Server 2008/2008 R2 server. The side-by-side approach can also help reduce downtime if you are upgrading the hardware of the Windows Server hosting the distributor instance. Downtime of the publisher and subscriber can be reduced using SQL Server failover cluster instances or Always On Availability Groups.

Steps for side-by-side migration of the distributor to Windows Server 2012 R2

  • Set up a new failover cluster or standalone instance running the same major release and version as your distributor on Windows Server 2012 R2/2016, with a different Windows cluster and SQL Server FCI name or standalone host name. You will need to keep the directory structure the same as on the old distributor to ensure that the replication agent executables, replication folders and database file paths are found at the same path in the new environment. This reduces the post-migration/upgrade steps required.
  • Make sure that the current synchronization is complete, and after that, shut down all the replication agents.
  • Shut down the current SQL Server failover cluster instance or standalone instance running as the distributor. If this is a standalone instance of SQL Server, you will need to shut down the Windows Server hosting the SQL Server instance to ensure that there is no conflict while renaming the server.
  • Remove the DNS entries for the old (current distributor instance) environment and the AD entries for the computer object of the SQL Server FCI.
  • If this is a SQL Server failover cluster instance, rename the new SQL Server failover cluster instance with the old virtual server name. If this is a standalone SQL Server instance, rename the new standalone host with the old hostname.
  • Copy the database files from the previous instance using SAN redirection, storage copy or file copy.
  • Bring the new SQL Server instance online.
  • Restart all the replication agents and verify that the agents are running successfully.
  • Validate that replication is working as expected (see the sketch after this list).
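
One quick way to sanity-check the topology from the publisher after the move is sp_replcounters, which reports replicated transaction counts, throughput and latency per published database (a sketch; pair it with your usual agent and subscription monitoring):

EXEC sp_replcounters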

In-place upgrade to SQL Server 2016

  • Run in-place upgrade for SQL Server 2016 on the new cluster
  • If required, rebuild old nodes and add to the cluster to re-use existing hardware
  • Validate that replication is working as expected

 

If you want to reduce the downtime, we recommend that you perform the side-by-side migration of the distributor as one activity and the in-place upgrade to SQL Server 2016 as another activity. This will allow you to take a phased approach, reduce risk and minimize downtime.

Summary

Upgrading a replication topology is a multi-step process. We recommend attempting the upgrade on a replica of your replication topology in a test environment before running it against the actual production environment. This will help iron out the operational documentation required for handling the upgrade smoothly, without incurring expensive and long downtimes during the actual upgrade process. We have seen customers reduce downtime significantly by using Always On Availability Groups and/or SQL Server failover cluster instances in their production environments while upgrading their replication topology. Additionally, we recommend taking backups of all the databases, including master, msdb, the distribution database(s) and the user databases participating in replication, before attempting the upgrade.

Resources

Recovering a Deleted Cluster Name Object (CNO) in a Windows Server 2008 Failover Cluster

Upgrade Replicated Databases

Supported Version and Edition Upgrades for SQL Server 2016

Hardware and Software Requirements for SQL Server 2016

SQL Server Upgrade

Rename a Computer that hosts a Standalone instance of SQL Server

Rename a SQL Server Virtual Server Name

Tigers Down Under at Difinity, Ignite and SQL Saturday


Tigers (@ajaymsft, @s_u_n_e_e_l) are leaving the wet and cold Pacific Northwest for the warm regions down under in Australia and New Zealand. We’ll be there to present and connect with the Data Platform community in New Zealand and Australia through various events.

If you are in any of these cities during this time, feel free to look us up and chat with us.

Difinity 2017 (7 – 9 February 2017, Auckland)

Feb 8 – 3:45 – 5:00 Enable Operational Analytics in SQL Server 2016 and Azure SQL Database (Sunil Agarwal)

Feb 9 10:15 – 11:30 Microsoft BI and Data Platform Panel Discussion (Matt Masson, Sunil Agarwal)

Feb 9 – 3:45 – 5:00 Strategies to Get Maximum Concurrency for Your Workload in SQL Server (Sunil Agarwal)

SQL Saturday #582 (Feb 11, 2017, Melbourne)

 

09:00 AM – 10:00 AM

SQL Server 2016 SP1: Enhancements that will make your SQL Server roar! (Sunil Agarwal)

 

03:15 PM – 04:15 PM

Become A Troubleshooting Ninja with PowerBI and SQL Server Extended Events (Ajay Jagannathan)

 

Microsoft Ignite Australia (Feb 14-17, Broadbeach, QLD)

Feb 15 11:30 AM

Explore In-Memory OLTP Architectures and Customer Case Studies [DA323] (Sunil Agarwal)

Feb 15 3:30 PM

Clustered Columnstore index deep dive [DA325] (Sudarshan Roy, Sunil Agarwal)

Feb 17 11:45 AM

Columnstore Index in SQL Server 2016 – Customer Success Stories [DA343] (Sunil Agarwal)

SQL Saturday #587 (Feb 18, 2017, Sydney)

11:30 AM – 12:30 PM

Become a Troubleshooting Ninja with PowerBI & SQL Extended Events (Ajay Jagannathan)

1:15 PM – 2:15 PM

How SQL Server 2016 SP1 changes the game | Microsoft Dev Team (Sunil Agarwal)

Getting more statistics information programmatically


Statistics are the building blocks the Query Optimizer uses to understand data distribution in tables, and are one of the key inputs for calculating a good enough plan to read data. So, when engaging in query performance troubleshooting, statistics information is something you need to look at.

Let’s say we need to look at statistics information on table SalesOrderDetail in Adventureworks2016CTP3.

SELECT sch.name + '.' + so.name AS [TableName], so.[object_id], ss.name, ss.stats_id,
(SELECT CONVERT(VARCHAR,c.[name]) + N',' AS [data()]
        FROM sys.stats_columns sc
        INNER JOIN sys.columns c ON sc.[object_id] = c.[object_id] AND sc.column_id = c.column_id
        WHERE sc.stats_id = ss.stats_id AND sc.[object_id] = ss.[object_id]
        FOR XML PATH('')) AS Columns
FROM sys.stats ss
INNER JOIN sys.objects so ON ss.object_id = so.object_id
INNER JOIN sys.schemas sch ON so.schema_id = sch.schema_id
WHERE so.name =  N'SalesOrderDetail';
This is the output:
[image: statistics on Sales.SalesOrderDetail with their stats_id and column lists]

For a long time, the DBCC SHOW_STATISTICS command was the only way to get information about statistics. Let’s say I need to look at the stats over the ModifiedDate column.

DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail','_WA_Sys_0000000B_57DD0BE4') WITH STAT_HEADER, HISTOGRAM
The above provides the following:
[image: DBCC SHOW_STATISTICS output with the stats header and histogram]

But if you need to get most of this information programmatically, you would have to dump the output into temp tables and query from there, which can be cumbersome. So, back in SQL Server 2012 we introduced the sys.dm_db_stats_properties DMV. This DMV outputs the equivalent of executing DBCC SHOW_STATISTICS WITH STAT_HEADER (for partitioned tables, see the similar sys.dm_db_incremental_stats_properties). You can see examples of these DMVs being used programmatically in the TigerToolbox GitHub, with BPCheck and AdaptiveIndexDefrag.

Looking at the same stat over ModifiedDate column using sys.dm_db_stats_properties we get:

SELECT * FROM sys.dm_db_stats_properties(1474104292, 4)
[image: sys.dm_db_stats_properties output for the ModifiedDate statistic]

But what if you need to programmatically query the histogram? You would still need to dump DBCC SHOW_STATISTICS to a table and work from there.
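
That old pattern looked something like this (a sketch, reusing the auto-created statistic from the example above):

CREATE TABLE #histogram (range_high_key SQL_VARIANT, range_rows REAL, equal_rows REAL, distinct_range_rows BIGINT, avg_range_rows REAL);
INSERT INTO #histogram
EXEC ('DBCC SHOW_STATISTICS (''Sales.SalesOrderDetail'', ''_WA_Sys_0000000B_57DD0BE4'') WITH HISTOGRAM');
SELECT * FROM #histogram;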

Not anymore. With the latest SQL Server vNext CTP1.3* we introduced a new DMV sys.dm_db_stats_histogram. This DMV returns the statistics histogram for the specified object, equivalent to DBCC SHOW_STATISTICS WITH HISTOGRAM, as seen below for the same stat over ModifiedDate column:

SELECT * FROM sys.dm_db_stats_histogram(1474104292, 4)

[image: sys.dm_db_stats_histogram output for the ModifiedDate statistic]

 

This opens new possibilities to programmatically explore a histogram for a specific query predicate, and understand data distribution inline. We will be updating the relevant scripts in the TigerToolbox Github.
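
For example, here is a sketch (assuming the same object_id and stats_id as above, and a hypothetical predicate value) that finds the histogram step covering a given predicate, which is the basis for estimating row counts inline:

DECLARE @object_id INT = OBJECT_ID(N'Sales.SalesOrderDetail'), @stats_id INT = 4;
DECLARE @value DATETIME = '20140101';  -- hypothetical predicate value

-- The first step whose range_high_key is at or above the value covers the predicate.
SELECT TOP (1) step_number, range_high_key, range_rows, equal_rows, average_range_rows
FROM sys.dm_db_stats_histogram(@object_id, @stats_id)
WHERE range_high_key >= CONVERT(SQL_VARIANT, @value)
ORDER BY step_number;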

* This DMV will also become available in an upcoming SQL Server 2016 SP1 CU.

Pedro Lopes (@sqlpto) – Senior Program Manager

MSSQLTIGER and February PASS Virtual Conferences


Tiger is delivering another session at today’s Performance Virtual Chapter (VC) meeting: a revamp of “Gems to help you troubleshoot query performance”.

You can join us here at 11 AM PST / 2 PM EST on 2/23/2017.

In this session, we will learn about SQL Server enhancements in the most recent versions that can help you troubleshoot query performance, and what is available for a swift and decisive query performance troubleshooting process.
Ranging from new xEvents to Showplan improvements, from LQS (and the underlying infrastructure) to the revised Plan Comparison tool, learn how these can help you streamline the process of troubleshooting query performance and gain faster insights.

You can watch the session here: https://youtu.be/0C6_20QV7CE

Pedro Lopes (@sqlpto) – Senior Program Manager
