top of page
Writer's pictureGeorge Lin

How CE Change Should Impact SQL Queries By Examples Part 7: Base or Simple Containment

Updated: Nov 16, 2021


Case 6: New CE use "base containment" instead of "simple containment"


The CE model for a join predicate involving an equijoin for two tables assumes that join columns should exist on both sides of the join. With additional non-join filter predicates against the join tables in the same query, the legacy CE assumes some level of correlation. This implied correlation is called “simple containment”.

Alternatively, the new CE uses “base containment” as an assumption. This means that the new CE assumes that the filter predicates on separate tables are not correlated with each other.

To illustrate the two different containment assumptions, the following script run a query that joins two tables with an equijoin predicate on ProductID. It also uses one filter predicate that references product color from the Product table. The second filter predicate references record-modified date on the SalesOrderDetail table:

USE [AdventureWorks2012]
GO
-- Enable "Include Actual Execution Plan"
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = On;  -- CardinalityEstimationModelVersion 70
GO
 
SELECT [od].[SalesOrderID], [od].[SalesOrderDetailID]
FROM    Sales.[SalesOrderDetail] AS [od]
INNER JOIN Production.[Product] AS [p]
ON [od].[ProductID] = [p].[ProductID]
WHERE   [p].[Color] = 'Red' AND
[od].[ModifiedDate] = '2012-06-11 00:00:00.000'
 
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = Off;  -- CardinalityEstimationModelVersion 130
GO
 
SELECT [od].[SalesOrderID], [od].[SalesOrderDetailID]
FROM    Sales.[SalesOrderDetail] AS [od]
INNER JOIN Production.[Product] AS [p]
ON [od].[ProductID] = [p].[ProductID]
WHERE   [p].[Color] = 'Red' AND
[od].[ModifiedDate] = '2012-06-11 00:00:00.000'

“Simple containment” assumes, given this query, that sales order rows modified on June 11th in 2012 are all for red-colored products. In this scenario, this assumption is not accurate. This results in an overestimate of 10.4572 rows versus the actual 1 row.

Whereas the new estimated number of rows for the Nested Loops operation using “base containment” is 1.49388, down from 10.4572 rows estimated using “simple containment”. Removing the implied filter correlation moves the estimate closer to the actual number of rows.

Actual Rows: 1

New CE Estimated Rows: 1.49388

Legacy CE Estimated Rows: 10.4572

At a high level, the legacy CE join estimate assumes containment for any arbitrary inputs of the join. Any existing filter predicates load and scale down histograms before merging them using join predicates. The legacy CE behavior assumes that the non-join filter predicates are correlated. The legacy CE expects a higher estimate of row matches between the two tables. This expectation influences the cardinality estimate for the join operation:

Differently, the new CE derives the join selectivity from base-table histograms without scaling down using the associated filter predicates. Instead the new CE computes join selectivity using base-table histograms before applying the selectivity of non-join filters.

0 comments

Recent Posts

See All

Azure SQL How To - SQL Managed Instance

Task 1: Copy encryption certificate on SQL Server to SQL MI On the source SQL Server machine, in a query window, run script below to...

Comments


bottom of page