top of page
Writer's pictureGeorge Lin

How CE Change Should Impact SQL Queries By Examples Part 5: Multiple Join Condition

Updated: Nov 16, 2021


Case 4: New CE estimate multiple join conditions differently


For joins with a conjunction of equality predicates, the legacy CE computes the selectivity of each equality predicate, assumes independence, and combines them. In contrast, the new CE estimates the join cardinality based on multi-column frequencies computed on the join columns.

The following example will show cardinality estimates from both CEs. To demonstrate, first drops the primary key constraint for the FactProductInventory table so that the unique keys won't be used for estimate.

USE [AdventureWorksDW2012];
GO
 
ALTER TABLE dbo.[FactProductInventory] DROP CONSTRAINT [PK_FactProductInventory];
GO
 
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = On;  -- CardinalityEstimationModelVersion 70
GO
 
-- Enable "Include Actual Execution Plan"
 
SELECT  [fs].[OrderDateKey],
         [fs].[DueDateKey],
         [fs].[ShipDateKey],    
         [fi].[UnitsIn],
         [fi].[UnitsOut],
         [fi].[UnitsBalance]
FROM    dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
    ON  [fs].[ProductKey] = [fi].[ProductKey] AND
        [fs].[OrderDateKey] = [fi].[DateKey];  -- estimated number of rows: 48545.4
 
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = Off;  -- CardinalityEstimationModelVersion 130
GO
 
SELECT  [fs].[OrderDateKey],
         [fs].[DueDateKey],
         [fs].[ShipDateKey],    
         [fi].[UnitsIn],
         [fi].[UnitsOut],
         [fi].[UnitsBalance]
FROM    dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
    ON  [fs].[ProductKey] = [fi].[ProductKey] AND
        [fs].[OrderDateKey] = [fi].[DateKey];  -- estimated number of rows: 98984.5

Executing the multi-predicate query above shows the following row estimates for the legacy and new CE:

Actual Rows: 60398

New CE Estimated Rows: 98984.5

Legacy CE Estimated Rows: 48545.4

The legacy CE uses the selectivity of individual join predicates and combines them using multiplication. The estimate(48545.4) is lower than the actual number of rows(60398). In contrast, the new CE computes the join cardinality by estimating the number of distinct values for join columns from each side. The new CE takes the smaller of the two distinct counts and multiplies by the average frequency from both sides. The result, in this example, is an estimate(98984.5) that is higher than the actual number of rows(60398). For both CEs (legacy and new), the query plan shape is the same Hash Match join operation.

Continuing the example, the following command adds back the primary key constraint that was previously removed:

USE [AdventureWorksDW2012];
GO
 
ALTER TABLE dbo.[FactProductInventory]
ADD  CONSTRAINT [PK_FactProductInventory]
PRIMARY KEY CLUSTERED
(
[ProductKey] ASC,
[DateKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY];
GO
 
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = On;  -- CardinalityEstimationModelVersion 70
GO
 
-- Enable "Include Actual Execution Plan"
 
SELECT  [fs].[OrderDateKey],
         [fs].[DueDateKey],
         [fs].[ShipDateKey],    
         [fi].[UnitsIn],
         [fi].[UnitsOut],
         [fi].[UnitsBalance]
FROM    dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
    ON  [fs].[ProductKey] = [fi].[ProductKey] AND
        [fs].[OrderDateKey] = [fi].[DateKey];  -- estimated number of rows: 60,398
 
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = Off;  -- CardinalityEstimationModelVersion 130
GO
 
SELECT  [fs].[OrderDateKey],
         [fs].[DueDateKey],
         [fs].[ShipDateKey],    
         [fi].[UnitsIn],
         [fi].[UnitsOut],
         [fi].[UnitsBalance]
FROM    dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
    ON  [fs].[ProductKey] = [fi].[ProductKey] AND
        [fs].[OrderDateKey] = [fi].[DateKey];  -- estimated number of rows: 60,398

Re-executing the previous multi-join condition example using both CE versions provides a 100% accurate estimate of 60,398 rows for the Merge Join operation(different plan shape). This is because the new and legacy CE leverage unique keys on join columns to help get accurate distinct-count estimates.

Actual Rows: 60398

New CE Estimated Rows: 60,398

Legacy CE Estimated Rows: 60,398


0 comments

Recent Posts

See All

Azure SQL How To - SQL Managed Instance

Task 1: Copy encryption certificate on SQL Server to SQL MI On the source SQL Server machine, in a query window, run script below to...

Comments


bottom of page