top of page

How CE Change Should Impact SQL Queries By Examples Part 6: Complex Join Condition

Updated: Nov 16, 2021


Case 5: New CE simplify the calculation for complex join condition


With more complicated join conditions; for example, queries using a mix of equality and inequality join predicates, the legacy CE estimates the selectivity of individual join predicates and combines them using multiplication. The new CE, however, uses a simpler algorithm that assumes that there is a one-to-many join association between a large table and a small table. This assumes that each row in the large table matches exactly one row in the small table. This algorithm returns the estimated size of the larger input as the join cardinality.

The following example shows cardinality estimates without relying on unique keys. To demonstrate, the following script first drops the primary key constraint for the FactProductInventory table then run a query having a mix of equality and inequality join predicates.

USE [AdventureWorksDW2012];
GO
 
ALTER TABLE dbo.[FactProductInventory]
DROP CONSTRAINT [PK_FactProductInventory];
GO
 
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = On;  -- CardinalityEstimationModelVersion 70
GO
 
-- Enable "Include Actual Execution Plan"
 
SELECT  [fs].[OrderDateKey], [fi].[UnitCost], [fi].[UnitsIn]
FROM    dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
    ON  [fs].[ProductKey] = [fi].[ProductKey] AND
        [fs].[OrderDateKey] = [fi].[DateKey] AND
        [fs].[OrderQuantity] > [fi].[UnitsBalance];  -- estimated number of rows: 346.286
 
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = Off;  -- CardinalityEstimationModelVersion 130
GO
 
SELECT  [fs].[OrderDateKey], [fi].[UnitCost], [fi].[UnitsIn]
FROM    dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
    ON  [fs].[ProductKey] = [fi].[ProductKey] AND
        [fs].[OrderDateKey] = [fi].[DateKey] AND
        [fs].[OrderQuantity] > [fi].[UnitsBalance];  -- estimated number of rows: 98436.9

The screenshot below shows row estimates for the new and legacy CE for the join algorithm operator (Hash Match).

Actual Rows: 22555

New CE Estimated Rows: 98436.9

Legacy CE Estimated Rows: 346.286

The legacy cardinality estimate assumes independence among the join predicates. This results in an underestimate, whereas the new CE simply estimates the join cardinality using cardinality from the larger child-operator input, resulting in an overestimate.

The following script adds back the primary key constraint that was previously removed and rerun the same query

USE [AdventureWorksDW2012];
GO
 
ALTER TABLE dbo.[FactProductInventory]
ADD  CONSTRAINT [PK_FactProductInventory]
PRIMARY KEY CLUSTERED
(
[ProductKey] ASC,
[DateKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY];
GO
 
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = On;  -- CardinalityEstimationModelVersion 70
GO
 
-- Enable "Include Actual Execution Plan"
 
SELECT  [fs].[OrderDateKey], [fi].[UnitCost], [fi].[UnitsIn]
FROM    dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
    ON  [fs].[ProductKey] = [fi].[ProductKey] AND
        [fs].[OrderDateKey] = [fi].[DateKey] AND
        [fs].[OrderQuantity] > [fi].[UnitsBalance];  -- estimated number of rows: 424.798
 
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = Off;  -- CardinalityEstimationModelVersion 130
GO
 
SELECT  [fs].[OrderDateKey], [fi].[UnitCost], [fi].[UnitsIn]
FROM    dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
    ON  [fs].[ProductKey] = [fi].[ProductKey] AND
        [fs].[OrderDateKey] = [fi].[DateKey] AND
        [fs].[OrderQuantity] > [fi].[UnitsBalance];  -- estimated number of rows: 60398

The estimated number of rows is better (closer to the actual number of rows) for both new and legacy CE. the execution plan shapes also change from Hash Match to Merge Join from both CEs

Actual Rows: 22555

New CE Estimated Rows: 60398

Legacy CE Estimated Rows: 424.798


Recent Posts

See All

Comentarios


©2021 by GLIN. Proudly created with Wix.com

bottom of page