How CE Change Should Impact SQL Queries By Examples Part 6: Complex Join Condition
- George Lin
- Nov 1, 2021
- 2 min read
Updated: Nov 16, 2021
Case 5: New CE simplify the calculation for complex join condition
With more complicated join conditions; for example, queries using a mix of equality and inequality join predicates, the legacy CE estimates the selectivity of individual join predicates and combines them using multiplication. The new CE, however, uses a simpler algorithm that assumes that there is a one-to-many join association between a large table and a small table. This assumes that each row in the large table matches exactly one row in the small table. This algorithm returns the estimated size of the larger input as the join cardinality.
The following example shows cardinality estimates without relying on unique keys. To demonstrate, the following script first drops the primary key constraint for the FactProductInventory table then run a query having a mix of equality and inequality join predicates.
USE [AdventureWorksDW2012];
GO
ALTER TABLE dbo.[FactProductInventory]
DROP CONSTRAINT [PK_FactProductInventory];
GO
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = On; -- CardinalityEstimationModelVersion 70
GO
-- Enable "Include Actual Execution Plan"
SELECT [fs].[OrderDateKey], [fi].[UnitCost], [fi].[UnitsIn]
FROM dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
ON [fs].[ProductKey] = [fi].[ProductKey] AND
[fs].[OrderDateKey] = [fi].[DateKey] AND
[fs].[OrderQuantity] > [fi].[UnitsBalance]; -- estimated number of rows: 346.286
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = Off; -- CardinalityEstimationModelVersion 130
GO
SELECT [fs].[OrderDateKey], [fi].[UnitCost], [fi].[UnitsIn]
FROM dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
ON [fs].[ProductKey] = [fi].[ProductKey] AND
[fs].[OrderDateKey] = [fi].[DateKey] AND
[fs].[OrderQuantity] > [fi].[UnitsBalance]; -- estimated number of rows: 98436.9
The screenshot below shows row estimates for the new and legacy CE for the join algorithm operator (Hash Match).
Actual Rows: 22555
New CE Estimated Rows: 98436.9
Legacy CE Estimated Rows: 346.286

The legacy cardinality estimate assumes independence among the join predicates. This results in an underestimate, whereas the new CE simply estimates the join cardinality using cardinality from the larger child-operator input, resulting in an overestimate.
The following script adds back the primary key constraint that was previously removed and rerun the same query
USE [AdventureWorksDW2012];
GO
ALTER TABLE dbo.[FactProductInventory]
ADD CONSTRAINT [PK_FactProductInventory]
PRIMARY KEY CLUSTERED
(
[ProductKey] ASC,
[DateKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY];
GO
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = On; -- CardinalityEstimationModelVersion 70
GO
-- Enable "Include Actual Execution Plan"
SELECT [fs].[OrderDateKey], [fi].[UnitCost], [fi].[UnitsIn]
FROM dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
ON [fs].[ProductKey] = [fi].[ProductKey] AND
[fs].[OrderDateKey] = [fi].[DateKey] AND
[fs].[OrderQuantity] > [fi].[UnitsBalance]; -- estimated number of rows: 424.798
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = Off; -- CardinalityEstimationModelVersion 130
GO
SELECT [fs].[OrderDateKey], [fi].[UnitCost], [fi].[UnitsIn]
FROM dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
ON [fs].[ProductKey] = [fi].[ProductKey] AND
[fs].[OrderDateKey] = [fi].[DateKey] AND
[fs].[OrderQuantity] > [fi].[UnitsBalance]; -- estimated number of rows: 60398
The estimated number of rows is better (closer to the actual number of rows) for both new and legacy CE. the execution plan shapes also change from Hash Match to Merge Join from both CEs
Actual Rows: 22555
New CE Estimated Rows: 60398
Legacy CE Estimated Rows: 424.798

Comentarios