Case 4: New CE estimate multiple join conditions differently
For joins with a conjunction of equality predicates, the legacy CE computes the selectivity of each equality predicate, assumes independence, and combines them. In contrast, the new CE estimates the join cardinality based on multi-column frequencies computed on the join columns.
The following example will show cardinality estimates from both CEs. To demonstrate, first drops the primary key constraint for the FactProductInventory table so that the unique keys won't be used for estimate.
USE [AdventureWorksDW2012];
GO
ALTER TABLE dbo.[FactProductInventory] DROP CONSTRAINT [PK_FactProductInventory];
GO
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = On; -- CardinalityEstimationModelVersion 70
GO
-- Enable "Include Actual Execution Plan"
SELECT [fs].[OrderDateKey],
[fs].[DueDateKey],
[fs].[ShipDateKey],
[fi].[UnitsIn],
[fi].[UnitsOut],
[fi].[UnitsBalance]
FROM dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
ON [fs].[ProductKey] = [fi].[ProductKey] AND
[fs].[OrderDateKey] = [fi].[DateKey]; -- estimated number of rows: 48545.4
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = Off; -- CardinalityEstimationModelVersion 130
GO
SELECT [fs].[OrderDateKey],
[fs].[DueDateKey],
[fs].[ShipDateKey],
[fi].[UnitsIn],
[fi].[UnitsOut],
[fi].[UnitsBalance]
FROM dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
ON [fs].[ProductKey] = [fi].[ProductKey] AND
[fs].[OrderDateKey] = [fi].[DateKey]; -- estimated number of rows: 98984.5
Executing the multi-predicate query above shows the following row estimates for the legacy and new CE:
Actual Rows: 60398
New CE Estimated Rows: 98984.5
Legacy CE Estimated Rows: 48545.4
The legacy CE uses the selectivity of individual join predicates and combines them using multiplication. The estimate(48545.4) is lower than the actual number of rows(60398). In contrast, the new CE computes the join cardinality by estimating the number of distinct values for join columns from each side. The new CE takes the smaller of the two distinct counts and multiplies by the average frequency from both sides. The result, in this example, is an estimate(98984.5) that is higher than the actual number of rows(60398). For both CEs (legacy and new), the query plan shape is the same Hash Match join operation.
Continuing the example, the following command adds back the primary key constraint that was previously removed:
USE [AdventureWorksDW2012];
GO
ALTER TABLE dbo.[FactProductInventory]
ADD CONSTRAINT [PK_FactProductInventory]
PRIMARY KEY CLUSTERED
(
[ProductKey] ASC,
[DateKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY];
GO
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = On; -- CardinalityEstimationModelVersion 70
GO
-- Enable "Include Actual Execution Plan"
SELECT [fs].[OrderDateKey],
[fs].[DueDateKey],
[fs].[ShipDateKey],
[fi].[UnitsIn],
[fi].[UnitsOut],
[fi].[UnitsBalance]
FROM dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
ON [fs].[ProductKey] = [fi].[ProductKey] AND
[fs].[OrderDateKey] = [fi].[DateKey]; -- estimated number of rows: 60,398
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = Off; -- CardinalityEstimationModelVersion 130
GO
SELECT [fs].[OrderDateKey],
[fs].[DueDateKey],
[fs].[ShipDateKey],
[fi].[UnitsIn],
[fi].[UnitsOut],
[fi].[UnitsBalance]
FROM dbo.[FactInternetSales] AS [fs]
INNER JOIN dbo.[FactProductInventory] AS [fi]
ON [fs].[ProductKey] = [fi].[ProductKey] AND
[fs].[OrderDateKey] = [fi].[DateKey]; -- estimated number of rows: 60,398
Re-executing the previous multi-join condition example using both CE versions provides a 100% accurate estimate of 60,398 rows for the Merge Join operation(different plan shape). This is because the new and legacy CE leverage unique keys on join columns to help get accurate distinct-count estimates.
Actual Rows: 60398
New CE Estimated Rows: 60,398
Legacy CE Estimated Rows: 60,398
Comments