top of page
Writer's pictureGeorge Lin

SQL Server Statistics Issues Part1

Key Concepts About Statistics:

  • Statistics are binary large objects (BLOBs) stored in the database metadata

  • Statistics contain statistical information about the distribution of values in one or more columns of a table or indexed view

  • The Query Optimizer uses statistics to estimate the cardinality(number of rows) in the query result

  • Statistics can be created over most, but not all, data type. Generally, only data types that can do comparison support the creation of statistics

  • SQL Server supports statistics on computed columns. This allow the Query Optimizer to make cardinality estimates over express

  • Each statistics object includes a histogram displaying the distribution of values in the first column of the set of key columns

  • The density vector contains one density for each prefix of columns in the statistics object.

  • Statistics on indexes cannot be dropped by using DROP STATISTICS. Statistics remain as long as the index exists.

Missing Statistics:

Missing statistics in association with specific query join or filter predicates should appear as a warning in the query execution plan. If statistics are missing and not generated automatically, run queries below to check if automatic creation of statistics is enable for the databases.

-- Verify AUTO_CREATE_STATISTICS OPTION
SELECT NAME, CASE IS_AUTO_CREATE_STATS_ON WHEN 1 THEN 'AUTO STATS CREATION is enabled' ELSE 'AUTO STATS CREATION is disabled' END AUTO_CREATE_STATISTICS_OPTION
FROM SYS.DATABASES
WHERE NAME='Your_DB_Name'

Automatic creation of statistics should be enable for most cases. if it's disabled for supportability or other reasons, consider creating the required statistics objects manually. Alternatively, if it's appropriate, create the associated index on the columns.


Key concepts about Auto_Create_Statistics option:

  • This option controls if the Query Optimizer, not anything else, should create statistics on individual columns in the query predicts so that the better cardinality estimates can be calculated for the query plan.

  • This option doesn't determine whether statistics get created for indexes. SQL Server automatically create an associated stats object for every rows store index even this option is off.

  • The single-column statistics are created only on columns that do not already have a histogram in an existing statistics object

  • This option doesn't generate filtered statistics, it applies strictly to single-column statistics for the full table

  • It's on by default and should remain on for most cases

Stale Statistics:

Outdated statistics may not necessarily be an issue if the data within the histogram and density vector still represent the current state of the data. Stale statistics can be problematic, however, if they no longer accurately represent the current data distribution. Check the statistics object using DBCC SHOW_STATISTICS to verify if the lack of recent updates is actually causing bad estimates. Additionally, verify the histogram and density vector. If stale statistics are an issue, verify that automatic statistics updates are enabled for the database using sys.databases and the is_auto_update_stats_on column. If these are disabled, verify that manual statistics maintenance is scheduled to occur on an appropriate schedule. Check if individual statistics objects are set to “no recompute”;


Key concepts about Auto_Update_Statistics

  • It's the Query Optimizer, not anything else, that determines if the statistics might be out-of-date and then update them when they are used by a query.

  • The Query Optimizer checks for out-of-date statistics before compiling a query and before executing a cached query plan.

  • The Query Optimizer counts the number of row modifications since the last statistics update and compares it to a statistics recompilation threshold

  • Starting with database compatibility level 130 (SQL Server 2016), for large (NumberOfRows (n) >=500) tables, the Database Engine calculate the statistics recompilation threshold (# modifications) using a new formula: MIN(500+(0.20*n),SQRT(1,000*n)).

  • For SQL Server 2012+, when statistics are updated, if NO data has changed, plans will NOT invalidate

  • This option applies to statistics objects created for indexes, single-columns in query predicates, and statistics manually created with the CREATE STATISTICS statement. It also applies to filtered statistics.

Note:

In SQL Server 2008 R2 through SQL Server 2014 (12.x), or in SQL Server 2016 (13.x) and later under database compatibility level 120 and lower, enable trace flag 2371 so that SQL Server uses a decreasing, dynamic statistics update threshold.


-- Verify AUTO_UPDATE_STATISTICS OPTION
SELECT NAME, CASE IS_AUTO_UPDATE_STATS_ON WHEN 1 THEN 'AUTO STATS UPDATE is enabled' ELSE 'AUTO STATS UPDATE is disabled' END AS AUTO_UPDATE_STATISTICS_OPTION
FROM SYS.DATABASES
WHERE NAME='YOUR_DB_NAME'
 
-- Verify whether the statistics were created with the NORECOMPUTE option
SELECT OBJECT_NAME(O.OBJECT_ID) AS OBJECT_NAME, S.NAME AS STATISTICS_NAME, S.STATS_ID, COL_NAME(SC.OBJECT_ID, SC.COLUMN_ID) AS COLUMN_NAME,
CASE S.NO_RECOMPUTE WHEN 1 THEN 'Statistics were created with the NORECOMPUTE option' ELSE 'Statistics were not created with the NORECOMPUTE option' END AS NORECOMPUTE_OPTION
FROM SYS.STATS AS S INNER JOIN SYS.STATS_COLUMNS AS SC ON S.STATS_ID = SC.STATS_ID AND S.OBJECT_ID = SC.OBJECT_ID 
INNER JOIN SYS.OBJECTS AS O ON S.OBJECT_ID = O.OBJECT_ID
WHERE OBJECT_NAME(O.OBJECT_ID) = 'YOUR_TABLE_NAME'
 
-- Returning all statistics properties for a table
SELECT SP.STATS_ID, NAME, FILTER_DEFINITION, LAST_UPDATED, ROWS, ROWS_SAMPLED, STEPS, UNFILTERED_ROWS, MODIFICATION_COUNTER  
FROM SYS.STATS AS STAT  
CROSS APPLY SYS.DM_DB_STATS_PROPERTIES(STAT.OBJECT_ID, STAT.STATS_ID) AS SP 
WHERE STAT.OBJECT_ID = OBJECT_ID('SCHEMA_NAME.OBJECT_NAME');
 
-- Returning statistics properties for frequently (>1000) modified objects
SELECT OBJ.NAME, OBJ.OBJECT_ID, STAT.NAME AS STATS_NAME, STAT.STATS_ID, LAST_UPDATED, MODIFICATION_COUNTER 
FROM SYS.OBJECTS AS OBJ  
INNER JOIN SYS.STATS AS STAT ON STAT.OBJECT_ID = OBJ.OBJECT_ID 
CROSS APPLY SYS.DM_DB_STATS_PROPERTIES(STAT.OBJECT_ID, STAT.STATS_ID) AS SP 
WHERE MODIFICATION_COUNTER > 1000;
 
-- Returning a list of statistics objects that haven't been updated since @days ago
DECLARE @DAYS INTEGER = 7;
 
SELECT OBJECT_NAME(O.OBJECT_ID) AS OBJECT_NAME, S.NAME AS STATISTICS_NAME, S.STATS_ID, COL_NAME(SC.OBJECT_ID, SC.COLUMN_ID) AS COLUMN_NAME,STATS_DATE(O.OBJECT_ID, S.STATS_ID) AS STATISTICS_UPDATE_DATE
FROM SYS.STATS AS S INNER JOIN SYS.STATS_COLUMNS AS SC ON S.STATS_ID = SC.STATS_ID AND S.OBJECT_ID = SC.OBJECT_ID 
INNER JOIN SYS.OBJECTS AS O ON S.OBJECT_ID = O.OBJECT_ID
WHERE OBJECT_NAME(O.OBJECT_ID) = 'Product' and STATS_DATE(O.OBJECT_ID, S.STATS_ID) < (GETDATE()-@DAYS)
 
-- displays the header, histogram, and density vector in the statistics object.
DBCC SHOW_STATISTICS ('SCHEMA_NAME.OBJECT_NAME', STATISTICS_NAME)


Statistics Object Sampling Issues

For larger tables, only a smaller percentage of pages are sampled so that the histogram, which is limited to 200 total steps, remains a reasonable size. If the column values are very "jagged" (uneven or occurrence), the histogram may not have adequate information representing the non-uniform data distribution. Consider gradually increasing the sample percentage during manual statistics update to improve histogram quality. Larger sample percentage can sometimes cause unacceptable performance overhead for very large tables.

0 comments

Recent Posts

See All

Azure SQL How To - SQL Managed Instance

Task 1: Copy encryption certificate on SQL Server to SQL MI On the source SQL Server machine, in a query window, run script below to...

Comments


bottom of page