top of page
Writer's pictureGeorge Lin

Automatic Alerting High IO Latency On Azure SQL Managed Instances

This article shows my way of monitoring database disk IO latency on General Purpose SQL MI. To learn why IO latency could be a big performance concern, I would recommend starting here.



This solution use a temporal table, dbo.IO_VIRTUAL_FILE_STATS_SNAPSHOT, to track a full history of IO latency changes on all database files so that I can write code to do point in time analysis.

ALTERDATABASE [Admin_DB] SET TEMPORAL_HISTORY_RETENTION ON
GO

USE [Admin_DB]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[IO_VIRTUAL_FILE_STATS_SNAPSHOT](
	[DB_NAME] [sysname] NOT NULL,
	[DATABASE_ID] [smallint] NOT NULL,
	[FILE_NAME] [sysname] NOT NULL,
	[SIZE_GB] [int] NOT NULL,
	[FILE_ID] [smallint] NOT NULL,
	[IO_STALL_READ_MS] [bigint] NOT NULL,
	[IO_STALL_WRITE_MS] [bigint] NOT NULL,
	[IO_STALL_QUEUED_READ_MS] [bigint] NOT NULL,
	[IO_STALL_QUEUED_WRITE_MS] [bigint] NOT NULL,
	[IO_STALL] [bigint] NOT NULL,
	[NUM_OF_BYTES_READ] [bigint] NOT NULL,
	[NUM_OF_BYTES_WRITTEN] [bigint] NOT NULL,
	[NUM_OF_READS] [bigint] NOT NULL,
	[NUM_OF_WRITES] [bigint] NOT NULL,
	[TITLE] [nvarchar](500) NULL,
	[INTERVAL_MI] [bigint] NULL,
	[START_TIME] [datetime2](7) GENERATED ALWAYS AS ROW START NOT NULL,
	[END_TIME] [datetime2](7) GENERATED ALWAYS AS ROW END NOT NULL,
PRIMARY KEY CLUSTERED 
(
	[DB_NAME] ASC,
	[FILE_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY],
	PERIOD FOR SYSTEM_TIME ([START_TIME], [END_TIME])
) ON [PRIMARY]
WITH
(
SYSTEM_VERSIONING = ON 
	( HISTORY_TABLE = [dbo].[IO_VIRTUAL_FILE_STATS_SNAPSHOT_HISTORY],
	  HISTORY_RETENTION_PERIOD = 3 MONTHS
	)
)
GO

ALTER TABLE [dbo].[IO_VIRTUAL_FILE_STATS_SNAPSHOT]  WITH CHECK ADD CHECK  (([NUM_OF_READS]>=(0)))
GO

ALTER TABLE [dbo].[IO_VIRTUAL_FILE_STATS_SNAPSHOT]  WITH CHECK ADD CHECK  (([NUM_OF_WRITES]>=(0)))
GO

Step 2: Create a stored procedure to track file IO statistics

This stored procedure, dbo.Track_File_IO_Stat simply call the SQL Server dynamic management function named SYS.DM_IO_VIRTUAL_FILE_STATS and merge (update existing or insert new rows) the returned table into the temporal table dbo.IO_VIRTUAL_FILE_STATS_SNAPSHOT. The SP takes three input parameters:

  • @Title - the parameter value is used to update the Title column so that the rows can be queried basing the title. if not specified, the default is the timestamp when the row is updated or inserted

  • @DB_Name - specify the target database. If NULL is specified, all databases in the SQL Managed Instance are returned.

  • @File_Name - specify the target file in SYS.MASTER_FILES . If NULL is specified, all files of the target database are returned. Note that this parameter takes the logical file name, not a physical name(full path).

A SQL Agent job, which will be created in the later step, calls this SP as the firs execution step every time it runs.

USE [Admin_DB]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER OFF
GO

CREATE   
PROCEDURE [dbo].[Track_File_IO_Stat] @title nvarchar(200) = NULL, @DB_NAME SYSNAME = NULL, @FILE_NAME SYSNAME = NULL
AS BEGIN
MERGE DBO.IO_VIRTUAL_FILE_STATS_SNAPSHOT AS TARGET
USING (
	SELECT 	DB_NAME = DB_NAME(VFS.DATABASE_ID)
	,VFS.DATABASE_ID
	,FILE_NAME = [MF].[NAME]
	,SIZE_GB = 8. * MF.SIZE /1024/ 1024,[VFS].[FILE_ID]
	,[IO_STALL_READ_MS]
	,[IO_STALL_WRITE_MS]
	,[IO_STALL_QUEUED_READ_MS]
	,[IO_STALL_QUEUED_WRITE_MS]
	,[IO_STALL]
	,[NUM_OF_BYTES_READ]
	,[NUM_OF_BYTES_WRITTEN]
	,[NUM_OF_READS]
	,[NUM_OF_WRITES]
	FROM SYS.DM_IO_VIRTUAL_FILE_STATS (DB_ID(@DB_NAME),NULL) AS [VFS]
		JOIN SYS.MASTER_FILES AS [MF] ON [VFS].[DATABASE_ID] = [MF].[DATABASE_ID] 
		AND [VFS].[FILE_ID] = [MF].[FILE_ID]
		AND (@FILE_NAME IS NULL OR [MF].[NAME] = @FILE_NAME)
	) AS SOURCE
ON (TARGET.DB_NAME = SOURCE.DB_NAME AND TARGET.FILE_ID = SOURCE.FILE_ID)
WHEN MATCHED THEN
UPDATE SET
	TARGET.[SIZE_GB] = SOURCE.[SIZE_GB],
	TARGET.[IO_STALL_READ_MS] = SOURCE.[IO_STALL_READ_MS],
	TARGET.[IO_STALL_WRITE_MS] = SOURCE.[IO_STALL_WRITE_MS],
	TARGET.[IO_STALL_QUEUED_READ_MS] = SOURCE.[IO_STALL_QUEUED_READ_MS], 
	TARGET.[IO_STALL_QUEUED_WRITE_MS] = SOURCE.[IO_STALL_QUEUED_WRITE_MS], 
	TARGET.[IO_STALL] = SOURCE.[IO_STALL] ,
	TARGET.[NUM_OF_BYTES_READ] = SOURCE.[NUM_OF_BYTES_READ] ,
	TARGET.[NUM_OF_BYTES_WRITTEN] = SOURCE.[NUM_OF_BYTES_WRITTEN] ,
	TARGET.[NUM_OF_READS] = SOURCE.[NUM_OF_READS] ,
	TARGET.[NUM_OF_WRITES] = SOURCE.[NUM_OF_WRITES] ,
	Target.title = ISNULL(@title, CONVERT(VARCHAR(30), GETDATE(), 20)) ,
	TARGET.INTERVAL_MI = DATEDIFF_BIG(MI,TARGET.START_TIME, GETUTCDATE())
WHEN NOT MATCHED BY TARGET THEN
INSERT (DB_NAME
	,DATABASE_ID
	,FILE_NAME
	,SIZE_GB
	,[FILE_ID]
	,[IO_STALL_READ_MS]
	,[IO_STALL_WRITE_MS]
	,[IO_STALL_QUEUED_READ_MS]
	,[IO_STALL_QUEUED_WRITE_MS]
	,[IO_STALL]
	,[NUM_OF_BYTES_READ]
	,[NUM_OF_BYTES_WRITTEN]
	,[NUM_OF_READS]
	,[NUM_OF_WRITES]
	,title)
VALUES (SOURCE.DB_NAME
	,SOURCE.DATABASE_ID
	,SOURCE.FILE_NAME
	,SOURCE.SIZE_GB
	,SOURCE.[FILE_ID]
	,SOURCE.[IO_STALL_READ_MS]
	,SOURCE.[IO_STALL_WRITE_MS]
	,SOURCE.[IO_STALL_QUEUED_READ_MS]
	,SOURCE.[IO_STALL_QUEUED_WRITE_MS]
	,SOURCE.[IO_STALL]
	,SOURCE.[NUM_OF_BYTES_READ]
	,SOURCE.[NUM_OF_BYTES_WRITTEN]
	,SOURCE.[NUM_OF_READS]
	,SOURCE.[NUM_OF_WRITES]
	,ISNULL(@title, CONVERT(VARCHAR(30), GETDATE(), 20)) );
END
GO

Step 3: Create a Table-valued function to calculate IO latency

The table-valued function returns file IO latency statistics at any point in the past basing on the values of the following three input parameters:

  • @DATE_TIME - the parameter value is used in temporal table query sub-clause FOR SYSTEM_TIME AS OF to specify a point in time in the past. The row is returned if the START_TIME value is less than or equal to the @DATE_TIME parameter value and the END_TIME value is greater than the @DATE_TIME parameter value.

  • @DB_NAME - specify the target database. If NULL is specified, all databases in the SQL Managed Instance are returned.

  • @TITLE - If specified, return all rows with Title value equal @TITLE.

If you provide a valid value for both @DATE_TIME and @TITLE, the table returned by the function contains two rows for each file. The latency values are calculated by comparing the target rows (filtered by @DATE_TIME or @TITLE) with the reference rows in the preceding snapshot (target.START_TIME = Reference.END_TIME)


If both @DATE_TIME and @TITLE are null, the function calculate the IO latency by comparing the current statistic values returned from SYS.DM_IO_VIRTUAL_FILE_STATS with the most recent records in the temporal table dbo.IO_VIRTUAL_FILE_STATS_SNAPSHOT


USE [Admin_DB]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER OFF
GO

CREATE FUNCTION [dbo].[Calculate_IO_Latency](@DB_NAME SYSNAME = NULL, @DATE_TIME DATETIME2 = NULL, @TITLE NVARCHAR(100) = NULL)  -- DATE_TIME IS A POINT IN TIME IN THE PAST
RETURNS TABLE
AS RETURN (
	-- FOR TESTING: DECLARE @DB_NAME SYSNAME = DB_NAME(), @DATE_TIME DATETIME2 = NULL, @TITLE NVARCHAR(100) = NULL;
WITH Curr
(	[DB_NAME]
	,DATABASE_ID
	,[FILE_ID]
	,[SIZE_GB]
	,[IO_STALL_READ_MS]
	,[IO_STALL_WRITE_MS]
	,[IO_STALL_QUEUED_READ_MS]
	,[IO_STALL_QUEUED_WRITE_MS]
	,[IO_STALL]
	,[NUM_OF_BYTES_READ]
	,[NUM_OF_BYTES_WRITTEN]
	,[NUM_OF_READS]
	,[NUM_OF_WRITES]
	,TITLE
	,START_TIME
	,END_TIME)
	AS (
	SELECT S.DB_NAME
		,DATABASE_ID
		,[FILE_ID]
		,[SIZE_GB]
		,[IO_STALL_READ_MS]
		,[IO_STALL_WRITE_MS]
		,[IO_STALL_QUEUED_READ_MS]
		,[IO_STALL_QUEUED_WRITE_MS]
		,[IO_STALL]
		,[NUM_OF_BYTES_READ]
		,[NUM_OF_BYTES_WRITTEN]
		,[NUM_OF_READS]
		,[NUM_OF_WRITES]
		,TITLE
		,START_TIME
		,END_TIME
	FROM DBO.IO_VIRTUAL_FILE_STATS_SNAPSHOT FOR SYSTEM_TIME AS OF @DATE_TIME S
	WHERE @DATE_TIME IS NOT NULL AND (@DB_NAME IS NULL OR S.DB_NAME = @DB_NAME)

	UNION ALL

	SELECT S.DB_NAME
		,DATABASE_ID
		,[FILE_ID]
		,[SIZE_GB]
		,[IO_STALL_READ_MS]
		,[IO_STALL_WRITE_MS]
		,[IO_STALL_QUEUED_READ_MS]
		,[IO_STALL_QUEUED_WRITE_MS]
		,[IO_STALL]
		,[NUM_OF_BYTES_READ]
		,[NUM_OF_BYTES_WRITTEN]
		,[NUM_OF_READS]
		,[NUM_OF_WRITES]
		,TITLE 
		,START_TIME
		,END_TIME
	FROM DBO.IO_VIRTUAL_FILE_STATS_SNAPSHOT FOR SYSTEM_TIME ALL AS S
	WHERE @TITLE IS NOT NULL
	AND TITLE = @TITLE
	AND (@DB_NAME IS NULL OR S.DB_NAME = @DB_NAME)

	UNION ALL

	SELECT DB_NAME(S.DATABASE_ID) AS DB_NAME
		,S.DATABASE_ID
		,S.[FILE_ID]
		,[SIZE_GB]=8.*MF.SIZE/1024/1024
		,[IO_STALL_READ_MS]
		,[IO_STALL_WRITE_MS]
		,[IO_STALL_QUEUED_READ_MS]
		,[IO_STALL_QUEUED_WRITE_MS]
		,[IO_STALL]
		,[NUM_OF_BYTES_READ]
		,[NUM_OF_BYTES_WRITTEN]
		,[NUM_OF_READS]
		,[NUM_OF_WRITES]
		,TITLE = 'Latest'
		,START_TIME=GETUTCDATE()
		,END_TIME=CAST('9999-12-31T00:00:00.0000' AS DATETIME2)
	FROM SYS.DM_IO_VIRTUAL_FILE_STATS (DB_ID(@DB_NAME), NULL) S
	JOIN SYS.MASTER_FILES MF ON DB_NAME(MF.DATABASE_ID) = DB_NAME(S.DATABASE_ID) AND MF.FILE_ID = S.FILE_ID
	WHERE @TITLE IS NULL AND @DATE_TIME IS NULL
)
	SELECT
		DB_NAME = PREV.DB_NAME,
		CURR.DATABASE_ID,
		FILE_NAME = PREV.FILE_NAME,
		CURR.SIZE_GB,
		THROUGHPUT_MBPS
			= CAST((CURR.NUM_OF_BYTES_READ - PREV.NUM_OF_BYTES_READ)/1024.0/1024.0 / (DATEDIFF(MILLISECOND, PREV.START_TIME, CURR.START_TIME) / 1000.) AS NUMERIC(10,2))
			+ CAST((CURR.NUM_OF_BYTES_WRITTEN - PREV.NUM_OF_BYTES_WRITTEN)/1024.0/1024.0 / (DATEDIFF(MILLISECOND, PREV.START_TIME, CURR.START_TIME) / 1000.) AS NUMERIC(10,2)),
		READ_MBPS
			= CAST((CURR.NUM_OF_BYTES_READ - PREV.NUM_OF_BYTES_READ)/1024.0/1024.0 / (DATEDIFF(MILLISECOND, PREV.START_TIME, CURR.START_TIME) / 1000.) AS NUMERIC(10,2)),
		WRITE_MBPS
			= CAST((CURR.NUM_OF_BYTES_WRITTEN - PREV.NUM_OF_BYTES_WRITTEN)/1024.0/1024.0 / (DATEDIFF(MILLISECOND, PREV.START_TIME, CURR.START_TIME) / 1000.) AS NUMERIC(10,2)),
		IOPS
			= CAST((CURR.NUM_OF_READS - PREV.NUM_OF_READS + CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES)/ (DATEDIFF(MILLISECOND, PREV.START_TIME, CURR.START_TIME) / 1000.) AS NUMERIC(10,0)),
		READ_IOPS
			= CAST((CURR.NUM_OF_READS - PREV.NUM_OF_READS)/ (DATEDIFF(MILLISECOND, PREV.START_TIME, CURR.START_TIME) / 1000.) AS NUMERIC(10,0)),
		WRITE_IOPS
			= CAST((CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES)/ (DATEDIFF(MILLISECOND, PREV.START_TIME, CURR.START_TIME) / 1000.) AS NUMERIC(10,0)),
		LATENCY_MS
			= CASE WHEN ( (CURR.NUM_OF_READS - PREV.NUM_OF_READS) = 0 AND (CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES) = 0)
				THEN NULL ELSE (CAST(ROUND(1.0 * (CURR.IO_STALL - PREV.IO_STALL) / ((CURR.NUM_OF_READS - PREV.NUM_OF_READS) + (CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES)), 1) AS NUMERIC(10,1))) END,
		READ_LATENCY_MS
			= CASE WHEN (CURR.NUM_OF_READS - PREV.NUM_OF_READS) = 0
				THEN NULL ELSE (CAST(ROUND(1.0 * (CURR.IO_STALL_READ_MS - PREV.IO_STALL_READ_MS) / (CURR.NUM_OF_READS - PREV.NUM_OF_READS), 1) AS NUMERIC(10,1))) END,
		WRITE_LATENCY_MS
			= CASE WHEN (CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES) = 0
				THEN NULL ELSE (CAST(ROUND(1.0 * (CURR.IO_STALL_WRITE_MS - PREV.IO_STALL_WRITE_MS) / (CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES), 1) AS NUMERIC(10,1))) END,
		READ_IO_LATENCY_MS =
			CASE WHEN (CURR.NUM_OF_READS - PREV.NUM_OF_READS) = 0
				THEN NULL ELSE
			CAST(ROUND(((CURR.IO_STALL_READ_MS-CURR.IO_STALL_QUEUED_READ_MS) - (PREV.IO_STALL_READ_MS - PREV.IO_STALL_QUEUED_READ_MS))/(CURR.NUM_OF_READS - PREV.NUM_OF_READS),2) AS NUMERIC(10,2))
			END,
		WRITE_IO_LATENCY_MS =
		CASE WHEN (CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES) = 0
				THEN NULL
				ELSE CAST(ROUND(((CURR.IO_STALL_WRITE_MS-CURR.IO_STALL_QUEUED_WRITE_MS) - (PREV.IO_STALL_WRITE_MS - PREV.IO_STALL_QUEUED_WRITE_MS))/(CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES),2) AS NUMERIC(10,2))
			END,
		KB_PER_READ
			= CASE WHEN (CURR.NUM_OF_READS - PREV.NUM_OF_READS) = 0
				THEN NULL ELSE CAST(((CURR.NUM_OF_BYTES_READ - PREV.NUM_OF_BYTES_READ) / (CURR.NUM_OF_READS - PREV.NUM_OF_READS))/1024.0 AS NUMERIC(10,1)) END,
		KB_PER_WRITE
			= CASE WHEN (CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES) = 0
				THEN NULL ELSE CAST(((CURR.NUM_OF_BYTES_WRITTEN - PREV.NUM_OF_BYTES_WRITTEN) / (CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES))/1024.0 AS NUMERIC(10,1)) END,
		KB_PER_IO
			= CASE WHEN ((CURR.NUM_OF_READS - PREV.NUM_OF_READS) = 0 AND (CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES) = 0)
				THEN NULL ELSE CAST(
					(((CURR.NUM_OF_BYTES_READ - PREV.NUM_OF_BYTES_READ) + (CURR.NUM_OF_BYTES_WRITTEN - PREV.NUM_OF_BYTES_WRITTEN)) /
					((CURR.NUM_OF_READS - PREV.NUM_OF_READS) + (CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES)))/1024.0
					 AS NUMERIC(10,1)) END,
		READ_MB = CAST((CURR.NUM_OF_BYTES_READ - PREV.NUM_OF_BYTES_READ)/1024.0/1024 AS NUMERIC(10,2)),
		WRITE_MB = CAST((CURR.NUM_OF_BYTES_WRITTEN - PREV.NUM_OF_BYTES_WRITTEN)/1024.0/1024 AS NUMERIC(10,2)),
		NUM_OF_READS = CURR.NUM_OF_READS - PREV.NUM_OF_READS,
		NUM_OF_WRITES = CURR.NUM_OF_WRITES - PREV.NUM_OF_WRITES,
		INTERVAL_MI = DATEDIFF(MINUTE, PREV.START_TIME, CURR.START_TIME),
		[TYPE] = MF.TYPE_DESC
	FROM CURR
		JOIN DBO.IO_VIRTUAL_FILE_STATS_SNAPSHOT FOR SYSTEM_TIME ALL AS PREV
			ON CURR.DB_NAME = PREV.DB_NAME AND CURR.FILE_ID = PREV.FILE_ID AND CURR.DATABASE_ID = PREV.DATABASE_ID
			AND (
				((@DATE_TIME IS NOT NULL OR @TITLE IS NOT NULL) AND CURR.START_TIME = PREV.END_TIME)
				OR
				((@DATE_TIME IS NULL AND @TITLE IS NULL) AND PREV.END_TIME > GETUTCDATE())				
			)

		JOIN SYS.MASTER_FILES MF ON CURR.DB_NAME = DB_NAME(MF.DATABASE_ID) AND CURR.FILE_ID = MF.FILE_ID
	WHERE (@DB_NAME IS NULL OR @DB_NAME = PREV.DB_NAME)
)
GO

Step 4: Create a stored procedure to check file IO latency

This stored procedure, dbo.Check_File_IO_Performance, check the real time read/write file IO latency and automatically send out an email alert if the latency is higher than the defined threshold. The SP is called by the same SQL Agent job which calls dbo.Track_File_IO_Stat first then this one. Below is the two input parameters:

  • @IOLATENCY_WARN_LEVEL - specify read or write latency threshold in milliseconds

  • @EMAIL_LIST - specify who gets email alerts

First, create the table, dbo.SLOW_IO_FILES_HISTORY required by the stored procedure:

USE [Admin_DB]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[SLOW_IO_FILES_HISTORY](
	[DB_NAME] [sysname] NULL,
	[FILE_NAME] [sysname] NOT NULL,
	[SIZE_GB] [numeric](23, 11) NULL,
	[THROUGHPUT_MBPS] [numeric](11, 2) NULL,
	[MAX_THROUGHPUT] [int] NOT NULL,
	[READ_MBPS] [numeric](10, 2) NULL,
	[WRITE_MBPS] [numeric](10, 2) NULL,
	[IOPS] [numeric](10, 0) NULL,
	[MAX_IOPS] [int] NOT NULL,
	[READ_IOPS] [numeric](10, 0) NULL,
	[WRITE_IOPS] [numeric](10, 0) NULL,
	[LATENCY_MS] [numeric](10, 1) NULL,
	[READ_LATENCY_MS] [numeric](10, 1) NULL,
	[WRITE_LATENCY_MS] [numeric](10, 1) NULL,
	[READ_IO_LATENCY_MS] [numeric](10, 2) NULL,
	[WRITE_IO_LATENCY_MS] [numeric](10, 2) NULL,
	[KB_PER_READ] [numeric](10, 1) NULL,
	[KB_PER_WRITE] [numeric](10, 1) NULL,
	[KB_PER_IO] [numeric](10, 1) NULL,
	[READ_MB] [numeric](10, 2) NULL,
	[WRITE_MB] [numeric](10, 2) NULL,
	[NUM_OF_READS] [bigint] NULL,
	[NUM_OF_WRITES] [bigint] NULL,
	[INTERVAL_MI] [int] NULL,
	[TYPE] [nvarchar](60) NULL,
	[COLLECTION_TIME] [datetime] NOT NULL
) ON [PRIMARY]
GO

Then create the SP dbo.Check_File_IO_Performance:

USE [Admin_DB]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE PROCEDURE [dbo].[Check_File_IO_Performance]
 @IOLATENCY_WARN_LEVEL INT = 500,				--DETERMINES READ OR WRITE LATENCY IN MILLISECOND BEFORE SENDING WARNING EMAIL
 @EMAIL_LIST VARCHAR(100) = 'GEORGE.L@GLINDBA.COM'
AS
BEGIN
SET NOCOUNT ON;

DECLARE @REALSUBJECT VARCHAR(200)
DECLARE @ATTACHED_FILENAME VARCHAR(200)
DECLARE @BODY_TEXT VARCHAR(8000)
DECLARE @WARNING_QUERY VARCHAR(2000)
DECLARE @RECORDS_COUNT INT
DECLARE @RUNTIME DATETIME = GETDATE()

SELECT [DB_NAME]
      ,[FILE_NAME]
      ,[SIZE_GB]
      ,[THROUGHPUT_MBPS]
	  ,CASE
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 128 THEN 100
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  > 128 AND  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 256 THEN 125
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  > 256 AND  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 512 THEN 150
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  > 512 AND  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 1024 THEN 200
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  > 1024 AND  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 2048	THEN 250
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  > 2048 AND  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 4096	THEN 250
			ELSE 250
			END AS [MAX_THROUGHPUT]
      ,[READ_MBPS]
      ,[WRITE_MBPS]
      ,[IOPS]
	  ,CASE
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 128 THEN 500
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  > 128 AND  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 256 THEN 1100
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  > 256 AND  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 512 THEN 2300
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  > 512 AND  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 1024 THEN 5000
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  > 1024 AND  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 2048	THEN 7500
			WHEN  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  > 2048 AND  CAST(SIZE_GB * 8. / 1024 / 1024 AS DECIMAL(12,4))  <= 4096	THEN 7500
			ELSE 7500
			END AS [MAX_IOPS]
      ,[READ_IOPS]
      ,[WRITE_IOPS]
      ,[LATENCY_MS]
      ,[READ_LATENCY_MS]
      ,[WRITE_LATENCY_MS]
      ,[READ_IO_LATENCY_MS]
      ,[WRITE_IO_LATENCY_MS]
      ,[KB_PER_READ]
      ,[KB_PER_WRITE]
      ,[KB_PER_IO]
      ,[READ_MB]
      ,[WRITE_MB]
      ,[NUM_OF_READS]
      ,[NUM_OF_WRITES]
      ,[INTERVAL_MI]
      ,[TYPE]
	  ,@RUNTIME AS [COLLECTION_TIME]
  INTO ##SLOW_IO_FILES
  FROM Admin_DB.[dbo].FN_FILE_STATS(null,null,null)
  WHERE WRITE_LATENCY_MS > @IOLATENCY_WARN_LEVEL OR READ_LATENCY_MS > @IOLATENCY_WARN_LEVEL

SELECT @RECORDS_COUNT=COUNT(*) FROM ##SLOW_IO_FILES

IF  @RECORDS_COUNT != 0
BEGIN
	
INSERT INTO Admin_DB.[DBO].[SLOW_IO_FILES_HISTORY]
           ([DB_NAME]
           ,[FILE_NAME]
           ,[SIZE_GB]
           ,[THROUGHPUT_MBPS]
           ,[MAX_THROUGHPUT]
           ,[READ_MBPS]
           ,[WRITE_MBPS]
           ,[IOPS]
           ,[MAX_IOPS]
           ,[READ_IOPS]
           ,[WRITE_IOPS]
           ,[LATENCY_MS]
           ,[READ_LATENCY_MS]
           ,[WRITE_LATENCY_MS]
           ,[READ_IO_LATENCY_MS]
           ,[WRITE_IO_LATENCY_MS]
           ,[KB_PER_READ]
           ,[KB_PER_WRITE]
           ,[KB_PER_IO]
           ,[READ_MB]
           ,[WRITE_MB]
           ,[NUM_OF_READS]
           ,[NUM_OF_WRITES]
           ,[INTERVAL_MI]
           ,[TYPE]
           ,[COLLECTION_TIME])
     SELECT
           [DB_NAME]
           ,[FILE_NAME]
           ,[SIZE_GB]
           ,[THROUGHPUT_MBPS]
           ,[MAX_THROUGHPUT]
           ,[READ_MBPS]
           ,[WRITE_MBPS]
           ,[IOPS]
           ,[MAX_IOPS]
           ,[READ_IOPS]
           ,[WRITE_IOPS]
           ,[LATENCY_MS]
           ,[READ_LATENCY_MS]
           ,[WRITE_LATENCY_MS]
           ,[READ_IO_LATENCY_MS]
           ,[WRITE_IO_LATENCY_MS]
           ,[KB_PER_READ]
           ,[KB_PER_WRITE]
           ,[KB_PER_IO]
           ,[READ_MB]
           ,[WRITE_MB]
           ,[NUM_OF_READS]
           ,[NUM_OF_WRITES]
           ,[INTERVAL_MI]
           ,[TYPE]
           ,[COLLECTION_TIME]
	FROM ##SLOW_IO_FILES

        SET @WARNING_QUERY='SET NOCOUNT ON;
			SELECT CAST(DB_NAME AS VARCHAR(15)) AS DB_NAME
				  ,CAST(FILE_NAME AS VARCHAR(20)) AS FILE_NAME
				  ,CAST(CAST(SIZE_GB AS INT) AS VARCHAR(7)) AS SIZE_GB
				  ,CAST(THROUGHPUT_MBPS AS VARCHAR(15)) THROUGHPUT_MBPS
				  ,MAX_THROUGHPUT
				  ,CAST(IOPS AS VARCHAR(7)) AS IOPS
				  ,MAX_IOPS
				  ,CAST(READ_IOPS AS VARCHAR(9)) AS READ_IOPS
				  ,CAST(WRITE_IOPS AS VARCHAR(10)) AS WRITE_IOPS
				  ,CAST(LATENCY_MS AS VARCHAR(10)) AS LATENCY_MS
				  ,CAST(READ_LATENCY_MS AS VARCHAR(12)) AS READ_LATENCY_MS
				  ,CAST(WRITE_LATENCY_MS AS VARCHAR(12)) AS WRITE_LATENCY_MS
				  ,CAST(READ_IO_LATENCY_MS AS VARCHAR(10)) AS READ_IO_MS
				  ,CAST(WRITE_IO_LATENCY_MS AS VARCHAR(11)) AS WRITE_IO_MS
				  ,CAST(NUM_OF_READS AS VARCHAR(12)) AS NUM_OF_READS
				  ,CAST(NUM_OF_WRITES AS VARCHAR(13)) AS NUM_OF_WRITES
			  FROM ##SLOW_IO_FILES'

-- PRINT @WARNING_QUERY

        SET @BODY_TEXT='Slow file I/O detected, The threshold set to '+CAST(@IOLATENCY_WARN_LEVEL AS VARCHAR(10))+'ms. Check the file attached for details. The past alerts can be found in this table Admin_DB.DBO.SLOW_IO_FILES_HISTORY'

        SELECT @REALSUBJECT= 'WARNING: FILE I/O LATENCY IS ABOVE ' + CAST(@IOLATENCY_WARN_LEVEL AS VARCHAR(10)) + 'MS ON ' + @@SERVERNAME
        SET @ATTACHED_FILENAME='SLOW_FILE_IO_'+REPLACE(REPLACE(CONVERT(VARCHAR,GETDATE(),120),' ','-'),':','')+'.TXT'
        EXEC MSDB.DBO.SP_SEND_DBMAIL 
        @RECIPIENTS=@EMAIL_LIST,
        @SUBJECT= @REALSUBJECT,
        @BODY=@BODY_TEXT
        ,@QUERY=@WARNING_QUERY
		,@QUERY_RESULT_WIDTH = 2000
        ,@EXECUTE_QUERY_DATABASE='Admin_DB',
        @ATTACH_QUERY_RESULT_AS_FILE=1,
        @QUERY_ATTACHMENT_FILENAME=@ATTACHED_FILENAME,
        @QUERY_RESULT_HEADER=1
		--,@EXCLUDE_QUERY_OUTPUT = 1
END

DROP TABLE ##SLOW_IO_FILES
END

GO

Step 5: Create A SQL Agent job to monitor the real time file IO performance

The job is scheduled to run every 5 minutes and has two steps:


The first Step run this:

EXEC dba_admin.dbo.snapshot_file_stats;
GO

The second step run this:

DECLARE @IOLATENCY_WARN_LEVEL INT = 500 
DECLARE @EMAIL_LIST VARCHAR(100) = 'george.l@glindba.com' 

EXEC dba_admin.[dbo].[Check_File_IO_Performance] @IOLATENCY_WARN_LEVEL=@IOLATENCY_WARN_LEVEL, @EMAIL_LIST=@EMAIL_LIST

Below is the complete script of creating the SQL Agent job:

USE [msdb]
GO

BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0

IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'DBA Maintenance' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'DBA Maintenance'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode =  msdb.dbo.sp_add_job @job_name=N'DBA-Monitor-File-IO-Performance', 
		@enabled=0, 
		@notify_level_eventlog=0, 
		@notify_level_email=0, 
		@notify_level_netsend=0, 
		@notify_level_page=0, 
		@delete_level=0, 
		@description=N'No description available.', 
		@category_name=N'DBA Maintenance', 
		@job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
 
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Take-File-IO-Stats-Snapshot', 
		@step_id=1, 
		@cmdexec_success_code=0, 
		@on_success_action=3, 
		@on_success_step_id=0, 
		@on_fail_action=2, 
		@on_fail_step_id=0, 
		@retry_attempts=0, 
		@retry_interval=0, 
		@os_run_priority=0, @subsystem=N'TSQL', 
		@command=N'EXEC Admin_DB.dbo.snapshot_file_stats;
GO
', 
		@database_name=N'Admin_DB', 
		@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
 
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'CheckFileIOPerformance', 
		@step_id=2, 
		@cmdexec_success_code=0, 
		@on_success_action=1, 
		@on_success_step_id=0, 
		@on_fail_action=2, 
		@on_fail_step_id=0, 
		@retry_attempts=0, 
		@retry_interval=0, 
		@os_run_priority=0, @subsystem=N'TSQL', 
		@command=N'DECLARE @IOLATENCY_WARN_LEVEL INT = 500 
DECLARE @EMAIL_LIST VARCHAR(100) = ''SaaSOpsDatabaseAdmins@pointclickcare.com;george.l@pointclickcare.com'' 

EXEC Admin_DB.[dbo].[Check_File_IO_Performance] @IOLATENCY_WARN_LEVEL=@IOLATENCY_WARN_LEVEL, @EMAIL_LIST=@EMAIL_LIST', 
		@database_name=N'Admin_DB', 
		@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'Daily every 5 minute', 
		@enabled=1, 
		@freq_type=4, 
		@freq_interval=1, 
		@freq_subday_type=4, 
		@freq_subday_interval=5, 
		@freq_relative_interval=0, 
		@freq_recurrence_factor=0, 
		@active_start_date=20200824, 
		@active_end_date=99991231, 
		@active_start_time=0, 
		@active_end_time=235959, 
		@schedule_uid=N'7de0ce66-bfe0-475a-a490-26c3e4441a1c'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
    IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
GO

Note: depending on how often you run the SQL Agent job and the retention period you chose when creating the temporal table, the temporal history table, dbo.IO_VIRTUAL_FILE_STATS_SNAPSHOT_HISTORY, may grow very quickly. It's better to add one more step to the SQL Agent job to purge the old records in the history table.








1 comment

Recent Posts

See All

Azure SQL How To - SQL Managed Instance

Task 1: Copy encryption certificate on SQL Server to SQL MI On the source SQL Server machine, in a query window, run script below to...

1 Comment


Insightful article on SQL assignments! Mastering SQL programming can be challenging, and your comprehensive guide provides valuable insights. For those seeking additional help, I recommend checking out MyAssignmentServices.co.uk for expert SQL Assignment Help. Their assistance has proven invaluable for many.

Like
bottom of page