Collecting Historical Wait Statistics

As a DBA, I’m sure you’ve heard many times to always check the sys.dm_os_wait_stats DMV to help diagnose performance issues on your server. The DMV returns information about specific resources SQL Server had to wait for while processing queries. The counters in the DMV are cumulative since the last time SQL Server was started and the counters can only be reset by a service restart or by using a DBCC command. Since DMVs don’t persist their data beyond a service restart, we need to come up with a way to collect this data and be able to run trending reports over time.

Collecting the data seems easy enough by simply selecting all rows into a permanent table. However, that raw data won’t help us determine the time in which a particular wait type occurred. Think about it for a minute. If the raw data for the counters is cumulative, then how can you tell if a bunch of waits occurred within a span of a few minutes or if they occurred slowly over the past 6 months that SQL Server has been running. This is where we need to collect the data in increments.

First, we need to create a history table to store the data. The table will store the wait stat values as well as the difference (TimeDiff_ms, WaitingTasksCountDiff, WaitTimeDiff_ms, SignalWaitTimeDiff_ms) in those values between collection times.

CREATE TABLE dbo.WaitStatsHistory
(
SqlServerStartTime DATETIME NOT NULL
,CollectionTime DATETIME NOT NULL
,TimeDiff_ms INT NOT NULL
,WaitType NVARCHAR(60) NOT NULL
,WaitingTasksCountCumulative BIGINT NOT NULL
,WaitingTasksCountDiff INT NOT NULL
,WaitTimeCumulative_ms BIGINT NOT NULL
,WaitTimeDiff_ms INT NOT NULL
,MaxWaitTime_ms BIGINT NOT NULL
,SignalWaitTimeCumulative_ms BIGINT NOT NULL
,SignalWaitTimeDiff_ms INT NOT NULL
,CONSTRAINT PK_WaitStatsHistory PRIMARY KEY CLUSTERED (CollectionTime, WaitType)
)WITH (DATA_COMPRESSION = PAGE);
GO

Next, we need to get a couple of timestamps when we collect each sample. The first will be the SQL Server start time. We need the SQL Server start time, so we can identify when the service was restarted.

SELECT @CurrentSqlServerStartTime = sqlserver_start_time FROM sys.dm_os_sys_info;
GO

The second set is the previous start time and previous collection time, if they exist in the history table.

SELECT
@PreviousSqlServerStartTime = MAX(SqlServerStartTime)
,@PreviousCollectionTime = MAX(CollectionTime)
FROM msdb.MSDBA.WaitStatsHistory;
GO

The last timestamp is the collection time. We’ll also use this timestamp to calculate the difference in wait stat values between each collection.

SELECT GETDATE() AS 'CollectionTime',* FROM sys.dm_os_wait_stats;
GO

We need to compare the current SQL Server start time to the previous start time from the history table. If they don’t equal, then we assume the server was restarted and insert “starter” values. I call them starter values, because we just collect the current wait stat values and insert 0 for each of the diff columns.

IF @CurrentSqlServerStartTime <> ISNULL(@PreviousSqlServerStartTime,0)
BEGIN
-- Insert starter values if SQL Server has been recently restarted
INSERT INTO dbo.WaitStatsHistory
SELECT
@CurrentSqlServerStartTime
,GETDATE()
,DATEDIFF(MS,@CurrentSqlServerStartTime,GETDATE())
,wait_type
,waiting_tasks_count
,0
,wait_time_ms
,0
,max_wait_time_ms
,signal_wait_time_ms
,0
FROM sys.dm_os_wait_stats;
END
GO

If the timestamps are the same, we will collect the current wait stats and calculate the difference (in milliseconds) in collection time as well as the difference in values.

INSERT msdb.MSDBA.WaitStatsHistory
SELECT
@CurrentSqlServerStartTime
,cws.CollectionTime
,DATEDIFF(MS,@PreviousCollectionTime,cws.CollectionTime)
,cws.wait_type
,cws.waiting_tasks_count
,cws.waiting_tasks_count - hist.WaitingTasksCountCumulative
,cws.wait_time_ms
,cws.wait_time_ms - hist.WaitTimeCumulative_ms
,cws.max_wait_time_ms
,cws.signal_wait_time_ms
,cws.signal_wait_time_ms - hist.SignalWaitTimeCumulative_ms
FROM CurrentWaitStats cws INNER JOIN MSDBA.WaitStatsHistory hist
ON cws.wait_type = hist.WaitType
AND hist.CollectionTime = @PreviousCollectionTime;
GO

You could filter the collection to only specific wait stat counters that you want to track by just a where clause, but I prefer to collect them all and then filter at the reporting end.

At this point, we’re ready to schedule the job. The script could be run at any interval, but I usually leave it to collect data once a day. If I notice a spike in a specific wait stat counter, then I could easily increase the job frequency to once every few hours or even once an hour. Having those smaller, more granular data samples will allow us to isolate which time frame we need to concentrate on.

For example, if we notice the CXPACKET wait suddenly spikes when collecting the data each day, then we could schedule the collection every hour to see if it’s happening during a specific window.

SELECT * FROM msdb.MSDBA.WaitStatsHistory
WHERE WaitType = 'CXPACKET';
GO

Finally, we can use Excel to format this raw data into an easy to read chart.

From this chart, we can see at 5PM there was a spike in CXPACKET waits, but a low number of tasks that waited. In this case, I would be assume there is a single process running in parallel that caused these waits and from there I could dig further into finding the individual query.
Data compression is enabled on this table to help keep it small. It can easily turn it off for the table by removing WITH (DATA_COMPRESSION = PAGE) from the CREATE TABLE statement. However, with page compression enabled, 24 collections (one per hour) only takes up 775KB of space. Without compression, the same sample of data consumes about 2.2MB. If you plan to keep a lot of history, then it’s best to leave page compression enabled.
Hopefully, this script will help you keep track of your historical wait statistics, so you can have better knowledge of what has happened to your environment over time. The entire script is posted below. If you want to read further into what each wait statistics means, then check out Paul Randal’s article about wait stats. Additionally, if you want more info on using DMVs, check out Glenn Berry’s diagnostic queries.
/***********************************************
Create the historical table
***********************************************/

USE msdb;
GO

-- Create the history table if it does not exist
IF OBJECT_ID('dbo.WaitStatsHistory') IS NULL
BEGIN
CREATE TABLE dbo.WaitStatsHistory
(
SqlServerStartTime DATETIME NOT NULL
,CollectionTime DATETIME NOT NULL
,TimeDiff_ms INT NOT NULL
,WaitType NVARCHAR(60) NOT NULL
,WaitingTasksCountCumulative BIGINT NOT NULL
,WaitingTasksCountDiff INT NOT NULL
,WaitTimeCumulative_ms BIGINT NOT NULL
,WaitTimeDiff_ms INT NOT NULL
,MaxWaitTime_ms BIGINT NOT NULL
,SignalWaitTimeCumulative_ms BIGINT NOT NULL
,SignalWaitTimeDiff_ms INT NOT NULL
,CONSTRAINT PK_WaitStatsHistory PRIMARY KEY CLUSTERED (CollectionTime, WaitType)
)WITH (DATA_COMPRESSION = PAGE);
END
GO

/***********************************************
Schedule this section as an on-going job
***********************************************/

DECLARE
@CurrentSqlServerStartTime DATETIME
,@PreviousSqlServerStartTime DATETIME
,@PreviousCollectionTime DATETIME;

SELECT @CurrentSqlServerStartTime = sqlserver_start_time FROM sys.dm_os_sys_info;

-- Get the last collection time
SELECT
@PreviousSqlServerStartTime = MAX(SqlServerStartTime)
,@PreviousCollectionTime = MAX(CollectionTime)
FROM msdb.dbo.WaitStatsHistory;

IF @CurrentSqlServerStartTime <> ISNULL(@PreviousSqlServerStartTime,0)
BEGIN
-- Insert starter values if SQL Server has been recently restarted
INSERT INTO dbo.WaitStatsHistory
SELECT
@CurrentSqlServerStartTime
,GETDATE()
,DATEDIFF(MS,@CurrentSqlServerStartTime,GETDATE())
,wait_type
,waiting_tasks_count
,0
,wait_time_ms
,0
,max_wait_time_ms
,signal_wait_time_ms
,0
FROM sys.dm_os_wait_stats;
END
ELSE
BEGIN
-- Get the current wait stats
WITH CurrentWaitStats AS
(
SELECT GETDATE() AS 'CollectionTime',* FROM sys.dm_os_wait_stats
)
-- Insert the diff values into the history table
INSERT msdb.dbo.WaitStatsHistory
SELECT
@CurrentSqlServerStartTime
,cws.CollectionTime
,DATEDIFF(MS,@PreviousCollectionTime,cws.CollectionTime)
,cws.wait_type
,cws.waiting_tasks_count
,cws.waiting_tasks_count - hist.WaitingTasksCountCumulative
,cws.wait_time_ms
,cws.wait_time_ms - hist.WaitTimeCumulative_ms
,cws.max_wait_time_ms
,cws.signal_wait_time_ms
,cws.signal_wait_time_ms - hist.SignalWaitTimeCumulative_ms
FROM CurrentWaitStats cws INNER JOIN dbo.WaitStatsHistory hist
ON cws.wait_type = hist.WaitType
AND hist.CollectionTime = @PreviousCollectionTime;
END
GO
Share