Tuesday, October 23, 2012

Capture Index Usage Statistics for Better Analysis

Starting with SQL Server 2005, Microsoft introduced Dynamic Management Views to help DBAs see what's going inside of SQL Server.  One of my favorite is the sys.dm_db_index_usage_stats.  It collects statistics for different types of index operations and the timestamp at which it was last performed.   All the DMVs are great tools; however, they all have the same drawback.  They are memory resident, which means the statistics are reset any time the SQL Server service is restarted, the database is attached/detached, or it's restored.

The best way to avoid losing this valuable data is to store it in a persistent table.  I typically store data like this within the msdb database, but you can store in any database.  Below we have created a table DBA.IndexUsageStats to hold each snapshot of the data.


USE msdb;
GO
CREATE SCHEMA dba AUTHORIZATION dbo;
GO
CREATE TABLE dba.IndexUsageStats
  (
     SQLStartTime       DATETIME NULL,
     CaptureTime        DATETIME NULL,
     DatabaseName       NVARCHAR(128) NULL,
     ObjectName         NVARCHAR(128) NULL,
     index_name         NVARCHAR(128) NULL,
     index_type_desc    NVARCHAR(60) NULL,
     index_is_unique    BIT NULL,
     database_id        SMALLINT NULL,
     object_id          INT NULL,
     index_id           INT NULL,
     user_seeks         BIGINT NULL,
     user_scans         BIGINT NULL,
     user_lookups       BIGINT NULL,
     user_updates       BIGINT NULL,
     last_user_seek     DATETIME NULL,
     last_user_scan     DATETIME NULL,
     last_user_lookup   DATETIME NULL,
     last_user_update   DATETIME NULL,
     system_seeks       BIGINT NULL,
     system_scans       BIGINT NULL,
     system_lookups     BIGINT NULL,
     system_updates     BIGINT NULL,
     last_system_seek   DATETIME NULL,
     last_system_scan   DATETIME NULL,
     last_system_lookup DATETIME NULL,
     last_system_update DATETIME NULL
  );
GO

Next we need to create the collector.  This query will pull the current data from the DMV and store it in our table.


DECLARE @CurrentStartTime  AS DATETIME,
        @PreviousStartTime AS DATETIME;

SELECT @CurrentStartTime = sqlserver_start_time
FROM   master.sys.dm_os_sys_info;

USE TestDatabase;

INSERT msdb.DBA.IndexUsageStats
SELECT @CurrentStartTime          AS SQLStartTime,
       CURRENT_TIMESTAMP          AS CaptureTime,
       Db_name()                  AS DatabaseName,
       Object_name(ius.object_id) AS ObjectName,
       i.name                     AS IndexName,
       i.type_desc                AS IndexTypeDesc,
       i.is_unique                AS IndexIsUnique,
       ius.*
FROM   sys.dm_db_index_usage_stats ius
       INNER JOIN sys.indexes i
               ON ius.object_id = i.object_id
                  AND ius.index_id = i.index_id
WHERE  ius.database_id = Db_id()
ORDER  BY ObjectName,
          ius.index_id;
GO

The script can be scheduled with SQL Agent to run on a regular basis.  I prefer a weekly schedule, but any recurring schedule is fine.  If you have more than one user database, then you'll need to run this script for each one.

In the script we capture a bit more than just the DMV data. For example:
  1. SQLStartTime - The time when the SQL Server service started.
  2. CaptureTime - The time when the script captured a snapshot of the DMV data.
  3. DatabaseName, OjbectName, IndexName, IndexTypeDesc, IndexIsUnique are pretty self-explanatory.
Once you have let this job execute over a few weeks, you can then revisit the data and start seeing how your indexes are being utilized over a longer period of time.  This query will aggregate the index statistics using the captured data.


USE msdb;
SELECT Min(SQLStartTime) AS SQLStartTime,
       Max(CaptureTime)  AS CaptureTime,
       ObjectName,
       IndexName,
       IndexTypeDesc,
       index_id,
       Sum(user_seeks)   AS user_seeks,
       Sum(user_scans)   AS user_scans,
       Sum(user_lookups) AS user_lookups,
       Sum(user_updates) AS user_updates
FROM   DBA.IndexUsageStats
WHERE  DatabaseName = 'TestDatabase'
GROUP  BY ObjectName,
          index_name,
          index_type_desc,
          index_id
ORDER  BY ObjectName,
          index_id;
GO

These samples below were taken over a four month period and clearly show how the indexes have been utilized.  It even points out some indexes (IX_ItemCodeID and IX_PurchasedBy) that may be nothing more than overhead and could be candidates for removal.  These two indexes have not been used for any seeks or scans, but SQL still needs to update them every time DML code is applied to the Purchases table.


As you can see, the more data you have to draw a conclusion about your index usage, the better you can make tuning recommendations for each of them.