Category Archives: TSQLTuesday

Oh, The Free Stuff!

Ah, ShwagTSQL2sDay150x150[4]

 

I’ve been doing the whole conference thing for a while. Not as long as some but quite a while. Along the way I’ve collected quite a bit of stuff. I’ve also given away most of it to the local user group too.

 

Little Things

I always keep a few little things like ducks and note books.

SAM_0457

I also keep a few of the drink-ware items too.

SAM_0466

Wearable Stuff

Oh the humanity, the amount of clothing I have from conferences is just crazy. I give away quite a bit but always try to keep one shirt or item from everywhere I go.

SAM_0465

Carry All Your Stuff

With all those shirts I need all these bags to carry them! I’ve kept some of the bags I’ve been given over the years. My two favorite are the Idera bag I got from the ACE program (which I carry right now) and the SQL Pass summit 2003 bag, which gave up the ghost two years ago when all the stitching for the bottom finally gave way.

SAM_0456

Valuable Stuff

In 2004-05 when Microsoft REALLY loved SQL Server they were giving away crazy amounts of software. I gave away over one hundred copies of SQL Server 2005 Developer edition and around a dozen Windows 2003 Enterprise and SQL Server 2005 Standard edition copies too.

SAM_0458

 

Special Stuff, Never To Come Again

I also had the privilege of attending the last great launch event for SQL Server 2005 in San Francisco.

Found This in my drink:

SAM_0460

Oh, more software!

SAM_0459

Ok, I’m impressed now.

SAM_0461

Lastly,

The greatest thing I get from each and every one of these events is meeting people like you! See you at the PASS Summit.

 

SQLDIY: Gather Virtual File Statistics Using T-SQL #TSQL2sDay 15

It’s that time again, T-SQL Tuesday is here! This time Pat Wright (blog|twitter) is hosting and has put forth automating tasks using ether T-SQL or Powershell. I LOVE automating stuff. As a production DBA in some very large shops you can’t do your job unless you make your servers work for you. I’ve been using T-SQL and *GASP* xp_cmdshell, OSQL and file shares to gather stats and push configurations to servers for decades. Log before fancy things like C# and Powershell existed. These days I use a variety of home grown tools but doing things with just T-SQL can be just as powerful. I’m going to leverage this post to start a series on a pure T-SQL implementation of configuration management, data gathering and utility procedure deployment.

Where Is Your Management Database?

Every DBA should have two things, a utility, or management database on every server and a central repository where all the locally collected data in the management database is pulled back to. What surprises people when I talk about this methodology is I don’t always advocate using a licensed copy of SQL Server. By that I mean that SQL Express works just great as a central repository server. I usually put this on a virtual machine, Hyper-V or what ever flavor of virtual environment your company supports. This allows you to do things like enable CLR and xp_cmdshell on a server that technically is non-production and keep your security risk much lower. Every server that is deployed in my shop gets a local management database. From the central repository I can push out everything I need to manage a server and keep all my servers up to date as I make improvements or bug fixes to the management code. That’s all I’m really going to say about that in this post though I just wanted to give you an idea of just how deep the rabbit hole can go.

DMV’s give you the information, you have to use it.

Since SQL Sever 2005 Microsoft let of of the black box mentality and started providing crazy useful information via Dynamic Management Views. Virtual file statistics though have been around for quite a while. They got a touch up in the DMV but the basic information was available in SQL Server 2000 via function call.  The DMV I’m after is sys.dm_io_virtual_file_stats. It has a ton of information in it. It’s main problem though is it is an aggregation over time and doesn’t really tell you what is different from yesterday to today. To get around that we have to build our own sampling routine.

The VirtualFileStats Table

We create a table in our local management database to collect the daily numbers. I try to keep things simple.


CREATE TABLE dbo.VirtualFileStats
  (
     RecordID                             INT IDENTITY(1,1) NOT NULL,
     ServerName                           VARCHAR(255) NOT NULL,
     DBID                                 INT NOT NULL,
     FileID                               INT NOT NULL,
     Reads                                BIGINT NULL,
     ReadsFromStart                       BIGINT NULL,
     Writes                               BIGINT NULL,
     WritesFromStart                      BIGINT NULL,
     BytesRead                            BIGINT NULL,
     BytesReadFromStart                   BIGINT NULL,
     BytesWritten                         BIGINT NULL,
     BytesWrittenFromStart                BIGINT NULL,
     IostallInMilliseconds                BIGINT NULL,
     IostallInMillisecondsFromStart       BIGINT NULL,
     IostallReadsInMilliseconds           BIGINT NULL,
     IostallReadsInMillisecondsFromStart  BIGINT NULL,
     IostallWritesInMilliseconds          BIGINT NULL,
     IostallWritesInMillisecondsFromStart BIGINT NULL,
     RecordedDateTime                     DATETIME NULL,
     IntervalInMilliseconds               BIGINT NULL,
     FirstMeasureFromStart                BIT NULL
  )
GO
CREATE TABLE dbo.VirtualFileStatsHistory
  (
     RecordID                             INT NOT NULL,
     ServerName                           VARCHAR(255) NOT NULL,
     DBID                                 INT NOT NULL,
     FileID                               INT NOT NULL,
     Reads                                BIGINT NULL,
     ReadsFromStart                       BIGINT NULL,
     Writes                               BIGINT NULL,
     WritesFromStart                      BIGINT NULL,
     BytesRead                            BIGINT NULL,
     BytesReadFromStart                   BIGINT NULL,
     BytesWritten                         BIGINT NULL,
     BytesWrittenFromStart                BIGINT NULL,
     IostallInMilliseconds                BIGINT NULL,
     IostallInMillisecondsFromStart       BIGINT NULL,
     IostallReadsInMilliseconds           BIGINT NULL,
     IostallReadsInMillisecondsFromStart  BIGINT NULL,
     IostallWritesInMilliseconds          BIGINT NULL,
     IostallWritesInMillisecondsFromStart BIGINT NULL,
     RecordedDateTime                     DATETIME NULL,
     IntervalInMilliseconds               BIGINT NULL,
     FirstMeasureFromStart                BIT NULL
  )

This is what we need to gather, and later analyze the data. Since we are managing our samples we have to know when the sampling started and what the first sample is. FirstMeasureFromStart lets us know that it is the first base measurements the rest of the samples will delta off of.

GatherVirtualFileStats Stored Procedure

Next we need a stored procedure to do the sampling. One thing you will notice is the procedure executes continuously with a WAIT FOR DELAY built into it so you can get finer grained than the 1 minute limitation of the SQL Agent. Sometimes, I will do one off sampling for a short period, say 30 minutes at a 30 second interval but most often I just let it run and set the sample rate at 1 minute or larger depending on how busy the system is.


IF EXISTS (SELECT
             *
           FROM
             dbo.sysobjects
           WHERE
            id = Object_id(N'[dbo].[GatherVirtualFileStats]')
            AND Objectproperty(id, N'IsProcedure') = 1)
  DROP PROCEDURE [dbo].[GatherVirtualFileStats]

GO

--------------------------------------------------------------------------------------
--  GatherVirtualFileStats
--  by: Wesley D. Brown
--  date: 02/08/2011
--  mod:  04/14/2011
--  mod:  04/17/2013

--  description:
--	This stored procedure is used to sample sys.dm_io_virtual_file_stats to track
--	performance at a database file level. This is useful for finding
--	hotspots on SAN's or under performing IO systems.

--  parameters:
--    @Duration	 = '01:00:00' How long to run before exiting
--   @IntervalInSeconds = 120 Number of seconds between samples
--@DB			 = -1 DB_ID to monitor, -1 for all
--@DBFile		 = -1 File_ID of file to monitor, -1 for all
--  usage:
--  	DECLARE @RC         INT,
--			@StartTime  DATETIME,
--			@databaseID INT

--	SELECT @StartTime = Getdate(),
--		   @databaseID = Db_id()

--	EXEC @RC = Gathervirtualfilestats
--	  '00:45:30',
--	  30,
--	  10,
--	  -1

--	SELECT *
--	FROM   dbo.VirtualFileStats
--	WHERE  DBID = 10
--	ORDER  BY RecordID

--  platforms:
--  SQL Server 2005
--  SQL Server 2008
--  SQL Server 2008 R2
--  SQL Server 2012
--  tested:
--  SQL Server 2005 SP2
--  SQL Server 2012
---------------------------------------------------------------------------------------
--  *** change log		***
--	Added history table and perge on start up if there is data in the main table
--  *** end change log	***
-------------------------------------------------------------------------------------
CREATE PROC dbo.Gathervirtualfilestats
  @Duration          DATETIME = '01:00:00',
  @IntervalInSeconds INT = 120,
  @DB                INT = -1,
  @DBFile            INT = -1
AS
  SET nocount ON

  DECLARE
    @StopTime                 DATETIME,
    @LastRecordedDateTime     DATETIME,
    @CurrentDateTime          DATETIME,
    @ErrorNumber              INT,
    @NumberOfRows             INT,
    @ErrorMessageText         NVARCHAR(4000),
    @CurrentServerName        VARCHAR(255),
    @DifferenceInMilliSeconds BIGINT,
	@SQLVersion				  VARCHAR(50)

	select @SQLVersion =
		substring(version_number,1,charindex(' ',version_number))
		from
		(
		select
			substring(version,charindex('-',version)+2, charindex('(',version)-patindex('%.[0-9]',version)) as version_number from
			(
				select @@version as version
			) as t
	) as t

	if (charindex('11.',@SQLVersion,0) > 0)
	BEGIN
		IF (@DB = -1)
		BEGIN
			set @DB = NULL
		END
		IF (@DBFile = -1)
		BEGIN
			set @DBFile = NULL
		END
	END

  IF EXISTS (SELECT
               1
             FROM
               dbo.VirtualFileStats)
    BEGIN
        IF EXISTS (SELECT
                     *
                   FROM
                     dbo.sysobjects
                   WHERE
                    id = Object_id(N'[dbo].[VirtualFileStats]')
                    AND Objectproperty(id, N'IsTable') = 1)
          BEGIN
              INSERT INTO dbo.VirtualFileStatsHistory
              SELECT
                *
              FROM
                VirtualFileStats;

              TRUNCATE TABLE dbo.VirtualFileStats;
          END
    END

  SELECT
    @CurrentServerName = Cast(Serverproperty('servername') AS VARCHAR(255))

  SET @DifferenceInMilliSeconds = Datediff(ms, CONVERT(DATETIME, '00:00:00', 8), @Duration)

  SELECT
    @StopTime = Dateadd(ms, @DifferenceInMilliSeconds, Getdate())

  WHILE Getdate() <= @StopTime
    BEGIN
        SELECT
          @LastRecordedDateTime = @CurrentDateTime

        SELECT
          @CurrentDateTime = Getdate()

        INSERT INTO dbo.VirtualFileStats
                    (ServerName,
                     DBID,
                     FileID,
                     Reads,
                     ReadsFromStart,
                     Writes,
                     WritesFromStart,
                     BytesRead,
                     BytesReadFromStart,
                     BytesWritten,
                     BytesWrittenFromStart,
                     IostallInMilliseconds,
                     IostallInMillisecondsFromStart,
                     IostallReadsInMilliseconds,
                     IostallReadsInMillisecondsFromStart,
                     IostallWritesInMilliseconds,
                     IostallWritesInMillisecondsFromStart,
                     RecordedDateTime,
                     IntervalinMilliseconds,
                     FirstMeasureFromStart)
        SELECT
          @CurrentServerName,
          vfs.database_id,
          vfs.[file_id],
          vfs.num_of_reads - dbaf.ReadsFromStart                            AS Reads,
          vfs.num_of_reads                                                  AS ReadsFromStart,
          vfs.num_of_writes - dbaf.WritesFromStart                          AS Writes,
          vfs.num_of_writes                                                 AS WritesFromStart,
          vfs.num_of_bytes_read - dbaf.BytesReadFromStart                   AS BytesRead,
          vfs.num_of_bytes_read                                             AS BytesReadFromStart,
          vfs.num_of_bytes_written - dbaf.BytesWrittenFromStart             AS BytesWritten,
          vfs.num_of_bytes_written                                          AS BytesWrittenFromStart,
          vfs.io_stall - dbaf.IostallInMillisecondsFromStart                AS IostallInMilliseconds,
          vfs.io_stall                                                      AS IostallInMillisecondsFromStart,
          vfs.io_stall_read_ms - dbaf.IostallReadsInMillisecondsFromStart   AS IostallReadsInMilliseconds,
          vfs.io_stall_read_ms                                              AS IostallReadsInMillisecondsFromStart,
          vfs.io_stall_write_ms - dbaf.IostallWritesInMillisecondsFromStart AS IostallWritesInMilliseconds,
          vfs.io_stall_write_ms                                             AS IostallWritesInMillisecondsFromStart,
          @CurrentDateTime,
          CASE
            WHEN @LastRecordedDateTime IS NULL THEN NULL
            ELSE Datediff(ms, dbaf.RecordedDateTime, @CurrentDateTime)
          END                                                               AS IntervalInMilliseconds,
          CASE
            WHEN @LastRecordedDateTime IS NULL THEN 1
            ELSE 0
          END                                                               AS FirstMeasureFromStart
        FROM
          sys.Dm_io_virtual_file_stats(@DB, @DBFile) vfs
        LEFT OUTER JOIN VirtualFileStats dbaf
          ON vfs.database_id = dbaf.dbid
             AND vfs.[file_id] = dbaf.fileid
        WHERE
          ( @LastRecordedDateTime IS NULL
             OR dbaf.RecordedDateTime = @LastRecordedDateTime )

        SELECT
          @ErrorNumber = @@ERROR,
          @NumberOfRows = @@ROWCOUNT

        IF @ErrorNumber != 0
          BEGIN
              SET @ErrorMessageText = 'Error ' + CONVERT(VARCHAR(10), @ErrorNumber) + ' failed to insert file stats data!'

              RAISERROR (@ErrorMessageText,
                         16,
                         1) WITH LOG

              RETURN @ErrorNumber
          END

        WAITFOR DELAY @IntervalInSeconds
    END

I Have Data, Now What?

This is where the fun begins! If you just query the table the data doesn’t make much sense.

image

We need to do some simple math and get the data into a metric that is meaningful.


SELECT TOP 10 Db_name(dbid)                                          AS 'databasename',
              File_name(fileid)                                      AS 'filename',
              Reads / ( IntervalInMilliSeconds / 1000 )              AS 'readspersecond',
              Writes / ( IntervalInMilliSeconds / 1000 )             AS 'writespersecond',
              ( Reads + Writes ) / ( IntervalInMilliSeconds / 1000 ) AS 'iopersecond',
              CASE
                WHEN ( Reads / ( IntervalInMilliSeconds / 1000 ) ) > 0
                     AND IostallReadsInMilliseconds > 0 THEN IostallReadsInMilliseconds / Reads
                ELSE 0
              END                                                    AS 'iolatencyreads',
              CASE
                WHEN ( Reads / ( IntervalInMilliSeconds / 1000 ) ) > 0
                     AND IostallWritesInMilliseconds > 0 THEN IostallWritesInMilliseconds / Writes
                ELSE 0
              END                                                    AS 'iolatencywrites',
              CASE
                WHEN ( ( Reads + Writes ) / ( IntervalInMilliSeconds / 1000 ) > 0
                       AND IostallInMilliseconds &gt; 0 ) THEN IostallInMilliseconds / ( Reads + Writes )
                ELSE 0
              END                                                    AS 'iolatency',
              RecordedDateTime
FROM   management.dbo.VirtualFileStats
WHERE  DBID = 10
       AND FirstMeasureFromStart = 0
ORDER  BY RecordID

This gives us reads, writes and io latency per second results.

image

Now we are cooking! We can now see that on this database we are seeing some spikes in latency, the number of milliseconds it takes to complete a single IO request, and may warrant investigation. As a general rule of thumb if I see IO latency above 20 milliseconds consistently I start looking deeper into the IO system to see what is wrong. A single modern hard disk is capable of 130 random IO’s a second. Another thing to consider is how many databases are on the same disk. I will pull the database files together with the volume they are on to get a true aggregate number of IO’s and latencies. You may find that a single database is dominating the disks and causing other databases to slow down even when the number of IO’s for those databases is small. On a SAN being able to get to the file level can help you locate the physical disks for that LUN and help your SAN administrators look at a very specific spot instead of the overall health of the SAN and actually fix your performance issues.

Lastly, I run a nightly job that performs these aggregations for me and moves them into a table that I keep long term so I can see performance over time. This is a great way to see if you are getting near your IO capacity and if the steps you have done correct the issues.

So, That’s it?

Not by a long shot. Measuring IO latencies is just one piece of a much larger puzzle for troubleshooting performance problems. It is a valuable one though, and keeping this information over time is just priceless. Now that you have the tools to gather virtual file statistics on your SQL Server instance I’d love to hear how you are using the data.

Update 4/14/2011

I added a history table an a check at the beginning to move all data to history before starting up. Mike Kaplan reported below that multiple runs caused issues if there was data in the table from a previous run.

Update 4/17/2013

Added a check for the version of SQL Server. In 2012 they changed the syntax so calling the dmv with a -1,-1 simply gives you no data. Previously, it returned all database files.

A New Year, Time To Get My Learn On!

altMy friend, and newly minted MVP, Jen McCown (blog|twitter) is hosting  T-SQL Tuesday this month. I normally don’t do the T-SQL Tuesday, but this is a post about resolutions so I am resolving to do more of them in the new year. I’ve always considered myself a core engine kind of guy. Focused on things like I/O, on disk structures and the optimizer. As SQL Server grows it has become harder and harder to keep up with the Joneses’ (Steve?). Like every other DBA that earned their salt in the mines of 6.0 through 2000 I’ve got a ton of tools I’ve written myself to monitor every aspect of SQL Server. I’ve also not kept up as well with other technologies like replication that have an impact on my day to day life at work. So, with all that in mind this is my list of technical things to “get good at” this year.

SQL Server Stuff

1. Partitioning
I’ve worked on and off with partitioning since SQL Server 2000 and have a solid grasp of how it works and what the benefits are, just not at a deep enough level. I found myself telling someone that was just flat wrong a few weeks ago about partitioning in SQL Server 2008 R2. It wasn’t a huge deal but if I’m spreading misinformation I’m doing my community and myself harm.

2. Change Data Capture
Again, I’m using a throw back from the good old days. Using triggers to capture change data and insert that into history tables. It’s time to get rid of this performance robber and move to something a tad more modern. I sat in a couple of sessions at PASS and think if I can rap my head around it we could see a large improvement in performance.

3. Replication
I feel pretty good about my replication skills but it is becoming very important to our infrastructure and I need to make sure that I’m not making assumptions that aren’t true anymore and that I’ve configured my environment as best as I can. I sat in on Kendal Van Dyke’s (blog|twitter) replication at PASS and came away with a couple of things that I need to do and some additional things to research.

4. Analysis Services Administration
We are moving full boar into the BI stack. It isn’t my job to write MDX but I have to make sure that the performance from an infrastructure level is good and that we are prepared to recover if the need arises.

.Net Programming Stuff

1.LINQ
I do some stuff with LINQ including LINQ to SQL but I need to get better at it. Just trying to keep my mediocre c# skills from falling off completely.

2.Parallel Programming in .Net
I write a ton of multi-threaded stuff and have built up my own threading frameworks over the years. Microsoft is making a big push to make parallel programming easier. With PLINQ and the new Async framework that is in development right now. It isn’t a part of my core job skills but it is something I enjoy and use quite a bit.

3.Entity Framework
I don’t have to like it but it is everywhere I turn these days. I do get questions as well on how to optimize the database and without knowing exactly what is going on under the covers it limits my ability to troubleshoot issues.

90 Degrees From Center

1. Get Better With Python
I do more than Microsoft stuff and still enjoy learning new things. I recently picked up on Python and look to keep growing my skillset with 2.7 and 3.x lines. It is a fun language to program in. If you dig on easy to read and easy to write Python has that in spades.

2. Pick Up Lisp again
You heard me. Lisp the grandfather of modern functional languages. I worked with Lisp about a million years ago it seems. Apparently, it is back in vogue with several dialects floating around and quite a robust community supporting them. I can get a bit religious at times but passionate people make for a culture of innovation.