Author Archives: Wes Brown

About Wes Brown

Wes Brown is Vice President at Cantata Health and maintains his blog at http://www.sqlserverio.com. His previous experience includes Product Manager for SQL Litespeed at Quest Software and consultant to Fortune 500 companies. He specializes in high availability, disaster recovery, and very large database performance tuning. He is a frequent speaker at local user groups and SQLSaturdays.

Idera SQL Doctor First Look

I recently saw a tweet about Idera's new SQL Doctor tool, which is currently in beta. It differs from other tools you may think of, like Diagnostic Manager. DM and tools like it gather some of the same information but are geared toward real-time alerts. SQL Doctor is more like the Best Practices Analyzer: it takes a look at several key points on your SQL Server instance and the OS, then makes recommendations on how to fix any problems it identifies. You may ask, does this do anything I can't do on my own? No, it doesn't. If you are doing this in house with your own tool set, you may not need this or any other tool like it. To be honest, you aren't the target audience for SQL Doctor. It is really designed for shops that either don't have enough DBAs to watch everything or don't have a full-time DBA at all. That isn't to say it has no use even in a highly monitored and optimized shop. I picked a horribly abused development box and let SQL Doctor tell me just how bad things were.

[Screenshot: Idera_sql_doctor1]

The interface is clean and simple. The first thing we have to do is pick a server and connect.

[Screenshot: Idera_sql_doctor2]

Next it starts the interview process. It isn't extremely in-depth, and it isn't supposed to be.

[Screenshot: Idera_sql_doctor3]

I would be a little worried if I didn't know whether the server was in production or not. But the person running SQL Doctor may not be the DBA.

[Screenshot: Idera_sql_doctor4]

I'm sure they wrestled with this one. How do you describe an OLTP workload vs. an OLAP workload if the person running the tool doesn't even know whether the server is in production?

[Screenshot: Idera_sql_doctor5]

That's pretty much all the questions. You can change some things in the settings tab, like which databases to analyze.

[Screenshot: Idera_sql_doctor6]

I'd say the time estimate can be a bit optimistic: it took over an hour to run on this box. The progress bar is a tad misleading as well, since it spent the bulk of that time sitting at 99% complete.

[Screenshot: Idera_sql_doctor7]

Looks scary! Some of these problems are. They do rank the findings by severity and push the less likely stuff to the bottom for you. It is a lot of information to look at though. 600 recommendations, wow. Lots of index issues and query issues. Time to dig in.

[Screenshot: Idera_sql_doctor8]

They do a good job of grouping issues together. Like other tools in this category, sometimes the recommendations aren't 100% on target. They throw disk queue length around quite a bit but not disk latency, which is really what we are worried about here. It does pick up some pretty nifty stuff that other tools I have looked at don't really catch. It warned me about a NIC setting being optimized for file sharing and that it could put memory pressure on the box. It also hit on some other fun stuff at the OS level for memory, like lock pages in memory, that I bet plenty of folks may not be using. I was pretty happy with the level of OS recommendations overall. I am concerned with some of the recommendations that seem to be out of date. It recommended a file per CPU core on tempdb, for example. I would recommend that they get a few more SQL Server folks to look at the recommendations and submit tweaks to them.

It will also script the SQL Server changes for you or point you to Microsoft KB articles to get them fixed up. Again, my worry with this kind of tool is that someone will just blanket-run every recommendation, and some of them, like disabling the CLR, could be detrimental if that feature is on for a reason. I know it's hard to get everyone to read the warnings. Let's be honest though, if they don't know whether it is a production OLTP system, they probably don't know whether the CLR is on for a reason either.

I have to say, for a beta and for a tool of this type, I am impressed. With a little extra work I think it will be a worthy addition to any shop that is lacking in deep SQL Server expertise.

At The End of the IO Road With C#

Previously, I've written about doing fun IO stuff in C#. I found out that some of my old tricks still worked in C#, but…

Having done a lot of C++, I knew about buffered and un-buffered async IO and could have made unmanaged code calls to open or create the file and pass the handle back. But, just like it sounds, that is a pain to set up, and if you are going down that path you might as well code it all up in C++ anyway.

I was mostly right. I have been working on a file sync tool for managing all my SQL Server backup files. Naturally, I wanted it to be as fast as humanly possible. Wanting that speed and getting it from the CLR are two completely different things. I know how to do asynchronous IO, and with a little trick, you can do un-buffered IO as well. The really crappy part is you can't do both in the CLR.

From my previous post, you know that SQL Server does asynchronous, un-buffered IO on reads and writes. The CLR allows you to do asynchronous reads with a fun bit of coding and a callback structure. I took this code from one of the best papers on C# and IO: Sequential File Programming Patterns and Performance with .NET. I made some minor changes and cleaned up the code a bit.

using System;
using System.IO;
using System.Threading;

internal class AsyncFileCopy
    {
        // globals
        private const int Buffers = 8; // number of outstanding requests
        private const int BufferSize = 8*1024*1024; // request size, eight megabytes
        public static FileStream Source; // source file stream
        public static FileStream Target; // target file stream
        public static long TotalBytes; // total bytes to process    
        public static long BytesRead; // bytes read so far    
        public static long BytesWritten; // bytes written so far
        public static long Pending; // number of I/O's in flight
        public static Object WriteCountMutex = new Object(); // mutex to protect counts
        // Array of buffers and async results.  
        public static AsyncRequestState[] Request = new AsyncRequestState[Buffers];

        public static void AsyncBufferedFileCopy(string inputfile, string outputfile)
        {
            Source = new FileStream(inputfile, // open source file
                                    FileMode.Open, // for read
                                    FileAccess.Read, //
                                    FileShare.Read, // allow other readers
                                    BufferSize, // buffer size
                                    FileOptions.Asynchronous); // use async
            Target = new FileStream(outputfile, // create target file
                                    FileMode.Create, // create, or overwrite if it exists
                                    FileAccess.Write, // will write the file
                                    FileShare.None, // exclusive access
                                    BufferSize, // buffer size
                                    FileOptions.Asynchronous); // buffered async
            TotalBytes = Source.Length; // Size of source file
            Target.SetLength(TotalBytes); // set target file length up front to avoid growth during writes
            var writeCompleteCallback = new AsyncCallback(WriteCompleteCallback);
            for (int i = 0; i < Buffers; i++) Request[i] = new AsyncRequestState(i);
            // launch initial async reads
            for (int i = 0; i < Buffers; i++)
            {
                // no callback on reads.                     
                Request[i].ReadAsyncResult = Source.BeginRead(Request[i].Buffer, 0, BufferSize, null, i);
                Request[i].ReadLaunched.Set(); // say that read is launched
            }
            // wait for the reads to complete in order, process buffer and then write it. 
            for (int i = 0; (BytesRead < TotalBytes); i = (i + 1)%Buffers)
            {
                Request[i].ReadLaunched.WaitOne(); // wait for flag that says buffer is reading
                int bytes = Source.EndRead(Request[i].ReadAsyncResult); // wait for read complete
                BytesRead += bytes; // process the buffer <your code goes here>
                Interlocked.Increment(ref Pending); // track the write we are about to issue
                Target.BeginWrite(Request[i].Buffer, 0, bytes, writeCompleteCallback, i); // write it
            } // end of reader loop
            while (Pending > 0) Thread.Sleep(10); // wait for all the writes to complete                 
            Source.Close();
            Target.Close(); // close the files                     
        }

        // Asynchronous callback: completes the write and issues the next read.
        // (AsyncRequestState, the structure that holds each request's buffer
        // and result, is declared as a nested type below.)
        public static void WriteCompleteCallback(IAsyncResult ar)
        {
            lock (WriteCountMutex)
            {
                // protect the shared variables
                int i = Convert.ToInt32(ar.AsyncState); // get request index
                Target.EndWrite(ar); // mark the write complete
                Interlocked.Decrement(ref Pending); // one less write in flight
                BytesWritten += BufferSize; // advance bytes written
                Request[i].BufferOffset += Buffers*BufferSize; // stride to next slot 
                if (Request[i].BufferOffset < TotalBytes)
                {
                    // if not all read, issue next read
                    Source.Position = Request[i].BufferOffset; // issue read at that offset
                    Request[i].ReadAsyncResult = Source.BeginRead(Request[i].Buffer, 0, BufferSize, null, i);
                    Request[i].ReadLaunched.Set();
                }
            }
        }

        #region Nested type: AsyncRequestState

        public class AsyncRequestState
        {
            // data that tracks each async request
            public byte[] Buffer; // IO buffer to hold read/write data
            public long BufferOffset; // buffer strides thru file BUFFERS*BUFFER_SIZE
            public IAsyncResult ReadAsyncResult; // handle for read requests to EndRead() on.
            public AutoResetEvent ReadLaunched; // Event signals start of read 

            public AsyncRequestState(int i)
            {
                // constructor    
                BufferOffset = i*BufferSize; // offset in file where buffer reads/writes
                ReadLaunched = new AutoResetEvent(false); // semaphore says reading (not writing)
                Buffer = new byte[BufferSize]; // allocates the buffer
            }
        }

        #endregion
    }

The fun bit about this code is that you don't need to spawn your own threads to do the work. Everything happens from a single calling thread, and the async IO completes in the background. I do make sure to grow the file to its full size up front, which prevents dropping back into synchronous mode on file growths.
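Just for illustration, calling it is a single line; the paths here are hypothetical:

// Hypothetical usage; the paths are made up for illustration. One call from
// a single thread keeps all eight read/write requests in flight.
AsyncFileCopy.AsyncBufferedFileCopy(@"D:\backups\mydb.bak",
                                    @"E:\staging\mydb.bak");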

This next bit is the un-buffered stuff.

using System.IO;

internal class UnBufferedFileCopy
{
    public static int CopyBufferSize = 8 * 1024 * 1024;

    public static byte[] Buffer = new byte[CopyBufferSize];

    // FileOptions doesn't expose FILE_FLAG_NO_BUFFERING, but the raw value
    // passes straight through to the underlying CreateFile call.
    const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;

    public static int CopyFileUnbuffered(string inputfile, string outputfile)
    {
        var infile = new FileStream(inputfile, FileMode.Open, FileAccess.Read,
                                    FileShare.None, 8,
                                    FileFlagNoBuffering | FileOptions.SequentialScan);
        var outfile = new FileStream(outputfile, FileMode.Create, FileAccess.Write,
                                     FileShare.None, 8, FileOptions.WriteThrough);

        int bytesRead;
        while ((bytesRead = infile.Read(Buffer, 0, CopyBufferSize)) != 0)
        {
            outfile.Write(Buffer, 0, bytesRead);
        }

        outfile.Close();
        outfile.Dispose();
        infile.Close();
        infile.Dispose();
        return 1;
    }
}

Since this is a synchronous call, I'm not worried about extending the file up front for performance, though there is still the fragmentation issue to think about. Without the async plumbing, the code is a bit cleaner. The secret sauce on this one is creating your own file option and passing it in.

const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;

I hear you asking now: where did this thing come from? Well, that is simple. It is a regular flag you can pass to CreateFile if you are doing things in C or C++. I got curious as to what the CLR was actually doing in the background. It has to make a call to the OS at some point, and that means unmanaged code.

using System;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

internal class UnmanagedFileCopy
{
    public static int CopyBufferSize = 8 * 1024 * 1024;

    public static byte[] Buffer = new byte[CopyBufferSize];

    private const int FILE_FLAG_NO_BUFFERING = unchecked(0x20000000);
    private const int FILE_FLAG_OVERLAPPED = unchecked(0x40000000);
    private const int FILE_FLAG_SEQUENTIAL_SCAN = unchecked(0x08000000);
    private const int FILE_FLAG_WRITE_THROUGH = unchecked((int)0x80000000);
    private const int FILE_FLAG_NONE = unchecked(0x00000000);

    public static FileStream infile;
    public static SafeFileHandle inhandle;
    public static FileStream outfile;
    public static SafeFileHandle outhandle;

    [DllImport("KERNEL32", SetLastError = true, CharSet = CharSet.Auto, BestFitMapping = false)]
    private static extern SafeFileHandle CreateFile(String fileName,
                                                    int desiredAccess,
                                                    FileShare shareMode,
                                                    IntPtr securityAttrs,
                                                    FileMode creationDisposition,
                                                    int flagsAndAttributes,
                                                    IntPtr templateFile);

    public static void CopyUnmanaged(string inputfile, string outputfile)
    {
        outhandle = CreateFile(outputfile,
                   (int)FileAccess.Write,
                   (int)FileShare.None,
                   IntPtr.Zero,
                   FileMode.Create,
                   FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH,
                   IntPtr.Zero);

        inhandle = CreateFile(inputfile,
                                  (int)FileAccess.Read,
                                  (int)FileShare.None,
                                  IntPtr.Zero,
                                  FileMode.Open,
                                  FILE_FLAG_NO_BUFFERING | FILE_FLAG_SEQUENTIAL_SCAN,
                                  IntPtr.Zero);

        outfile = new FileStream(outhandle, FileAccess.Write, 8, false);
        infile = new FileStream(inhandle, FileAccess.Read, 8, false);

        int bytesRead;
        while ((bytesRead = infile.Read(Buffer, 0, CopyBufferSize)) != 0)
        {
            outfile.Write(Buffer, 0, bytesRead);
        }

        outfile.Close();
        outfile.Dispose();
        outhandle.Close();
        outhandle.Dispose();
        infile.Close();
        infile.Dispose();
        inhandle.Close();
        inhandle.Dispose();
    }
}

If I were building my own unmanaged calls, this would be it. When you profile the managed code for object creates/destroys, you see that it is making calls to SafeFileHandle. Being the curious guy I am, I did a little more digging. For those of you who don't know, there is an open source implementation of the Common Language Runtime called Mono, which means you can download the source code and take a look at how things are done. Poking around in FileStream and the associated code, I saw that it had all the file flags present, but un-buffered was commented out… Now I had a mystery on my hands.

I tried to implement asynchronous un-buffered IO using all unmanaged code calls and couldn't do it. There is a fundamental difference between a byte array in the CLR and what I can set up in native C++. One of the things you have to be able to do if you want asynchronous un-buffered IO is sector-align all reads and writes, including the in-memory buffers, and you can't guarantee that in C#. You have to allocate an unmanaged segment of memory and handle the reads and writes through that buffer. At the end of the day, you have written all the C++ you need to do the file copy and wrapped it in a managed code loop.
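To make the alignment requirement concrete, here is a minimal sketch of the kind of unmanaged allocation you would need. The 4,096-byte sector size is an assumption for illustration; a real implementation would query the volume for it.

using System;
using System.Runtime.InteropServices;

// A sketch of sector-aligned buffer allocation, which un-buffered IO demands
// and a managed byte[] cannot guarantee. Over-allocate unmanaged memory and
// round the pointer up to the next sector boundary.
internal static class AlignedBuffer
{
    public const int SectorSize = 4096; // assumed; query the volume in real code

    public static IntPtr Allocate(int size, out IntPtr raw)
    {
        raw = Marshal.AllocHGlobal(size + SectorSize); // keep raw pointer so we can free it later
        long aligned = (raw.ToInt64() + SectorSize - 1) & ~((long)SectorSize - 1);
        return new IntPtr(aligned); // hand this aligned address to ReadFile/WriteFile
    }

    public static void Free(IntPtr raw)
    {
        Marshal.FreeHGlobal(raw); // always free the original, unaligned pointer
    }
}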

So, you can do asynchronous OR un-buffered IO, but not both. From Sequential File Programming Patterns and Performance with .NET:

“the FileStream class does a fine job. Most applications do not need or want un-buffered IO. But, some applications like database systems and file copy utilities want the performance and control un-buffered IO offers.”

And that is a real shame; I'd love to write some high-performance IO stuff in C#. I settled on doing un-buffered IO, since these copies run from a SQL Server, which will always be under some kind of memory pressure, to the file server. If I could do both asynchronous and un-buffered, I could get close to wire speed, around 105 to 115 megabytes a second. Just doing un-buffered gets me around 80 megabytes per second. Not horrible, but not the best.

How do you change something that can’t be changed?

As many of you know, Steve Jones was cut by the Nomination Committee and won’t be on the ballot this year. He made the announcement in a short blog post.

In the comments Andy Warren said “I’m disappointed as well. I think the NomCom tried hard to apply the process we gave them, so the fault – in my personal view – is with the process, just not rich enough. I’ll write more when I’ve had a chance to reflect some.

I would be disappointed – again, speaking just as me – if anyone elects to walk away from PASS because of this. Don’t agree with PASS on this or any other issue? Fight the fight to change it!”

I agree; if you walk away just because of this issue, that would be a mistake. If this is the last in a long list of mistakes, that is a different matter altogether.

From the Bylaws:
”These Bylaws may be altered, amended, or repealed, and new bylaws may be adopted by a two‐thirds vote of the Board of Directors, provided that at least thirty (30) days’ written notice is given of intention to alter, amend, or repeal these Bylaws and to adopt new Bylaws prior to the specified date of the vote.”

Thirty days, that's all it takes to make a fundamental change to how PASS operates. Oh, and a two-thirds vote from the Board of Directors. So, my question is: how do I change something about PASS without running for the BoD? Did you know the qualifications for the BoD haven't changed once since 1999? You have to be more qualified to run for the BoD than you do to run for President of the United States. People govern our society with fewer qualifications and more success.

Not wanting to walk away from an organization I have been involved with for several years, I chose to fight for change. My first little fight was to attend meetings at the Summit, voice my opinion, and push for change. Next, to vote for people like Andy Warren and Tom LaRock; I think both are agents for change. Andy has been very vocal and as transparent as possible during his tenure on the BoD. I can't say that for everyone else. The anemic amount of information coming from PASS and the BoD in general is just embarrassing; you don't see much unless there is a firestorm going on. I also got involved in the program committee this year. Again, I was a little surprised by how little information is fed back to submitters and the general lack of transparency in choosing speakers. I couldn't run for the BoD this year, but I was planning to run next year and was getting my ducks in a row for that. Lastly, I talk a lot. I talk to leaders in the community, people involved in PASS, and folks on the board.

I've also spent a lot of my time at the local level, running the Austin, TX UG for several years and starting the San Antonio, TX UG. I speak regularly at my UG and, more recently, at SQL Saturday events. Local community building is a passion for me. When Andy, Steve, and Brian got SQL Saturday going, I was just in awe of how quickly it took off and how well local chapters handled these events. The thing that blew me away was the low cost to the attendees. The quality of the training is on par with the Summit in most cases. In some cases it can be a little better, as Baton Rouge showed me that you can include the .NET folks as well and still have an awesome SQL Server event. We are also in the planning stages for our own SQL Saturday in Austin. Have I put in as much work as, say, Patrick LeBlanc? Maybe not yet, but I have put in more than your "average" PASS member.

With all of that said, I can tell you honestly, I don't know most of the BoD very well at all. When people run for the BoD, there is a good chance I haven't heard of some of them and have to dig to find out who they are. The BoD is elected by "members in good standing," a group that probably doesn't know them either. Say there are around 200 chapters* with at least one chapter leader per group (some have committees and such, but we are keeping this simple); that gives us roughly 200 "leaders in the community," and I bet most of them don't know the whole BoD. Only 51%* of UG leaders knew who their regional mentor was, you know, the person designated to liaise between chapters and the parent organization. Almost half of the UGs have no real interaction with PASS. So how do these folks get voted in? They may be a known speaker, or they did some campaigning to raise awareness; honestly, I think it comes down to the limited number of "qualified" candidates and the number of slots available. A coin toss in most cases. The few people who may be truly informed, say around 100, may have some influence as well. That is out of the roughly 40,000 possible PASS members (if each UG averages 200 members). Fundamentally, 15 people decide what PASS is, not the 40,000 they represent.

Let's be honest, PASS isn't a community organization, it is the Summit. Everything that PASS does is to get people to go to the Summit. I don't see much in the way of UG support other than the "comp" to the Summit. PASS is a marketing machine, period. They use the UGs to funnel people to the Summit and "give back" by giving the UG leadership one Summit pass, and they are looking to restrict even that*. Douglas McDowell said*, "There is always some misunderstanding that this is not an easy or free benefit for PASS to offer, the actual cost for a Summit comp is high since all the event facilities and food and beverage are all charged per-attendee plus incremental consumption – it adds up quick and requires a lot of allocated budget." It is a marketing expense, and a fair one at that. How many people do the UGs reach out to? I spend quite a bit of time cheerleading, trying to get people to go to the Summit. It takes a lot of money to put on the Summit, and at the end of the day it touches around 3,000 people, less than one tenth of the voting PASS membership. What has PASS done to reach out to the other 37,000? We got SQL Saturday! Oh, wait, that was built by people outside PASS, then handed to PASS. How many people has SQL Saturday helped train? With 40+ events done, if each event had 200 people show up, that is 8,800 people, since 2007 no less. We got the UGs! Again, local people do all the leg work; they find their own funding, speakers, and meeting space. The 24 Hours of PASS is the only thing of true value outside the Summit that has come from inside PASS. PASS, as a parent organization, isn't relevant to 90% of my UG membership; we could be a chapter of the local funeral directors association for all they care. After six years, all I have received are slide decks, comps, and enough money to run the UG for about six months. In return, I spend 11 months out of the year getting as many people as I can to attend the Summit.

Is the BoD all sitting around like Snidely Whiplash, twirling their collective mustaches? I don't think so. Did the NomCom receive secret orders to stand in the way of people like Tim, Steve, and Brent? Nope, I'm sure they didn't. Instead, they have built up a system that feeds itself. The machine is self-sustaining, and there aren't enough people at this point to make the changes at the top that need to happen. The focus is so tight on the Summit that everything else is just sparklers and window dressing. I say kill the Summit and focus on the local and regional events. You won't need a three million dollar budget to reach the vast majority of the membership, and the community will benefit as a whole. Have open elections, not just for the BoD but for other key jobs inside the organization. Open it up, all the way. I have worked with other non-profits before where everything was open to the public; there were no closed BoD sessions and no hiding from the community we supported, and I can tell you that community needed a hell of a lot more than training, we were affecting people's lives.

I'm begging you, the BoD, to fix this. Not just the duly elected members but the members from Microsoft and CA as well. As for my fellow chapter leaders, speakers, and event organizers: SPEAK UP! We are still a small group inside the electorate, but we represent a much wider range of it than the BoD does. Lastly, you, the person who comes to the UGs, attends the Summit, or goes to a SQL Saturday: let as many people know that you would like to see change. You are providing something in exchange for the education, whether you pay the 1,500 bucks for the Summit or show up for free to the other events. You have the ultimate power as a consumer, the almighty dollar. Demand more for your investment.

My issues with PASS as an organization didn't start today, but they have finally come to a boil. I'm not exactly sure what I'm going to do in the future with PASS. I plan on staying involved with SQL Saturday events and building local community as much as possible. As for my ability to fight for change, I think I've reached the end of that road. There were UGs before PASS, and there will be UGs after PASS has faded away. I for one look forward to joining the ASSP, the Association of SQL Server Professionals, if they understand what caused PASS to lose the backing of the SQL Server community in the first place.

*(taken from http://www.sqlpass.org/Community/PASSBlog/entryid/156/Q2-Chapter-Survey-Shows-Strong-Community-Reach.aspx)

The Dangers of Sub-queries!!!!

Ok, now that I have your attention, this really should be titled "The Danger of Not Qualifying All Objects," but that isn't quite sensational enough to make you click through to here 🙂

Imagine, if you will: a developer expects ten records to be inserted into a table, and the query has been running for almost an hour. A quick glance at sp_who2 shows that it is using a lot of IO, I mean a lot. The first thing that happens is a command decision from the boss to kill the query. I took a quick peek and found the plan still in the plan cache while we waited for this thing to roll back. As I looked at the plan, I saw something really wrong. The table receiving the inserts wasn't getting the ten records we thought but 169 million records, every record from the table in the SELECT clause.

I looked at the query, and on the surface everything looked good. When I hit the check mark, it compiled without error. Looking at the plan in detail, I noticed a little icon that tipped me off on where to look next.

[Screenshot: execution plan warning icon]

Oh, that's bad. Warnings: No join predicate. Ouch. But this is a sub-query; how can I have no join predicate? Let's take a look at an example.

IF OBJECT_ID('tempdb..#t1') IS NOT NULL DROP TABLE #t1
IF OBJECT_ID('tempdb..#t2') IS NOT NULL DROP TABLE #t2
GO
CREATE TABLE #t1 (
  t1id    INT   NOT NULL,
  t1name  VARCHAR(50)   NULL,
  t1place VARCHAR(50)   NULL)
  
ALTER TABLE #t1
 ADD   PRIMARY KEY ( t1id )
 
CREATE TABLE #t2 (
  t2id    INT   NOT NULL,
  t1name  VARCHAR(50)   NULL,
  t1place VARCHAR(50)   NULL)
  
ALTER TABLE #t2
 ADD   PRIMARY KEY ( t2id )
 
INSERT INTO #t1
SELECT 
  spid,
  loginame,
  hostname
FROM   
  MASTER.dbo.sysprocesses
WHERE  hostname <> ''

INSERT INTO #t2
SELECT 
  spid,
  loginame,
  hostname
FROM   
  MASTER.dbo.sysprocesses
WHERE  hostname <> ''

This gives us two tables with some data. We want to find all the records in #t1 that also exist in #t2, but instead of a join we will use an IN with a sub-query.

SELECT 
  t1name,
  t1place
FROM   
  #t1
WHERE  t1id IN (SELECT 
                 t1id
                FROM   
                 #t2)
 

Looks simple enough, and it will pass the compile test all day long even though t1id doesn't exist in #t2. Since a sub-query can reference columns from the top-level query, this is a perfectly valid piece of T-SQL: t1id simply resolves to the outer table #t1.
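To see the scoping problem explicitly, here is the same query with the column fully qualified, the way the optimizer actually resolves it:

SELECT 
  t1name,
  t1place
FROM   
  #t1
WHERE  t1id IN (SELECT 
                 #t1.t1id -- resolves to the OUTER table, not #t2
                FROM   
                 #t2)

Every row of #t1 matches itself as long as #t2 has at least one row, which the optimizer implements as a join with no predicate.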

Either way, the original query gives us this plan when we look at it:

[Screenshot: execution plan showing the no join predicate warning]

And there you have it: a join without an ON clause, as far as the optimizer is concerned. By not qualifying all the columns in the sub-query, we opened ourselves up to this error. This isn't the optimizer's fault! If we re-write the query as it was meant to be:

SELECT 
  t1name,
  t1place
FROM   
  #t1
WHERE  t1id IN (SELECT 
                 t2id
                FROM   
                 #t2)
 

We get a plan that is more to our liking.

[Screenshot: corrected execution plan]

You could also re-write this as a join instead of using the IN clause, which would have avoided the problem as well; see the sketch below. Scary way to learn that SQL Server will do exactly what you tell it to, even if you are wrong!
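For completeness, here is a minimal sketch of that join rewrite; the explicit ON clause leaves no room for an accidentally missing predicate:

SELECT 
  t1.t1name,
  t1.t1place
FROM   
  #t1 AS t1
  INNER JOIN #t2 AS t2
    ON t1.t1id = t2.t2id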