Category Archives: T-SQL

Changing Directions

I See Dead Tech….

Knowing when a technology is dying is always a good skill to have. Like most of my generation, I wasn't the first on the computer scene, but I lived through several of its more painful transitions. As a college student I was forced to learn antiquated technologies and languages. I had to take a semester of COBOL. I also had to take two years of assembler for the IBM 390 mainframe and another year of assembler for the x86, focused on the i386 when the Pentium was already on the market. Again and again I've been forced to invest time in dying technologies. Well, not anymore!

Hard drives are dead LONG LIVE SOLID STATE!

I set the data on a delicate rinse cycle

I'm done with spinning disks. Since IBM invented them in nineteen and fifty seven they haven't improved much over the years. They got smaller and faster, yes, but they never got sexier than the original. I mean, my mom was born in the fifties; I don't want to be associated with something that old and way uncool. Wouldn't you much rather have something at least invented in the modern age in your state of the art server?

Don’t you want the new hotness?

I mean seriously, isn't this much cooler? I'm not building any new servers or desktop systems unless they are sporting flash drives. But don't think this will last. You must stay vigilant; NAND flash won't age like a fine wine either. There will be something new in a few years and you must be willing to spend whatever it takes to deploy the "solid state killer" when it comes out.

Tell Grandpa Relational is Soooo last century

The relational model was developed by Dr. E. F. Codd while at IBM in 1970, two years before I was born. Using some fancy math called tuple calculus he proved that the relational model was better at seeking data on these new "hard drives" that IBM had lying around. That later turned into the relational algebra that is used today. Holy cow! I hated algebra AND calculus in high school, why would I want to work with that crap now?

NoSQL Is The Future!

PhD’s, all neck ties and crazy gray hair.

Internet Scale, web 2.0 has a much better haircut.

In this new fast paced world of web 2.0 and databases that have to go all the way to Internet scale, the old crusty relational databases just can't hang. Enter, NoSQL! I know that NoSQL covers a lot of different technologies, but one of the core things they do very well is scale up to millions of users, and I need to scale that high. They do this by side stepping things like relationships, transactions and verified writes to disk. This makes them blazingly fast! Plus, I don't have to learn any SQL languages; I can stay with what I love best, JavaScript and JSON. Personally, I think MongoDB is the best of the bunch; they don't have a ton of fancy PhDs, they are getting it done in the real world! Hey, they have a Success Engineer for crying out loud!!! Plus if you are using Ruby, Python, Erlang or any other real Web 2.0 language it just works out of the box. Don't flame me about your NoSQL solution and why it is better, I just don't care. I'm gearing up to hit all the major NoSQL conferences this year and canceling all my SQL Server related stuff. So long PASS Summit, no more hanging out with people obsessed with outdated skills.

Head in the CLOUD

Racks and Racks of Spaghetti photo by: Andrew McKaskill

Do you want to manage this?

Or this?

With all that said, I probably won't be building too many more servers anyway. There is a new way of getting your data and servers without the hassle of buying hardware and securing it, THE CLOUD!

“Cloud computing is computation, software, data access, and storage services that do not require end-user knowledge of the physical location and configuration of the system that delivers the services. Parallels to this concept can be drawn with the electricity grid where end-users consume power resources without any necessary understanding of the component devices in the grid required to provide the service.” http://en.wikipedia.org/wiki/Cloud_computing

Now that’s what I’m talking about! I just plug in my code and out comes money. I don’t need to know how it all works on the back end. I’m all about convenient, on-demand network access to a shared pool of configurable computing resources. You know, kind of like when I was at college and sent my program to a sysadmin to get a time slice on the mainframe. I don’t need to know the details just run my program. Heck, I can even have a private cloud connected to other public and private clouds to make up The Intercloud(tm). Now that is sexy!

To these new ends I will be closing this blog and starting up NoSQLServerNoIOTheCloud.com to document my new journey. I'll only be posting once a year though, on April 1st.

See you next year!

The Mystery Of The Slow Last Row

Things Go South

Recently I was troubleshooting a piece of software that archives data out of a very active import table. It is a pretty simple app. It queries the table for any id's that have data older than X days. It then queries the table again to pull the detail data, generating a csv file and a table schema script and compressing them for storage. Well, it was running slow. Much slower than expected, in fact. It has to process around 5000 id's and it was taking 50 to 240 seconds per id. That doesn't sound slow but it adds up in a hurry. I started digging into the code looking at the normal culprits. Network slow? No. Disk IO a problem? No. It must be the compression engine! It takes the CPU to 100% and leaves it there. That metric by itself isn't a bad thing but I just wasn't seeing what the problem was. I plugged in other compression engines and tweaked the settings. Oddly enough, every engine performed about the same! I knew there was something I was missing. I stripped things away until I was left with one thing: the query that pulls the detail data from SQL Server.
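In rough strokes, the archive tool runs two queries that look something like the sketch below. The table name, column names and retention window here are hypothetical stand-ins, not the real schema; the IDFK = 69468 value is the one that shows up again later in the story.

-- Hypothetical sketch of the archive tool's two queries
DECLARE @DaysToKeep INT = 30;

-- Step 1: find the id's that have data older than X days
SELECT DISTINCT IDFK
FROM dbo.ImportDetail
WHERE ImportDate < DATEADD(DAY, -@DaysToKeep, GETDATE());

-- Step 2: pull the detail rows for one id to feed the csv and compression step
SELECT *
FROM dbo.ImportDetail
WHERE IDFK = 69468;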

The Big Boy(TM)

The table in question is big. Thanks to the archive tool slowing down, it has ballooned up to 240GB in a single table that shouldn't be more than 10GB or 20GB at any one time. It gets most of its heft not just from the number of rows but from the width of the table. I took my sample slow query and started running permutations on it. In the grand scheme of things it wasn't returning a large row count, around 183,000 or so. The data size was pretty large at 650MB. Still, not anything that would choke the network or our system by any means.

Turn The Knobs

I thought it may be that the application was choking on the data size. So, I stepped out of the app and ran the query in SSMS. It ran in the same time that the program did as far as query execution. I put a TOP clause on it, cutting the number of rows in half, thinking I would get the data back in half the time. I was way off! Instead of the 40 seconds it was now taking around 5 seconds. I pulled the TOP clause off again and watched the data come back. I noticed it returned data pretty quickly. As a matter of fact, it returned almost all the rows in just a few seconds. It would then spend the next 30 seconds or so waiting for the last few rows. At this point I thought something was wrong with my workstation. I asked a co-worker to run the same two queries, one with the TOP and one without. He got the exact same results I did! I started to panic a little bit. What if something was wrong with the server, or there was corruption in the database? I decided to play with the TOP clause, getting the count to within a couple of records before things would slow down dramatically. I looked at the crazy simple plan: nothing to see here but a scan and return.
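The experiment boiled down to something like this pair of queries. Again, a sketch with hypothetical object names; the real query pulls a much wider row, and the 182,852 boundary comes from the explanation further down.

-- Full result set: ~183,000 rows, almost all back in a few seconds,
-- then a long wait for the last handful of rows
SELECT *
FROM dbo.ImportDetail
WHERE IDFK = 69468;

-- Stop just short of the expensive record and the query finishes in ~5 seconds
SELECT TOP (182852) *
FROM dbo.ImportDetail
WHERE IDFK = 69468;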

 Original Plan

Plan with TOP operator

Digging Deep, Pinging my Tweeps

If you aren't on twitter you are missing out on a valuable resource. There are members of the SQL Server community that hang out watching #sqlhelp who might just be qualified to offer you an opinion on your problem. MVPs, MCMs and beyond! Sometimes they are just really smart people like Paul White (blog | twitter). He may not have any of the magic three letters in his title, but he is still extremely capable. If you aren't reading his blog you need to start now. Anyway, I tweeted my dilemma and several people responded. I had already tried everything that people were suggesting. Paul offered to take a look at the execution plans and see if he could figure it out. I was pretty sure it had something to do with the data layout on disk and the fact that I only had a single index, the clustered key, on the table, which I wasn't searching on. What I expected to get was a full scan every time on the table. When I added the TOP clause SQL Server was smart enough to say "Stop scanning when X records are returned." So, when I returned all 182,853 records it had no choice but to scan the entire table.

I think Paul explains it pretty well:

It seems that the first 182,852 matches on (IDFK = 69468) appear relatively early in the clustered index (scanning in clustered key order).

The 182,853rd record appears to be much later in the clustered index – perhaps quite close to the end of the table, so SQL Server scans a large number of pages after finding the 182,852nd match before it encounters the next one.  If you select the clustered index key plus IDFK from the table, and number the rows using ROW_NUMBER() OVER (ORDER BY (SELECT 0)), you should find a big gap in clustered key values between records 182,852 and 182,853.

Note – Which I did find – Wes

When you request the TOP (182852) records, the scan stops as soon as that many records are found – so the scan does not continue across the big gap in clustered index key values to find the 182,853rd value.

Note – I also played with SET ROWCOUNT and had the exact same results. – Wes

Side note.  The query does not request the results in clustered key order, but the storage engine has no other choice as the query is written.  If you add WITH (TABLOCK), WITH (NOLOCK), or otherwise modify the table so that no data changes are possible while the scan is running (perhaps by making the database read-only, or placing the table on a read-only file group) – the storage engine will be able to scan from the IAM pages, which will result in rows coming back in the order pages were allocated to the table.  Depending on the distribution of records with (IDFK = 69468), you might get better, or worse, performance – but the big gap between record #182,852 and #182,853 will almost certainly disappear.  Of course, the storage pattern might result in other such gaps!

Thanks Paul!
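For reference, the gap-finding query Paul describes looks roughly like this. It's only a sketch; ClusteredKeyId and dbo.ImportDetail are assumed names standing in for the real clustered key and table.

-- Number the matching rows in scan order, then look at the clustered key
-- values around rows 182,852 and 182,853 to see the gap
SELECT
  ClusteredKeyId,
  IDFK,
  ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS rn
FROM dbo.ImportDetail
WHERE IDFK = 69468;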

I did test the NOLOCK and TABLOCK hints; both had similar results and sped up the query. Alas, I couldn't use either since data changes are taking place during the archive process. I opted for an index on the search key, which automatically includes the clustered primary key, paired with a bookmark lookup for the remaining columns. Now it is faster and consistent in speed.
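The fix was along these lines (a sketch; the index and table names are assumptions):

-- Nonclustered index on the search key; the clustered key rides along in every
-- nonclustered index, so the plan becomes a seek plus a key lookup instead of
-- a full clustered index scan
CREATE NONCLUSTERED INDEX IX_ImportDetail_IDFK
  ON dbo.ImportDetail (IDFK);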

And that is how the mystery of the slow last row was solved!

The Dangers of Sub-queries!!!!

Ok, now that I have your attention, this really should be titled the danger of not qualifying all objects, but that isn't quite sensational enough to make you click through to here 🙂

Imagine if you will, a developer expects ten records to be inserted into a table and the query has been running for almost an hour. A quick glance at sp_who2 shows that it is using a lot of IO, and I mean a lot. The first thing that happens is a command decision from the boss to kill the query. I took a quick peek and found the plan still in the plan cache while we waited for this thing to roll back. As I look at the plan I see something really wrong. It looks like the table receiving the inserts isn't getting the ten records we thought but 169 million records, every record from the table in the SELECT clause.
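Digging a plan out of the cache can be done with the execution-related DMVs, roughly like the sketch below. The LIKE filter is an assumption; adjust it to find the statement you care about.

-- Find cached statements by text and pull their plans out of the plan cache
SELECT
  st.text,
  qp.query_plan,
  cp.usecounts
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE st.text LIKE '%INSERT%';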

I look at the query and on the surface everything looks good. When I hit the check mark it compiles without error. Looking at the plan in detail I notice a little icon that tipped me off on where to look next.

Execution plan with the warning icon

Oh, that's bad. Warnings: No Join Predicate. Ouch. But this is a sub-query, how can I have no join predicate? Let's take a look at an example.

IF OBJECT_ID('tempdb..#t1') IS NOT NULL DROP TABLE #t1
IF OBJECT_ID('tempdb..#t2') IS NOT NULL DROP TABLE #t2
GO
CREATE TABLE #t1 (
  t1id    INT   NOT NULL,
  t1name  VARCHAR(50)   NULL,
  t1place VARCHAR(50)   NULL)
  
ALTER TABLE #t1
 ADD   PRIMARY KEY ( t1id )
 
CREATE TABLE #t2 (
  t2id    INT   NOT NULL,
  t1name  VARCHAR(50)   NULL,
  t1place VARCHAR(50)   NULL)
  
ALTER TABLE #t2
 ADD   PRIMARY KEY ( t2id )
 
INSERT INTO #t1
SELECT 
  spid,
  loginame,
  hostname
FROM   
  MASTER.dbo.sysprocesses
WHERE  hostname <> ''

INSERT INTO #t2
SELECT 
  spid,
  loginame,
  hostname
FROM   
  MASTER.dbo.sysprocesses
WHERE  hostname <> ''

This gives us two tables with some data. We want to find all the records in #t1 that also exist in #t2, but instead of a join we will use an IN and a sub-query.

SELECT 
  t1name,
  t1place
FROM   
  #t1
WHERE  t1id IN (SELECT 
                 t1id
                FROM   
                 #t2)
 

Looks simple enough, and it will pass the compile test all day long even though t1id doesn't exist in #t2. Since you can use columns from the top-level query in the sub-query, this is a perfectly valid piece of T-SQL: the unqualified t1id resolves to #t1.t1id from the outer query.

It gives us this plan when we look at it though.

Execution plan for the sub-query with no join predicate

And there you have it, a join without an ON clause as far as the optimizer is concerned. By not qualifying all the columns in the sub-query we opened ourselves up to this error. Because t1id resolves to the outer table, the predicate is true for every outer row as long as #t2 has any rows at all, so nothing gets filtered. This isn't the optimizer's fault! If we re-write this as it was meant to be:

SELECT 
  t1name,
  t1place
FROM   
  #t1
WHERE  t1id IN (SELECT 
                 t2id
                FROM   
                 #t2)
 

We get a plan that is more to our liking.

Execution plan for the corrected query

You could also re-write this as a join instead of using the IN clause, as shown below, and you would have avoided this problem as well. Scary way to learn that SQL Server will do exactly what you tell it to, even if you are wrong!
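Here is a sketch of the join form against the same temp tables. Qualifying every column with a table alias means a typo like this becomes a compile-time binding error instead of a silently wrong result:

SELECT 
  t1.t1name,
  t1.t1place
FROM   
  #t1 AS t1
  INNER JOIN #t2 AS t2
    ON t2.t2id = t1.t1id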