Wednesday, January 23, 2008

SQL Tricks/Patterns 0

It's been a while since my last post... These last couple of months were full of events for me. The biggest one was the birth of my second child. It changed our routines quite a lot... Right now managing to sleep more than a couple of hours with a baby and a small child is our biggest achievement. But it's full of rewards :)

Getting back to technology, and looking at my last post and some functional programming posts around C#, I came to realize the similarities between SQL and FP. They're both strongly mathematically based, declarative... And both require a change of mindset (I'm still on a very rudimentary level on FP), from what we're used to in imperative programming.

So the first and most important pattern in SQL, and that is the base for many other patterns and the correct usage, is that SQL is set based, and all row by row processing should be avoided (by principle). Let's be a little more pragmatic, and if it's a quick and dirty solution, for a small problem set, a one shot solution never to be used again, it's ok. But if it's a migration, an operation script, a script for the client, then be very careful... It will be used on increasing larger sets, multiple times and with a diminishing time window.

Only being through a data migration in which a test takes a couple of days, that must be executed multiple times (because of bad data, or bad mappings, or bugs), or having to put up with a client because some job is increasingly taking longer and risking it's time window, or even witnessing data intensive operations being done (excruciatingly slow) on the application client, one understands the problem.

Ok. But to a freshmen/rookie/intern, and unless told otherwise, the imperative solution would be the only solution. And imperative has conditionals, and many other goodies...

  • In SQL, we can simulate conditionals with CASE, for example, updating the salary by 1.1 or 1.2 based on the category of the employee:

UPDATE Employees
SET salary = salary *
CASE categoryId
WHEN 1 THEN 1.1
WHEN 2 THEN 1.2
END

  • But what if it's impossible to do in one query, or better yet to avoid turning the query overly complex?
    Then we use temporary tables to hold the intermediate results. But each step of the processing is done on the set and not on a row by row basis. If some step is impossible, or too hard to implement using set operations, then only that step is done row by row, but the remaining steps are set based.
    If some task is complex (reading, parsing, validating and loading a file to the DB) or the business allows for it, it can even be done on table using some sort of state identifier (step 1 is reading, or aproving...).

  • But what if I have to insert into other tables? Well, insert is also set based so instead of INSERT (...) VALUES (...), use INSERT (...) SELECT ...
    There's another problem, to which I don't have a satisfactory answer. That's when you have a master/detail relationship and the master uses Identity values. In that case, apart from switching the identity property on and off, and defining the max value, it's difficult to insert a batch of these relations (using set based processing). Other possibilities are using triggers or in SQL 2005, the output clause, but they all seem awkward and fragile.

  • Another problem is generating a number or some kind of order in the set. To that, the numbers table from my last post is the answer. There are some variations and simplifications. If all you need is a sequential number for the data (to order it, or to differentiate it - very good to remove complete duplicates), a temporary table with an identity column might suffice.This is also the principle used in most solutions that paginate result sets (joining the table with itself using some computed number and filtering it).

Well, I must be forgetting a lot more, but as it's already too long, I'll end it here.

Until next time...

3 comments:

Unknown said...

louis vuitton confirms the comeback of the Stephen Sprouse’s Graffiti Collection along with another louis vuitton bags, the Rose as Marc Jacobs and Louis Vuitton pays homage to Sprouse who passed away last 2004.louis vuitton handbags was first introduced in 2001 and was sold like hotcakes, many have missed out on it and now that they’re back, it’s worth rejoicing. In this new collection, lv “Jacobs took two iconic Sprouse motifs — the graffiti and the rose — and interpreted them in Day-Glo shades of pink, green and orange over the Monogram print. The replica louis vuitton handbags are featured on Vuitton’s Keepall, Speedy and Neverfull bag styles, as well as basketball sneaker boots, pumps, sunglasses, headbands and wristbands, and small leather goods like wallets and coin purses. The rtw includes a mackintosh raincoat with a graffiti and monogram lining, graffiti leggings and a long-sleeve neon minidress featuring the rose design.” lv handbag states that the collection will not be called reissues as he had done things that are similar or new which differs the “new collection” from Stephen’s Graffiti from 2001.

Unknown said...

daidaihua
detoxify your physique thereby expelling all the adverse toxins in your body. Among a huge ambit of lida slimming
present in the market. slimming capsule
makes weight accident simple and quick. Though there may be astronomic companies introducing slimming capsules
. This is absolutely what lida
abridged do. Also there is no charge to cut down on aliment while you are demography lida daidaihua
regularly.

Unknown said...

Not alone do Chanel bag
attending abundant. Every anchorman Chanel Wallet
is issued its own hologram approval amount. This is the way to ensure that the Chanel handbags
that you accept purchased.




Always be acquainted of replica Chanel handbags
, and analysis that the Chanel handbag
has a attention card. The identification amount on the agenda is reproduced anxiously central the Chanel
.