Friday, August 01, 2008

SQL set approach

SQL is a powerful tool that can be a great help for solving some problems. You just have to think in a set oriented approach (Thinking in Sets by Joe Celko comes to mind). I just hope thinking this way doesn't transform me into some uptight person that thinks less of everyone else, just because they don't understand it initially. And this approach must not be taken as a dogma, or an end in itself. If a solution doesn't appear in some reasonable time, I resort to iterative approaches. Better not even get into primary keys discussion and business identifiers... (end of rant)

 

Today, during the implementation of an algorithm for shift atribution, the client asked for the implementation of some exceptions when there wasn't anyone available in the specified region. After some discussion, the algorithm should cope with this situation by looking up in the neighbour regions, then in a larger area, and finally the whole country. As the algorithm isn't just a select (it has more rules), I didn't want to replicate the whole algorithm for each universe or apply the algorithm iteratively for each region (that would also distort the ordering and equity of the algorithm).

 

After some brainstorming, the problem was how to change the universe in which to look by joining to the region table. And as I was to give up, the solution became evident: what was needed was filtering the universe. But how if the table doesn't have in itself all the attributes to determine the regions?

One way would be to compute the regions with a function and using a dynamic query with an in filter. Something like:

regionId   IN  
    dbo.fnComputeRegions(regionId, coverage)

But that forced us to use dynamic querys and the in could become quite big (there are a couple hundred regions).

Then, a better solution came to my mind: the problem was in the join, so the query should join with a table valued function that returns a table with the regions to lookup. This solution didn't force the use of dynamic queries and allowed us to reuse the whole algorithm just changing the join, surrounding the algorithm with a while that changes the coverage:

while (coverageId < numberOf Coverages)

     (algorithm)

     select ...

     from

        Region R

        inner join dbo.fnComputeRegions(regionId,
              coverageId) CR on R.regionId = CR.regionId

 

With this approach we gain abstraction and independence of the computation of regions (it can return a table with one record, the neighbour regions, the neighbour neighbour regions, all the regions within an hierarchy, or all the regions), without changing the algorithm. And what's best is that these computation of regions is reusable in other areas and algorithms.

Now the toughest part is convince the client that this is the best solution and that to these rules there must be no exception :)

Marcas Technorati: ,