Emperor Ming strikes back

Emperor Ming strikes back

May 5, 2008by Michael Schürigproduct

Joe Celko’s Thinking in Sets

★★★☆☆

I have (read) copies of five earlier of Celko’s books on my shelf, still I am again amazed by the cultural distance. Most of my programming life I have spent with object-oriented programming languages and associated technologies. Thus, when Celko starts the present book with a discussion of the differences between flat files and relational databases, it could hardly be more distant than if he had extolled the virtues of the gasoline engine over its steam predecessor.

Celko likes to refer to his informers as “Mr. So-and-so, working for company X” this again moves the cultural differences to the front, and I can’t avoid a slight chuckle when he reverently cites “Dr. E.F. Codd” for the umpteenth time. It all decidedly feels like a tale from an imaginary 1950s. I certainly envision people in lab coats.

The tone moves from enjoyably quaint to annoying, when Celko (again and again) ridicules the many failings of database novices and sophomores. He might not realize that those who share in the joke have no need to read his book — and that those who bought the book to learn something from it may feel a wee bit offended. After all, we are already aware that there’s something we don’t know yet and want to learn, there’s really no need to rub it in.

So much for the atmospheric stuff. But, of course, I didn’t buy this book to make me feel good, but to learn something, come rain or shine. And, yes, there is a lot useful stuff in this book. More in the bits and pieces than in some generalized approach. And by far more in line with the subtitle, “Auxiliary, Temporal and Virtual Tables in SQL” than with “Thinking in Sets”, the main title. Regarding the latter, I found the most worthwhile part of the book to be the discussion of why boolean flags are bad (ch. 11, Thinking in SQL).

Celko’s effort to distance the relational, set-based approach from earlier practices crops up all over the book. I had expected — and hoped! — that Celko would put considerably energy into comparing, contrasting, and hopefully complementing set-based thinking with current object-oriented approaches. Alas, he’s completely preoccupied with his own tradition and doesn’t wander into OO-land at all.

I would have been very interested in reading a knowledgable discussion of where to draw the line between procedural and set-based approaches. And, as most practical programs will employ both of these approaches, how to interface the respective parts. On the latter issue, there’s not a single word in this book. The treatment of the former issue is interesting, in a twisted sense. Celko demonstrates some string processing in SQL and concedes that this would be much easier in languages such as ICON or SNOBOL, those stalwarts of 1970s era dataprocessing (does he even know Perl?). Well, why then try to abuse SQL to do something for which it is ill-suited and results in bloated code? Why anyone would want to solve Sudoku puzzles in SQL I cannot fathom, either. Celko doesn’t tell, and neither does he present the whole (repetitive) code, nor explain how the set-based approach works in any sufficient detail.

The overarching mindset exemplified in this book is to push as much into the database as possible, even if it hurts at times. I don’t mean to denigrade the intention, namely application-independent, consistent data storage. However, the reality in current software engineering is that a shared database is but one solution among others. For instance, SOA (Service Oriented Architecture) is specifically about connecting applications through services they provide, not by tying them to a shared database.

Celko likes to style himself in the image of Ming the Merciless. The semblance is indeed uncanny and as I hinted already, he tries to live up to the role as his author persona. Unfortunately, he doesn’t seem to realize that there’s one thing that can’t be tolerated in an arch-villain (as well as in his henchmen and henchwomen): sloppiness. The book has more than its fair share of typos and grammatical accidents. A particularly amusing case in point — due to his belligerent character, a deeper insight, or simply search-and-replace gone awry — is an example that consistently refers to “martial status”.