The philosophical, social and financial constraints on a project will often guide the direction of the system far more than actual technical constraints do. Decision makers will choose to defer perceived complexity and cost until they see a need for it. As technologists our goal is to set ourselves up so that we have the opportunity to implement successive generations of systems when they are needed. This is better done by implementing simple incremental changes than by paying a high cost up front for features and dimensions that are not yet understood.
Once upon a time I worked with a group that was writing an early web system that was to provide a kind of online version of newspaper want ads. This was being done in partnership with one of the local newspapers, which provided the content. Our group of highly trained DBAs and developers was so distracted by the thought that our first offering was to be automobile ads that it spent months modeling things like manufacturers, models, trim levels, features, specs, colors and other such details, and completely lost sight of the fact that the source data was nowhere near clean enough to ever generate an ad without significant attended loading. That completely blew the budget. What was worse, the customer did not understand why our ads did not look at all like the text that was submitted. Neither did they think that our ads looked better than the ones in the paper. People buying want ads in the newspaper want their stars and exclamation points. They even want their misspellings and abbreviations left intact. We fell into the trap of trying to apply good design at the wrong time. We eventually provided an outstanding solution to the wrong problem, and the system never saw anything like production.
Ovid's recent blog post is another interesting example in this universe of design choices. He presents an interesting story about an existing "version one" system involving pear recipes and uses that to motivate a "version two" system that involves normalization of one factoring of the recipe text. As with all good storytelling, it leaves out huge swaths of detail in order to focus on the point of the tutorial, which is of course thinking about normalization in relational databases for an operational support system.
I want to defend the customer's choice to reject the proposal. I don't want to discount the excellent tutorial content of the blog. I just think that there is another side to the story about the silly customer who rejected the good design. The customer had good reason to reject that design at this phase of the project. Of course the implication seems to be that the only choices were to take the new shiny "version two" proposal or stick with the old broken "version one" system.
In the real proposal I'm sure that there were quite a few more nouns involved in the analysis than the few Ovid lists. Still, a key feature of all such normalized schemas is that they encompass a way of thinking about the data and the kind of questions that we expect to be asked about it. They constrain the kind of content that can be contained. There are also other interface costs: forms and other interfaces to enter, audit and edit the data. Then there is the brittleness many of these systems have in the face of change. Not to mention that a recipe is a cultural artifact that includes lots of quaint social conventions. For example, if you have normalized ingredients into quantities and units, do you permit things like dashes, pinches, dollops, etc.? Do we present some canonical form of the recipe or attempt to preserve the character of the original source? What is the original source? Are we going to present the customer with a detailed multi-field data entry form? What did your CHI consultant say about that? The list of such questions goes on and on. Sure, there are good strategies for dealing with most of these issues, but are they needed in the "version two" system?
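To make the pinches-and-dollops question concrete, here is a minimal sketch of what a strict quantity/unit model does to ordinary recipe text. Everything in it is an assumption made for illustration: the `Ingredient` shape, the `KNOWN_UNITS` list and the `parse_ingredient` helper are hypothetical and are not taken from Ovid's schema.

```python
import re
from typing import NamedTuple, Optional


class Ingredient(NamedTuple):
    """A normalized ingredient row: quantity, unit, name (hypothetical shape)."""
    quantity: float
    unit: str
    name: str


# Hypothetical unit list; a real system would have to decide what belongs here.
KNOWN_UNITS = {"cup", "cups", "tsp", "tbsp", "g", "ml"}

# Expects "<number> <unit> <name>", e.g. "2 cups sugar".
LINE = re.compile(r"^(\d+(?:\.\d+)?)\s+(\w+)\s+(.+)$")


def parse_ingredient(line: str) -> Optional[Ingredient]:
    """Return a normalized row, or None when the text does not fit the model."""
    match = LINE.match(line.strip())
    if match is None or match.group(2).lower() not in KNOWN_UNITS:
        return None
    return Ingredient(float(match.group(1)), match.group(2).lower(), match.group(3))
```

Under these assumptions "2 cups sugar" parses cleanly, while "a pinch of salt", "1 dollop cream" and even "4 ripe pears" all come back as `None`. Each of those rejections is a design decision, a data entry form and a conversation with the customer that somebody has to pay for.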
An alternative is for the "version two" design to go in the semi-structured data direction rather than normalizing the recipe data. It would go in the CMS direction. This would eliminate all the numbered fields from the schema. It would require a bunch of new metadata fields such as modify dates, statistics counters, maybe revision control fields and others. The major role of the back end system would be to manage and maintain the life cycle of the text fields that happen to contain recipes. Most of the desired UI features could then be implemented client side, with the help of the full text search features in whatever server side content store is chosen.
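A rough sketch of that document-management shape might look like the following. It is only an illustration under stated assumptions: the class names, the field names (`modified_at`, `view_count`, `revisions`) and the in-memory dictionary are all hypothetical, and the substring match merely stands in for the full text search a real content store would provide.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class RecipeDocument:
    """A recipe kept as free text plus life-cycle metadata (hypothetical fields)."""
    doc_id: int
    body: str                     # the recipe text exactly as submitted
    created_at: datetime
    modified_at: datetime         # modify date
    view_count: int = 0           # statistics counter
    revisions: list[str] = field(default_factory=list)  # earlier bodies, oldest first


class RecipeStore:
    """In-memory stand-in for whatever server side content store is chosen."""

    def __init__(self) -> None:
        self._docs: dict[int, RecipeDocument] = {}
        self._next_id = 1

    def add(self, body: str) -> RecipeDocument:
        now = datetime.now(timezone.utc)
        doc = RecipeDocument(self._next_id, body, now, now)
        self._docs[doc.doc_id] = doc
        self._next_id += 1
        return doc

    def edit(self, doc_id: int, new_body: str) -> RecipeDocument:
        """Replace the text, keeping the previous body as a revision."""
        doc = self._docs[doc_id]
        doc.revisions.append(doc.body)
        doc.body = new_body
        doc.modified_at = datetime.now(timezone.utc)
        return doc

    def view(self, doc_id: int) -> str:
        """Return the text and bump the statistics counter."""
        doc = self._docs[doc_id]
        doc.view_count += 1
        return doc.body

    def search(self, phrase: str) -> list[RecipeDocument]:
        """Naive case-insensitive substring match, not a real full text index."""
        needle = phrase.lower()
        return [d for d in self._docs.values() if needle in d.body.lower()]
```

Note that nothing here knows what a pear, a cup or an ingredient is. The back end only tracks the life cycle of a piece of text, which leaves the stars, exclamation points and pinches of cinnamon exactly as the author wrote them.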
Ovid's database design mini-tutorial is a valuable introduction to relational modeling concepts. Anyone who has their interest piqued by that should consider reading Joe Celko's SQL for Smarties. Relational modeling is an important tool in any competent programmer's toolbox. Still, I think that for this actual illustration problem we are too easily distracted by the detail of what the text contains to see that what we are being asked for is document management and not modeling of the structure of recipes. Again, good design is being introduced at the wrong level.
We need to keep in mind what we are implementing at each stage of the process. We frequently get distracted by shiny details that lead us to spend lots of time working on interesting but unimportant parts of the problem. Sometimes we even find ourselves using our favorite tool when it is not the best one for the job at hand. The philosophical, social and financial constraints on a project will often guide the direction of the system far more than actual technical constraints dictate.