Monday, November 19, 2007

There Are Few Absolutes in Software Development

Bill Jackson recently posted a blog entry on the “80-20 rule”. As Bill points out in the blog, he is not talking about the Pareto Principle, but is instead talking about the fact that many “rules,” “guidelines,” “best practices,” and other recommended approaches are really meant for general cases and not for all cases. In fact, there seem to be very few situations in software development in which we can safely use words like “never” and “always.”

Like Bill, I always assume the 80-20 rule when recommending a certain practice, unless I am specifically covering some of the more interesting cases that occur (the 20%). Whenever I recommend a practice, technique, or approach, the implicit caveat is that “Generally this approach is best.”

To avoid my blog entry becoming just a “yeah, me too!” entry in relation to Bill’s entry, I thought I’d add some common software development examples of the 80-20 rule.

Use of return, break, and continue

I have heard experienced developers state that a method should only have one return statement and that break and continue statements are always bad form. However, I have seen some awful code that is difficult to read and follow because the developer used many convolutions to adhere to the single return statement or to avoid using break and continue. That being said, gratuitous use of multiple return statements, continue statements, and break statements can create readability issues of its own and sometimes reflects potential problems with the code design. The Scala programming language specifically does not provide continue or break statements.
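As a rough illustration of that trade-off, here is a minimal sketch of the same search written twice: once contorted to keep a single return and avoid break, and once with an early return. The class and method names (ReturnStyleDemo, indexOfSingleExit, indexOfEarlyReturn) are hypothetical and chosen only for this example.

```java
import java.util.Arrays;
import java.util.List;

final class ReturnStyleDemo {

    // Single-exit version: a flag and an extra loop condition stand in for "break".
    static int indexOfSingleExit(List<String> names, String target) {
        int result = -1;
        boolean found = false;
        for (int i = 0; i < names.size() && !found; i++) {
            if (names.get(i).equals(target)) {
                result = i;
                found = true;
            }
        }
        return result;
    }

    // Early-return version: leaves the method as soon as the answer is known.
    static int indexOfEarlyReturn(List<String> names, String target) {
        for (int i = 0; i < names.size(); i++) {
            if (names.get(i).equals(target)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bob", "cid");
        System.out.println(indexOfSingleExit(names, "bob"));
        System.out.println(indexOfEarlyReturn(names, "bob"));
    }
}
```

Both versions return the same result. Which one reads better in a given method is the kind of judgment call the 80-20 rule leaves to the developer.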

Coding Standards Enforcement

More than once, I have seen code flagged for not meeting specified coding standards. I generally like the idea of coding standards and think there are significant benefits to adhering to coding standards, especially in large development teams and/or with large code bases. However, this is also one of the best examples of where engineering judgment needs to be used to know which standards to enforce most rigorously.

One common standard is that a method should not exceed X lines. There are good reasons for this (including maintainability, modularity, etc.), but blind adherence to it can be more trouble than it is worth. Questions like the following should be asked:
* Is it really so bad if this particular method has X+2 lines of code?
* Is there a way to break this method up that makes more sense? This might apply even to a method that already has fewer than X lines.
* Is it worth making an unnatural break in the method or making the code unnecessarily complex just to keep the method line count under X?

Assuming that X is a reasonable number, I would expect a standard limiting methods to X lines to be a good one in most cases. However, there will be cases where a particular method is actually more readable and maintainable if allowed to exceed the limit.

Avoid Magic Numbers

This is generally sound advice, but I have seen even this overdone. It is maddening to see things like the starting 0 in a for loop defined as ZERO. Any time the code uses the number as it is (not as a value representing a concept but as an actual implementation detail of the language), that number generally does not need to be defined as a constant. This applies to using zero in for loops, using 1 in single index increment and decrement statements, etc. Numbers that are not tied to the language, such as the temperature at which water boils at sea level or the radius of the earth, are the types of numbers that should be defined as constants.
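A minimal sketch of that distinction follows. The class, method, and constant names are hypothetical, and 100.0 degrees Celsius is used as the sea-level boiling point of water.

```java
final class MagicNumberDemo {

    // A domain value worth naming: the bare number would not explain itself.
    static final double WATER_BOILING_POINT_CELSIUS_AT_SEA_LEVEL = 100.0;

    // Overdone (shown only as the anti-pattern): a name for a loop-mechanics literal.
    static final int ZERO = 0;

    static int countAtOrAboveBoiling(double[] temperaturesCelsius) {
        int count = 0;
        // The literal 0 and the ++ increment are language mechanics, not domain concepts,
        // so "for (int i = ZERO; ...)" would add noise rather than meaning.
        for (int i = 0; i < temperaturesCelsius.length; i++) {
            if (temperaturesCelsius[i] >= WATER_BOILING_POINT_CELSIUS_AT_SEA_LEVEL) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countAtOrAboveBoiling(new double[] {99.5, 100.0, 120.0}));
    }
}
```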

Related to this is the idea of placing constants in a common file that multiple files can reference. This is a good idea if multiple files actually need to use it, but sometimes it is much easier to have a constant defined in the file that uses it if that is the only file that uses it. The short of all this is that while I generally do prefer to define constants in a common location, there are times when they are best defined in the local class file to which they apply and some numbers should not be made into constants.

Getter and Setter Methods

This controversial topic deserves a blog entry of its own, but I generally see this implemented exactly backwards. Whereas getters and setters should probably be automatically generated (via IDE or manually out of habit) for about 20% of the classes, it seems like they are instead generated for about 80% of newly created classes.

Use of Composition Rather than Inheritance

Many of us learned object-oriented design and development in the 1990s and many of us fell into the trap of using inheritance far more than we should have. It seems that the software development community is reaching consensus on the overuse of inheritance and the preference of composition over inheritance in many situations. Until fairly recently, we seemed to have this one backwards and were using inheritance 80% of the time when we really should have been using inheritance about 20% of the time and composition 80% of the time. I think we’re moving toward that now.
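One common way to picture the difference is a stack built on a list. The sketch below is only an illustration with hypothetical class names (InheritedStack, ComposedStack, StackDemo): the inheritance version exposes every ArrayList method to callers, while the composition version exposes only the stack operations.

```java
import java.util.ArrayList;
import java.util.List;

// Inheritance version: every ArrayList method (add at index, remove, clear, ...)
// becomes part of the stack's API, so callers can bypass last-in-first-out order.
final class InheritedStack<T> extends ArrayList<T> {
    void push(T item) { add(item); }
    T pop() { return remove(size() - 1); }
}

// Composition version: the list is a private detail; only stack operations are visible.
final class ComposedStack<T> {
    private final List<T> items = new ArrayList<>();

    void push(T item) { items.add(item); }
    T pop() { return items.remove(items.size() - 1); }
    boolean isEmpty() { return items.isEmpty(); }
}

final class StackDemo {
    public static void main(String[] args) {
        InheritedStack<String> inherited = new InheritedStack<>();
        inherited.push("a");
        inherited.add(0, "sneaked in at the bottom"); // compiles, but is not a stack operation

        ComposedStack<String> composed = new ComposedStack<>();
        composed.push("a");
        composed.push("b");
        System.out.println(composed.pop());
    }
}
```

Inheritance still fits the cases where the subclass genuinely is a substitutable kind of the parent, which is roughly the 20% side of this example.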

Java Exception Handling

The debate over checked versus unchecked exceptions (Java) has rivaled other techno-debates at its peak, but a majority consensus seems to have largely settled upon the concept of using checked exceptions only when the client code can be expected to do something about the exception and to otherwise use unchecked exceptions. In my experience, that has resulted in using unchecked exceptions far more often than checked exceptions and so this fits the 80-20 rule. It would be wrong, in my and many others’ opinions, to say “always use checked exceptions” or “always use unchecked exceptions.”
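The guideline can be sketched with a small, hypothetical account class (DemoAccount and InsufficientFundsException are names invented for this example). The checked exception covers a condition the caller can plausibly react to, while the unchecked exception covers a programming error the caller cannot sensibly recover from.

```java
// Hypothetical checked exception: the caller can react, for example by asking for less.
final class InsufficientFundsException extends Exception {
    InsufficientFundsException(String message) { super(message); }
}

final class DemoAccount {
    private long balanceCents;

    DemoAccount(long openingBalanceCents) { this.balanceCents = openingBalanceCents; }

    void withdraw(long amountCents) throws InsufficientFundsException {
        if (amountCents <= 0) {
            // Unchecked: a non-positive amount is a bug in the calling code.
            throw new IllegalArgumentException("amount must be positive: " + amountCents);
        }
        if (amountCents > balanceCents) {
            // Checked: the compiler forces the caller to decide what to do about it.
            throw new InsufficientFundsException("balance is only " + balanceCents + " cents");
        }
        balanceCents -= amountCents;
    }

    long balanceCents() { return balanceCents; }

    public static void main(String[] args) {
        DemoAccount account = new DemoAccount(500);
        try {
            account.withdraw(1000);
        } catch (InsufficientFundsException e) {
            System.out.println("Could not withdraw: " + e.getMessage());
        }
    }
}
```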

At this point, my examples shift from rules and guidelines that apply more often than not (but have their exceptions) to successful software development innovations that owe at least some of their success to focusing on the “common 80 percent.” The common theme is that most guidelines, rules, and recommended practices fit the common 80 percent of development situations.

Enhanced Java for-each Loop

The enhanced Java for loop (Java for-each loop) that was introduced with Java 5 provides an example of the 80-20 rule. The generics-enabled enhanced for loop has really grown on me. I enjoy the much tighter code with no need to cast objects in the collections I am iterating over. However, the enhanced for loop does not cover all collection iteration needs and does not cover less standard iterations like reverse iteration. Instead, the enhanced for loop focuses on the simple collection iteration cases. This still works pretty well, though, because the majority of loop iterations do fit the nominal (or simple) case that the enhanced for loop covers. As I think about the loop iterations I have written since the advent of the enhanced for loop, I think I have probably used that syntactic sugar for well over 80% of my collection iterations.
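To make the contrast concrete, here is a small sketch with hypothetical names (ForEachDemo, totalLength, reversed). The first method is the common case the enhanced for loop handles well; the second is a reverse iteration, which falls back to an explicit ListIterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

final class ForEachDemo {

    // Common case: visit every element in order, with no casts and no index bookkeeping.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    // Less common case: reverse iteration needs an explicit ListIterator (or an index).
    static List<String> reversed(List<String> words) {
        List<String> result = new ArrayList<>();
        ListIterator<String> it = words.listIterator(words.size());
        while (it.hasPrevious()) {
            result.add(it.previous());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc");
        System.out.println(totalLength(words));
        System.out.println(reversed(words));
    }
}
```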

For in-depth coverage of the nuances of the Java for-each loop, see the article Nuances of the Java 5.0 for-each loop.

Ruby on Rails

Like Java’s for-each loop, Ruby on Rails was designed to address the most common cases. Ruby on Rails creator David Heinemeier Hansson has been quoted (in Ruby on Rails: Making Programmers Happy) as stating that Rails has always been meant to cover the common cases that make up roughly 80% of web application development and that developers should expect to deal with the last 20% that includes the highly customized, application-specific type problems. The Ruby on Rails concept of “convention over configuration” and the Java Persistence API’s similar “configuration by exception” concept rely heavily on the 80-20 rule.

An interesting corollary to this 80-20 rule is the significance of the occasional concept that truly “always” applies or truly “never” applies. Because it is very difficult to find anything that is “always” or “never,” we may have to settle for “almost always” or “almost never.” The difficulty of coming up with examples for these absolutes or “almost absolutes” is evidence of the importance of adhering to these recommendations. Here are some ideas for concepts that fit this categorization:

Do Not Rely on Java finalizers

Do Not Rely on Programmatic/Explicit Java Garbage Collection

Do Not Programmatically Manage Threads from Enterprise JavaBeans

Overload a C++ operator with Functionality Congruent with the Overloaded Operator

Even these “almost always” or “almost never” recommendations may have their exceptions, but they are closer to following a 95-5 or 99-1 rule than the 80-20 rule.

All of this really boils down to the common-sense admonition: Use guidelines, conventions, standards, and other recommended practices as “rules of thumb” or baselines for making decisions, but be prepared to change an approach based on differences encountered in a specific situation or problem. Just as no single programming language is best for all types of software development, a single guideline can rarely apply to every possible situation. This is a big part of what makes software development interesting -- we must think through the problems and use our experience and skill set to apply the best solution to the problem at hand.
