Saturday, November 8, 2008

Java Generics and Clearer Communication

I generally appreciate the extra type checking provided by Java Generics. There is no doubt they improve my code's readability and maintainability while also improving my code's runtime robustness. However, one thing that I have observed is that there can still be some room for improvement in code that makes use of generics. For example, while Map<String,String> does specify the mapping of one String to another String, it is not clear to another developer using this Map what the content of the key and of the value for the given Map should be.

There are a few ways to get around this issue and to improve the ability of other developers to use the Map appropriately. These include:

1. Descriptive Naming: The name of the Map can provide significant insight into what is actually being mapped. For example, the map might be declared as Map<String,String> stateToItsCapitalCityMap. This communicates the fact that this is a map with a state mapped to its capital city. This descriptive naming is generally advisable for any code situation, but it can sometimes be difficult to capture the complete essence of a variable's meaning without an unwieldy name.

2. Domain Objects Rather than Simple Objects/Primitives: In the blog entry Never, Never, Never Use String (or at least less often), an argument is made for use of domain objects instead of primitives and String and other similarly general objects. For example, the Map cited above could instead be specified as Map<State,CapitalCity>. This can be combined with the first listed approach to be very descriptive: Map<State,CapitalCity> stateToItsCapitalCityMap. Besides readability, other advantages associated with this approach include more specific type checking and better ability to encapsulate business logic. As some of the strongly worded responses to Never, Never, Never Use String ... demonstrate, there are many people who find this advice to be dogmatic or unrealistic. For my part, I agree with some of the comments that it largely depends on a number of factors including time constraints, application size, the nature of the problem itself (difficult or easy to comprehend), and so on. I think where it gets hard to justify making a domain object is when it is a single value being represented (such as representing the name string as a full-fledged Name object).

3. Effective External Documentation: Naming variables appropriately seems to be largely considered a best practice these days with much less controversy than it once enjoyed. However, even with appropriately named variables and even with Domain-relevant classes used instead of Strings, primitives, and other general types, there is significant value to using Javadoc to effectively document our Map in question. We see an example of this in the Javadoc for Map itself. This Javadoc tells us that Interface Map<K,V> has "Type Parameters" defined as K - the type of keys maintained by this map and V - the type of mapped values. As a colleague of mine (Ross Delaplane) has demonstrated several times in his own code, we can follow this example. For example, I could document my Map in Javadoc as shown next:

/**
 * <p>Map of a Person's Name to that Person.</p>
 * <p>This Map contains key/value pairs {@code <K,V>} where
 * {@code K} is the Name of the Person and
 * {@code V} is the Person associated with the key Name.</p>
 */
Map<Name,Person> nameToItsPersonMap;

The {@code } tag notation is useful for rendering the text inside it in <code> font (fixed-width font) and for instructing the parser to ignore characters such as < and > that occur inside it. The {@literal } tag notation is similar to {@code } without the fixed-width, code-oriented font. The ability of both {@code } and {@literal } to allow < and > to occur in the documentation without being parsed is especially useful when documenting generic use.
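As a minimal sketch of how the second and third approaches above might be combined, the following wraps the Strings in small domain objects and documents the Map with {@code } and {@literal }. The State, CapitalCity, and StateCapitals classes and their methods are hypothetical names chosen for illustration; they do not come from any library or from the posts referenced above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical domain object wrapping the name of a state. */
final class State {
    private final String name;

    State(final String name) {
        this.name = Objects.requireNonNull(name, "name");
    }

    String getName() {
        return name;
    }

    // equals/hashCode are needed so that State behaves correctly as a Map key.
    @Override
    public boolean equals(final Object other) {
        return other instanceof State && name.equals(((State) other).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

/** Hypothetical domain object wrapping the name of a capital city. */
final class CapitalCity {
    private final String name;

    CapitalCity(final String name) {
        this.name = Objects.requireNonNull(name, "name");
    }

    String getName() {
        return name;
    }
}

/** Hypothetical holder illustrating a {@literal Map<State,CapitalCity>}. */
final class StateCapitals {
    /**
     * <p>Map of a State to its CapitalCity.</p>
     * <p>This Map contains key/value pairs {@code <K,V>} where
     * {@code K} is the State and
     * {@code V} is the CapitalCity of the key State.</p>
     */
    private final Map<State, CapitalCity> stateToItsCapitalCityMap = new HashMap<>();

    void add(final State state, final CapitalCity capitalCity) {
        stateToItsCapitalCityMap.put(state, capitalCity);
    }

    /** Returns the capital of the given state, or null if none was added. */
    CapitalCity capitalOf(final State state) {
        return stateToItsCapitalCityMap.get(state);
    }
}
```

With these types, a call that swaps the arguments, such as add(new CapitalCity("Denver"), new State("Colorado")), is rejected by the compiler, which is something a Map<String,String> cannot offer.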

More General than Generics

While this blog entry has focused on making collections and other data structures using generics even more descriptive to potential users, these considerations are not limited to generics only. In fact, most of these considerations have been previously applied to writing cleaner and more maintainable Java code in general.

As discussed above, the idea of using descriptive variable names and descriptive parameter names has become one of those generally accepted principles of software development.

In the item in Effective Java on designing method signatures carefully, Josh Bloch points out some of the advantages of combining primitive types and Strings into domain objects that encapsulate these values and make the API easier to use. Bloch especially warns against using long lists of the same type in a method signature because that is very likely to have the arguments passed into the method in the wrong order and thus be used incorrectly.
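To illustrate the kind of mistake described above, here is a small sketch of my own (it is not taken from Effective Java). The GivenName, FamilyName, and Greetings names are hypothetical and exist only for this example. The String-based method silently accepts arguments in the wrong order, while the version with distinct parameter types turns the same mistake into a compile-time error.

```java
/** Hypothetical wrapper for a person's given name. */
final class GivenName {
    private final String value;

    GivenName(final String value) {
        this.value = value;
    }

    String getValue() {
        return value;
    }
}

/** Hypothetical wrapper for a person's family name. */
final class FamilyName {
    private final String value;

    FamilyName(final String value) {
        this.value = value;
    }

    String getValue() {
        return value;
    }
}

final class Greetings {
    /** Error-prone: nothing stops a caller from swapping the two Strings. */
    static String greetWithStrings(final String givenName, final String familyName) {
        return "Hello, " + givenName + " " + familyName + "!";
    }

    /** Swapping the arguments of this method does not compile. */
    static String greet(final GivenName givenName, final FamilyName familyName) {
        return greetWithStrings(givenName.getValue(), familyName.getValue());
    }
}
```

A caller who writes Greetings.greetWithStrings("Smith", "Jane") gets a greeting with the names reversed and no warning, whereas Greetings.greet(new FamilyName("Smith"), new GivenName("Jane")) is rejected by the compiler.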

Even though I agree that there is such a thing as too much commenting, I have found that some commenting is essential. This is especially true of Javadoc comments. I have heard some developers state that there is no need for Javadoc because other developers can simply look at the code for the authoritative answers. This argument hinges on the assumption that the code uses descriptive names and is self-documenting. However, in my experience on very large programs, I find that many of the people who perform requirements collection, testing, architecting, and even managing the project prefer Javadoc-generated HTML web pages over looking at the code directly. For these people (and even for me sometimes), the Javadoc comments are important. For example, I relatively rarely look at JDK source code as compared to looking at JDK API documentation. It is generally of very little use to document simple get and set methods, but it is often of great use to document more significant methods for these users. In Effective Java, Joshua Bloch points out the importance of documenting all publicly exposed elements in an API.

Nothing in this blog entry is particularly new, but I have seen that much of it is not particularly well followed. The best developers are those who know what works best in a given situation and use the appropriate approach for that situation rather than trying to stick to rules of thumb that start with "always" or "never." However, in many cases, I think we can do better in communicating the intention of our variables and method parameters, especially when generics are involved. Depending on various factors, we can do this with more descriptive names, with domain objects rather than general objects and primitives, and/or with more descriptive Javadoc comments. In my opinion, descriptive names and descriptive Javadoc comments are almost always (I won't say "always" because I just opined about the dangers of such absolutes) appropriate and using domain objects can often be appropriate.

I began this entry by discussing a few simple ways to better describe the use of collections and data structures that take advantage of generics. I then moved into a more general discussion based on these specific ideas. So, with all the complaints about Java generics, are they even worth using? I personally have found Java generics, despite their sometimes strange wrinkles and warts, to be generally useful and to improve my code. One more reference to Effective Java is appropriate here. The Second Edition of that book states very clearly and plainly: Don't Use Raw Types in New Code.
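As a rough sketch of why that advice matters, the following contrasts a raw List with a List<String>. The RawTypeDemo class and its method names are hypothetical and are used only for this illustration. The raw list accepts an Integer without complaint and fails only when the element is cast at runtime, while the parameterized version would reject the same add call at compile time.

```java
import java.util.ArrayList;
import java.util.List;

final class RawTypeDemo {
    /**
     * Uses a raw List on purpose to show the problem.
     * Returns true if reading the elements back as Strings throws
     * a ClassCastException.
     */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        final List names = new ArrayList();
        names.add("Dustin");
        names.add(Integer.valueOf(42)); // compiles: the raw type does no checking
        try {
            for (final Object element : names) {
                final String name = (String) element; // fails on the Integer
                name.length();
            }
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    /** With List<String>, adding an Integer would not compile, so no cast is needed. */
    static int totalLength(final List<String> names) {
        int total = 0;
        for (final String name : names) {
            total += name.length();
        }
        return total;
    }
}
```

The point of the sketch is only that the parameterized type moves the error from runtime to compile time; it says nothing about the wrinkles of generics mentioned above, such as wildcards.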

Generics, like other Java constructs such as methods, parameters, and variables, can be even more powerful when they are named appropriately and descriptively and when their use is appropriately documented. Use of domain-oriented objects rather than general types and primitives can also increase the readability and runtime robustness of these Java constructs.


Stephan.Schmidt said...

Thanks for linking, I'm not that dogmatic usually :-)


Dustin said...


Thanks for the feedback. I appreciated your post back when it came out and agreed with many of your general observations. When I started thinking about ways to better communicate what Maps and other collections of generic types really represent, your post came to mind as one of the most concise and lucid descriptions of the advantages of domain objects.

I also appreciated the feedback comments because I think they helped to flesh out some of the differences of opinion and the advantages and disadvantages of using domain objects. In the end, I lean toward domain objects, but admit that my practical side often leads me to use a single String rather than a domain object consisting of only a single String.

Thanks again for your feedback and especially for your original post that saved me a lot of effort for describing the advantages of using domain objects.