Monday, November 23, 2009

Java Collections Default Implementation Choices: Groovy's Opinion

I think many Java developers use the Java Collections Framework like I do: start with the same basic implementation of each of the major collections interfaces for general cases and only use a different implementation when a specific need drives it (such as the need for a NavigableMap). For example, for a long time, I used ArrayList for Lists, HashSet for Sets, and HashMap for Maps. I have started to use EnumSet and EnumMap heavily in the Set and Map situations that suit them, but I still have the same favorite "default" collections implementations.
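To make the "only switch when a need drives it" idea concrete, here is a minimal sketch of the specialized implementations mentioned above. The Day enum and the sample data are hypothetical and exist only for illustration.

```groovy
// Hypothetical enum used only for this illustration.
enum Day { MON, TUE, WED, THU, FRI, SAT, SUN }

// EnumSet: a Set specialized for members of a single enum type.
Set<Day> weekend = EnumSet.of(Day.SAT, Day.SUN)
Set<Day> weekdays = EnumSet.complementOf(weekend)

// EnumMap: a Map keyed by enum constants, iterated in declaration order.
Map<Day, String> plans = new EnumMap<Day, String>(Day)
plans[Day.FRI] = "release"
plans[Day.MON] = "planning"

// TreeMap implements NavigableMap, which adds lookups such as floorKey.
NavigableMap<Integer, String> grades = new TreeMap<Integer, String>()
grades[90] = "A"
grades[80] = "B"
grades[70] = "C"

println "Weekdays: ${weekdays}"
println "Plans in enum order: ${plans}"
println "Grade for a score of 85: ${grades[grades.floorKey(85)]}"
```

None of these would be a sensible general default, which is the point: each earns its place only when the keys are enums or when ordered navigation is actually needed.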

Ruby on Rails in large part made "convention over configuration" popular, and the Java Persistence API (JPA) further popularized this concept in the Java world under the name "configuration by exception." Of course, many other frameworks and libraries have similarly used and popularized the idea of reducing the need for explicit configuration when conventions and defaults are good enough. One of the appealing facets of Groovy is that it supports certain collection types by default, without explicit specification, inferring them from the syntax used. In this post, I look at the default implementations that the Groovy developers selected for the language to use when no specific implementation is specified.

Most introductory Groovy materials demonstrate (and often applaud) the conciseness and intuitiveness of Groovy's collections specification syntax. Much of this is a result of Groovy assuming which implementation of each collection will be used. Groovy does allow this default choice to be overridden by explicit specification of the desired collection implementation. The following code snippet demonstrates Groovy's default implementations for Lists, Sets, and Maps.


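// A bracketed literal with no declared type produces Groovy's default List.
list = ["Denver", "Chicago", "Tallahassee", "Detroit", "Salt Lake City", "Las Vegas", "Atlanta"]
println "\nLIST: Default List implementation is ${list.getClass().name}\n"

// A bracketed literal with key : value pairs produces Groovy's default Map.
map = ["Colorado" : "Denver", "Florida" : "Tallahassee", "Michigan" : "Detroit"]
println "MAP: Default Map implementation is ${map.getClass().name}\n"

// Coercing to the Set interface with 'as' lets Groovy choose the implementation.
set1 = list as Set
println "SET: Default Set implementation is ${set1.getClass().name}\n"

// Declaring the variable as a Set coerces the list literal on assignment.
Set set2 = ["Colorado", "Florida", "Michigan", "Utah", "Nevada", "Georgia"]
println "SET: Default Set implementation is ${set2.getClass().name}\n"

// Coercing to the Queue interface works the same way.
queue = list as Queue
println "QUEUE: Default Queue implementation is ${queue.getClass().name}\n"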


When the above script is run (as demonstrated in the next screen snapshot), we learn that ArrayList is Groovy's default implementation of the List interface, HashSet is its default implementation of the Set interface, LinkedHashMap is its default implementation of the Map interface, and LinkedList is its default implementation of the Queue interface.
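As mentioned earlier, these defaults can be overridden by naming the desired implementation explicitly. The following is a minimal sketch of two ways I would expect to do that: coercing a list to a concrete class with "as" and passing a map literal to the constructor of the desired Map. The variable names and sample data are mine and are only illustrative.

```groovy
def cities = ["Denver", "Chicago", "Tallahassee", "Detroit"]

// Coerce the list to a concrete class rather than to an interface.
def linkedCities = cities as LinkedList
def sortedCities = cities as TreeSet

// Pass a map literal to the constructor of the desired Map implementation.
def capitals = new TreeMap(["Michigan": "Detroit", "Colorado": "Denver", "Florida": "Tallahassee"])

println "Explicit List implementation is ${linkedCities.getClass().name}"
println "Explicit Set implementation is ${sortedCities.getClass().name}"
println "Explicit Map implementation is ${capitals.getClass().name}"
println "Sorted cities: ${sortedCities}"
println "States in key order: ${capitals.keySet()}"
```

The syntax stays nearly as concise as the defaults; the only extra cost is naming the class.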



The authors of the book Java Generics and Collections compare the collections implementations supplied with standard Java. They devote sections of their book (13.3/Sets, 15.3/Lists, 16.6/Maps, 14.5/Queues) to comparing the different implementations of each interface (Set, List, Map, and Queue) in terms of relative performance and the functional requirements they satisfy. There are so many implementations and so many considerations favoring one over another that it is difficult to choose one to "generally use," but the Groovy selections definitely seem reasonable based on the analysis contained in this book. One obvious exception is that the authors are critical (Section 14.4.1) of LinkedList as a Queue (or even Deque) implementation.
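For anyone who shares that reservation about LinkedList, the Queue implementation can be chosen explicitly. The sketch below uses java.util.ArrayDeque (available since Java SE 6) purely as one example of an alternative; this is my assumption about a reasonable substitute, not a recommendation I am attributing to the book.

```groovy
def cities = ["Denver", "Chicago", "Tallahassee"]

// Build the queue from the list explicitly instead of using 'as Queue'.
// Note that ArrayDeque does not permit null elements.
Queue<String> queue = new ArrayDeque<String>(cities)
queue.offer("Detroit")           // add at the tail

def head = queue.peek()          // look at the head without removing it
def removed = queue.poll()       // remove and return the head

println "QUEUE: Explicit Queue implementation is ${queue.getClass().name}"
println "Head was ${head}; after poll the queue holds ${queue.toList()}"
```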

Conclusion

Groovy provides concise and highly intuitive syntax for some of its collections, but to do so it must make certain assumptions about which collection implementation to use. It is interesting to compare Groovy's default implementation choices for the collections interfaces to my own default choices. Of course, in a concurrent environment or under certain other unusual circumstances, the idea of a "general" collection implementation doesn't apply anyway. Certain collections implementations may also be favored for different types of operations on the collection.
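As a small illustration of the concurrency point, the sketch below swaps in two implementations from java.util.concurrent. It runs on a single thread and only shows that the literal syntax still works when the implementation is named explicitly; it is not a demonstration of thread safety, and the sample data is hypothetical.

```groovy
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.CopyOnWriteArrayList

// A map literal copied into a ConcurrentHashMap.
Map<String, String> capitals =
    new ConcurrentHashMap<String, String>(["Colorado": "Denver", "Florida": "Tallahassee"])
capitals.putIfAbsent("Michigan", "Detroit")   // added: key was absent
capitals.putIfAbsent("Colorado", "Boulder")   // ignored: key already present

// A list literal copied into a CopyOnWriteArrayList.
List<String> cities = new CopyOnWriteArrayList<String>(["Denver", "Chicago"])

// The iterator works on a snapshot, so adding during iteration does not
// throw ConcurrentModificationException and the loop sees only two elements.
for (city in cities) {
    cities.add(city.toUpperCase())
}

println "Capitals: ${capitals}"
println "Cities: ${cities}"
```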
