Friday, April 12, 2013

Book Review: Effective Unit Testing: A Guide for Java Developers

In the Preface for Effective Unit Testing: A Guide for Java Developers, author Lasse Koskela states that although the impetus for Effective Unit Testing was to "write a Java edition of Roy Osherove's book, The Art of Unit Testing with Examples in .NET," Effective Unit Testing ended up "having very little in common with Roy's book." Koskela further explains in the Preface that "this book is for the Java programmer," but adds that "writing good tests is a language-agnostic problem" and he recommends his book even for developers using languages other than Java.

Koskela does a nice job in the "Preface" of succinctly summarizing what the book is and isn't when he says, "I didn't want to give you a tutorial on JUnit or my favorite mock object library" and "I've tried to minimize the amount of technology-specific advice." This is important to note because Effective Unit Testing is not the book you'll want if you're looking for a book covering intricate details of JUnit, TestNG, Mockito, EasyMock, Hamcrest or other commonly used Java unit testing frameworks and tools. Instead, the focus of Effective Unit Testing: A Guide for Java Developers is on general concepts of unit testing (and of unit testing in Java in particular) that might be implemented with any of a variety of tools. That being stated, Koskela doesn't ignore these tools completely and does sprinkle JUnit code throughout the book along with code samples based on other unit testing tools such as JMock, Mockito, and Hamcrest.

Chapter 1: The Promise of Good Tests

In Chapter 1 of Effective Unit Testing, Koskela mixes brief historical testing anecdotes with basic introductory material and often-cited reasons why unit tests and automated tests are important (identifying bugs, improving design, avoiding scope creep, and learning from the test-writing experience). Koskela talks about why unit tests are more effective when used for design in addition to being used for quality assurance.

The "Factors of Productivity" section of Chapter 1 is useful for understanding certain measures by which one might determine whether unit tests are "effective." These include execution speed (performance), readability, reliability, and trustworthiness. Achieving these characteristics of effective unit tests is the focus of the book. A couple of concluding sections of the first chapter focus on using tests for design and employing behavior-driven development (BDD).

Among other introductory details, he articulates why "100% code coverage isn't the goal." Several well-known testing-related terms ["test-infected", "test-driven development" (TDD), and "accidental complexity"] are also introduced in this chapter along with references for additional details. In this first chapter, the author also introduces his "The Law of Two Plateaus" to differentiate between using unit tests solely for quality assurance versus using them for design in addition to quality assurance.

Chapter 1 is mostly introductory and probably doesn't hold a lot of new insights for someone who has worked with Java extensively and/or has written unit tests extensively. However, it does manage in 12 pages to meet the author's goal for it and the other two chapters (Chapter 2 and Chapter 3) of Part 1 of providing a "shared context" for the remainder of the book.

Chapter 2: In Search of Good

Chapter 2 delves deeper into the question of "What makes a test 'good'?" Koskela is quick to point out that there is some subjectiveness to this ("some of the quality of test code is in the eye of the beholder" as is the case for any code) and that different contexts can affect whether a particular unit test is good or not.

Koskela discusses in this chapter how virtues of regular source code are often valued virtues of test code. For example, he discusses that readable test code is maintainable test code and that appropriate structure plays a big role in making tests understandable. Koskela devotes a section of Chapter 2 to a smaller but significant issue I've seen repeatedly in writing and maintaining my own and others' unit tests: unit test methods that advertise testing something they don't really test (perhaps because they are named poorly) can be very costly. I like what Koskela titles that section, "It's not good if it's testing the wrong things." This sounds obvious, but there is deep truth to that simple statement.

Another section of the second chapter focuses on the principle that, for good tests, "independent tests run easily in solitude." Koskela provides a list of dependencies (such as "randomness" and "persistence") that to him are code smells indicating that something might be wrong with the unit test code. I like his "litmus test for a project's test infrastructure" to satisfy the following scenario: "Can I check out a fresh copy from version control to a brand new computer I just unboxed, run a single command, lean back, and watch a full suite of automated tests run and pass?" This section does a nice job of covering why it's important that tests are independent and do not rely on being called in a certain order. In addition to pointing out some unit test dependency smells in this section, Koskela also provides some specific approaches that might be taken to address these.

One of the sections of the second chapter of Effective Unit Testing looks at why testing the wrong thing or even testing nothing at all ("happy tests") are problematic. There is coverage of why tests "need to be repeatable" along with references to Java-specific examples of tests that introduce things outside of the testing developer's control into the tests.
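To make the repeatability point concrete, here is a minimal sketch of my own (not an example from the book) showing one common way to keep the current date out of a test's hands: inject a clock. The TrialPolicy class and its method are hypothetical names invented for illustration, and the sketch assumes the java.time API, which is newer than the JDK most readers had when the book was written.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical production class: decides whether a trial period has expired.
class TrialPolicy {
    private final Clock clock;

    // The clock is injected so that a test can control what "today" means.
    TrialPolicy(Clock clock) {
        this.clock = clock;
    }

    boolean isExpired(LocalDate trialEnd) {
        return LocalDate.now(clock).isAfter(trialEnd);
    }
}

class RepeatableTimeExample {
    public static void main(String[] args) {
        // A fixed clock always reports 2013-04-12 (UTC), so the result
        // does not depend on the day the test happens to run.
        Clock fixed = Clock.fixed(Instant.parse("2013-04-12T00:00:00Z"), ZoneOffset.UTC);
        TrialPolicy policy = new TrialPolicy(fixed);

        System.out.println(policy.isExpired(LocalDate.of(2013, 4, 11))); // true
        System.out.println(policy.isExpired(LocalDate.of(2013, 4, 12))); // false
    }
}
```

Production code would pass Clock.systemDefaultZone() instead; the same idea applies to random number generators and other sources of values the test author does not control.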

The last section of Chapter 2 (not counting the "Summary") introduces test doubles, the subject of Chapter 3. Koskela defines test doubles as "an umbrella term for ... stubs, fakes, or mocks." He adds that "test doubles" are "objects that you substitute for the real implementation for testing purposes." Koskela groups test doubles with testing frameworks and build tools as his "top three tools of the trade for software developers writing automated tests."

Chapter 3: Test Doubles

The third chapter is the concluding chapter and my favorite chapter of Part 1 ("Foundations"). The chapter is devoted to coverage of "test doubles", a term and concept introduced in Gerard Meszaros's xUnit Test Patterns: Refactoring Test Code. Koskela outlines five reasons developers might use test doubles, including "the most fundamental of the reasons for employing a test double - to isolate the code you want to test from its surroundings." After listing these five reasons for use of test doubles, Koskela describes each of these motivating reasons in greater detail. He then describes each of the types of test doubles and compares their strengths and weaknesses. Koskela's section "Guidelines for Using Test Doubles" introduces his "logic and heuristics" for "picking the [test double] option that results in the most readable test". These include five considerations plus the simplifying rule: "stub queries; mock actions" (attributed to J.B. Rainsberger, author of JUnit Recipes).

The third chapter of Effective Unit Testing also touches on organizing unit tests with the Arrange-Act-Assert convention and likens this to behavior-driven development's Given-When-Then vocabulary. This chapter also demonstrates principles of the chapter with brief forays into JMock and Mockito code examples and a reference to J.B. Rainsberger's blog post JMock v. Mockito, but Not to the Death.
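As a rough illustration of "stub queries; mock actions" combined with the Arrange-Act-Assert layout, here is a small hand-rolled sketch of my own. It deliberately avoids JMock and Mockito so that it runs with nothing but the JDK; all of the types (PriceCatalog, ReceiptPrinter, Checkout, RecordingPrinter) are hypothetical and are not taken from the book.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical collaborator that answers a query (returns data, no side effect).
interface PriceCatalog {
    int priceInCents(String sku);
}

// Hypothetical collaborator that performs an action (a side effect).
interface ReceiptPrinter {
    void print(String line);
}

// Hypothetical class under test.
class Checkout {
    private final PriceCatalog catalog;
    private final ReceiptPrinter printer;

    Checkout(PriceCatalog catalog, ReceiptPrinter printer) {
        this.catalog = catalog;
        this.printer = printer;
    }

    void scan(String sku) {
        printer.print(sku + ": " + catalog.priceInCents(sku));
    }
}

// Hand-rolled recording double: remembers the calls it receives
// so that the test can verify the interaction afterwards.
class RecordingPrinter implements ReceiptPrinter {
    final List<String> printed = new ArrayList<>();

    @Override
    public void print(String line) {
        printed.add(line);
    }
}

class CheckoutTestSketch {
    static List<String> scanOneItem() {
        // Arrange (Given): stub the query with a canned answer, record the action.
        PriceCatalog stubCatalog = sku -> 250;
        RecordingPrinter printer = new RecordingPrinter();
        Checkout checkout = new Checkout(stubCatalog, printer);

        // Act (When)
        checkout.scan("tea");

        // Assert (Then): verify what was sent to the action collaborator.
        if (!printer.printed.equals(Collections.singletonList("tea: 250"))) {
            throw new AssertionError("unexpected receipt lines: " + printer.printed);
        }
        return printer.printed;
    }

    public static void main(String[] args) {
        System.out.println(scanOneItem()); // [tea: 250]
    }
}
```

Nothing is asserted about how often the catalog was asked for a price; only the action (printing) is verified, which is the point of the rule.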

Chapter 4: Readability

The fourth chapter of Effective Unit Testing is the first chapter of Part 2 ("Catalog"). As with all three chapters in Part 2, Chapter 4 looks at "test smells" that might indicate tests that are less effective than they could be. In this chapter's case, the test smells are those most closely associated with problems related to readability of unit tests.

Koskela starts Chapter 4 by articulating the difference between reading test code and running test code: "Reading the tests ... should provide the programmer with an understanding of what the code should do. Running those tests should tell the programmer what the code actually does."

The test smells that Koskela associates most closely with the "readability" portion of his Test Smells Catalog are:

  • Primitive Assertions
    • Assertion that "uses more primitive elements than the behavior it's checking"
    • Analogous to the primitive obsession code smell
  • Bitwise Assertions
    • Special case of Primitive Assertions that uses bitwise operators for "optimized test assertions" at the cost of readability and understandability
  • Hyperassertions
    • Assertion that "becomes brittle and hides its intent under its overwhelming breadth and depth"
  • Incidental Details
    • Incidental details make it difficult to identify the "intent, purpose, and meaning" of a unit test
  • Split Personality
  • Split Logic
    • Test code scattered over multiple files
  • Magic Numbers
    • Using numeric and String literals rather than using constants and variables with readable names
  • Setup Sermon
    • Too much code (often refactored from tests suffering incidental details smell) in the setup method
  • Overprotective Tests
    • Application of unnecessary/redundant guard or test assertions when condition would fail anyway

In each test smell case, Koskela provides examples of these test smells along with one or more ways (description and code examples) of addressing the smells.
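To give a feel for the first smell in the list above, here is a small sketch of my own of a Primitive Assertion and one way to raise it to the level of the behavior being checked. It is not the book's example; the class and method names are hypothetical, and plain Java exceptions stand in for JUnit and Hamcrest so that the code runs on its own.

```java
class ReadableAssertions {
    // Primitive assertion: the reader has to decode what "indexOf(...) != -1"
    // means, and a failure says nothing about the text that was checked.
    static void primitiveCheck(String output) {
        if (!(output.indexOf("2 passed") != -1)) {
            throw new AssertionError();
        }
    }

    // The same check expressed at the level of the behavior: the name states
    // the intent and the failure message shows both values involved.
    static void assertContains(String actual, String expected) {
        if (!actual.contains(expected)) {
            throw new AssertionError(
                    "expected <" + actual + "> to contain <" + expected + ">");
        }
    }

    public static void main(String[] args) {
        String output = "3 run, 2 passed";
        primitiveCheck(output);
        assertContains(output, "2 passed");
        System.out.println("both checks passed");
    }
}
```

With Hamcrest on the classpath, the second form would typically be written as assertThat(output, containsString("2 passed")), which reads similarly and produces a comparable failure message.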

Hamcrest is introduced in Chapter 4 as a partial solution to addressing test smells. There is also a unit test example that is built for testing JRuby source code.

In the fourth chapter, Koskela provides some memorable quotes. He articulates an opinion that I've long held regarding unit test code: "When weighing alternatives to expressing intent in test code, you should keep in mind that the nature and purpose of tests puts a higher value on readability and clarity than, say, code duplication or performance." He also writes, "A test that has never failed is of little value - it's probably not testing anything. On the other end of the spectrum, a test that always fails is a nuisance." Koskela also explains that "A test should have only one reason to fail" and explains that this is related to the Single Responsibility Principle.

Chapter 5: Maintainability

Chapter 5 continues the coverage of test smells, but moves from the focus of Chapter 4 on "readability" to focus instead on "maintainability." As he did in Chapter 4, Koskela enumerates several test smells most closely associated with test maintainability and uses code examples to demonstrate these smells and how to address them.

Chapter 5 focuses on the following "maintainability" test smells:

  • Duplication
    • Needless repetition that increases places where same change must be made and increases risk of not changing all necessary code
    • Duplication can be structural or semantic or both
  • Conditional Logic
    • "Conditional execution structures such as if, else, for, while, and switch" reduce the ability to use tests to "understand what the code does and what it should do"
  • Flaky Test
    • Tests that "fail intermittently," typically due to multithreading or race conditions
  • Crippling File Path
    • Absolute paths, especially hard-coded absolute paths, prevent unit tests from being run on others' machines
  • Persistent Temp Files
    • Files that are generated by unit tests may be less temporary than one realizes and interfere with later tests
  • Sleeping Snail
  • Pixel Perfection
    • Specialized version of Primitive Assertion and Magic Numbers test smells applied to exactly matching graphic representations in unit tests
  • Parameterized Mess
  • Lack of Cohesion in Methods
    • "Test methods in a test class are only interested in some of the fixture's objects"

As he did for the test smells covered in Chapter 4, Koskela provides code-based examples of each test smell discussed in Chapter 5 along with code-based examples of how to address each of the test smells.
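The Crippling File Path and Persistent Temp Files smells from the list above lend themselves to a short sketch. The following is my own illustration (not code from the book) of one way to avoid both: let the JDK pick a unique temporary directory and remove everything the test created, even if the test fails. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TempFileTestSketch {
    // Returns true if nothing the "test" created is left on disk afterwards.
    static boolean writeReadAndCleanUp() {
        try {
            // A unique directory under the platform's temp location:
            // no hard-coded absolute path, no clash with earlier runs.
            Path dir = Files.createTempDirectory("report-test");
            Path file = dir.resolve("report.txt");
            try {
                Files.write(file, "total=3".getBytes(StandardCharsets.UTF_8));
                String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                if (!content.equals("total=3")) {
                    throw new AssertionError("unexpected content: " + content);
                }
            } finally {
                // Clean up explicitly so the files cannot leak into later tests.
                Files.deleteIfExists(file);
                Files.deleteIfExists(dir);
            }
            return Files.notExists(file) && Files.notExists(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeReadAndCleanUp());
    }
}
```

In a JUnit 4 test, the TemporaryFolder rule can take over the creation and cleanup shown here.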

Chapter 6: Trustworthiness

Chapter 6 finishes off Part 2 and the Catalog of Test Smells. Chapter 6's focus is on test smells closely associated with the degree of reliability and trustworthiness of tests. The test smells covered in this chapter are:

  • Commented-out Tests
    • Commenting out tests' implementations so they appear to pass when they really aren't run at all ("poor man's version control")
    • Removing @Test annotation from JUnit 4-based unit test method has same negative effect
    • Use of @Ignore (JUnit) is similar
  • Misleading Comments
    • Confuse what the tests are really testing (a famous code smell as well as test smell)
  • Never-failing Tests
    • "A test that can never fail is probably worse than not having that test ... Tests are supposed to fail when they should."
  • Shallow Promises
    • A "test that does much less than what it says it does - or does nothing at all"
    • One type of "shallow promise" test smell (commenting out the body of the test method but not its signature and @Test annotation) misleads people to think test of something significant passed
    • Lack of assertions in a test method is another specific example of this test smell
    • Method name implying different functionality tested than what is actually tested is third example
  • Lowered Expectations
    • "Tests that are overly robust - they don't fail when they should," usually because "assertions are too vague"
  • Platform Prejudice
    • "Failure to treat all platforms equal"
  • Conditional Tests
    • "... conditional tests in our tests ... [are] bad in general"
    • "All branches in a test method should have a chance to fail"

Koskela uses code samples to illustrate the Chapter 6 test smells and what unit tests look like once the test smells are addressed with his recommendations. Chapter 6 wraps up Part 2 on the Catalog of Test Smells.
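The Never-failing Tests smell listed above is easy to write by accident, so here is a small sketch of my own (not the book's code) of the pattern I most often see: a test that expects an exception but never complains when the exception is missing. The parsePositive method is a hypothetical method under test, and plain Java stands in for JUnit so that the example runs by itself.

```java
class NeverFailingSketch {
    // Hypothetical method under test: should reject negative numbers.
    static int parsePositive(String text) {
        int value = Integer.parseInt(text);
        if (value < 0) {
            throw new IllegalArgumentException("negative: " + value);
        }
        return value;
    }

    // Smell: if no exception is thrown, execution simply falls off the end
    // of the method and the "test" passes anyway.
    static void neverFails(String input) {
        try {
            parsePositive(input);
        } catch (IllegalArgumentException expected) {
            // expected, nothing to do
        }
    }

    // Fix: fail explicitly when the expected exception does not occur.
    static void canFail(String input) {
        try {
            parsePositive(input);
        } catch (IllegalArgumentException expected) {
            return;
        }
        throw new AssertionError("expected IllegalArgumentException for " + input);
    }

    public static void main(String[] args) {
        neverFails("5"); // passes even though no exception was thrown
        try {
            canFail("5");
        } catch (AssertionError e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

In JUnit 4 the explicit failure is usually a call to fail() placed right after the statement that should throw, or the expectation is declared with @Test(expected = IllegalArgumentException.class).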

Chapter 7: Testable Design

Proponents of Test-Driven Development and even many proponents of any type of unit testing argue that unit testing is as much about design as it is about testing. As mentioned earlier, Koskela wrote about the Law of Two Plateaus in Chapter 1 and explained that significantly more benefit can be obtained from unit tests when they are used for more than quality assurance and are used in the design process. Chapter 7 returns to this idea and focuses on what it means to be a "testable design." Koskela defines testable design as "[making] it easy to instantiate classes, substitute implementations, simulate different scenarios, and invoke particular execution paths from our test code."

Koskela describes several design principles that lead to testable design: modular design, SOLID principles [Single Responsibility Principle, Open Closed Principle, Liskov Substitution Principle, Interface Segregation Principle, Dependency Inversion Principle]. Koskela discusses these principles of object-oriented design and how software that adheres to them is inherently more testable.

Koskela articulates an advantage of writing unit tests early that I've definitely observed: "The act of writing tests before the implementation they call for is essentially a way to ensure that you're taking the client's view on the code you're shaping."

Section 7.2 of Effective Unit Testing looks at several "testability issues" and then Section 7.3 addresses these testability issues with "guidelines for testable design." It is worth noting here that Section 7.2 and Section 7.3 (and really the entire book) make the assumption that these Java unit tests are run without use of reflection or bytecode manipulation. Using tools based on those techniques, such as PowerMock, would be one way to address some of these issues, but doing so adds complexity to the unit tests.

I expected Section 7.3 to be the most controversial part of the book for me. It's not that I expected the book to be controversial; rather, it is the topic that is controversial. It has always bothered me when I have to change my production code design for no other reason than to accommodate a unit test. In practical terms, I often cave in and do this because the negative effect on my production code design is less significant than the benefit of unit testing, but I still don't have to like doing it.

In Section 7.3, Koskela describes some of the commonly-held assertions about testable Java code, including avoiding complex private methods, avoiding final methods, avoiding static methods, avoiding logic in constructors, avoiding the singleton, favoring composition over inheritance, and wrapping external libraries. I started reading this section expecting to think, "Yeah, but ...." Instead, I found myself largely agreeing with Koskela's arguments based on my own experience. His pragmatic attitude made his assertions in this section agreeable and realistic. For example, he advises avoiding "complex private methods" rather than simply saying to avoid private methods altogether. I've always liked when good design principles are also good for testability and Koskela manages to couch most of these "testable design principles" as such.
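Here is a brief sketch of my own showing why a couple of these guidelines (avoiding the singleton and avoiding hard-wired collaborators) matter for testability. None of these classes come from the book; RateSource, LiveRates, HardToTestConverter, and Converter are hypothetical names chosen for illustration.

```java
// Hypothetical abstraction over something slow or external.
interface RateSource {
    double rateFor(String currency);
}

// Hypothetical singleton that would talk to a remote service in real life.
final class LiveRates implements RateSource {
    private static final LiveRates INSTANCE = new LiveRates();

    private LiveRates() {
    }

    static LiveRates getInstance() {
        return INSTANCE;
    }

    @Override
    public double rateFor(String currency) {
        throw new UnsupportedOperationException("would call a remote service");
    }
}

// Harder to test: the collaborator is fetched from the singleton,
// so a test has no seam through which to substitute it.
class HardToTestConverter {
    private final RateSource rates = LiveRates.getInstance();

    double toEuros(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

// Easier to test: the collaborator is handed in through the constructor.
class Converter {
    private final RateSource rates;

    Converter(RateSource rates) {
        this.rates = rates;
    }

    double toEuros(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

class ConverterSketch {
    public static void main(String[] args) {
        // The test substitutes a canned rate instead of the live service.
        Converter converter = new Converter(currency -> 0.5);
        System.out.println(converter.toEuros(10.0, "USD")); // 5.0
    }
}
```

The second version is also arguably the better production design, which fits the observation that most of these guidelines are good design principles in their own right.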

Chapter 8: Writing Tests in Other JVM Languages

I'm a big fan of Groovy (127 blog posts to date labeled Groovy) and am increasingly interested in Scala, so I looked forward to reading Chapter 8 on unit testing with alternative JVM languages. Koskela provides a brief history of languages other than Java on the JVM and covers some benefits common to these alternative languages.

Although Koskela mentions Scala, Clojure, JRuby, Jython and even non-JVM Ruby in Chapter 8, the lion's share of the chapter is on testing with Groovy and Groovy-based tools. In particular, the chapter covers using Groovy for testing directly as well as using Groovy-based BDD testing frameworks easyb and Spock Framework.

Chapter 9: Speeding Up Test Execution

There are numerous factors that can demotivate Java developers when it comes to unit testing. Perhaps none is more significant than slowly executing tests. If running the tests starts to take too long, the developer might lose interest in running them as often. As they are used (run) less, the developer may start to question the investment of time and energy in writing and maintaining them. Also, as they are run less, it becomes increasingly likely that problems the test would flag will go on longer before being caught.

Koskela covers ideas for improving unit test execution performance in the ninth and final chapter of Effective Unit Testing. He begins by looking at why it is important to have fast tests (both builds and test execution) and follows that with looking at strategies to make the builds and execution of tests quicker.

Chapter 9 includes brief coverage of using Ant and Maven for building unit tests and how to profile build performance with both of those tools. Until reading this chapter I was not aware of Ant's built-in ProfileLogger (since 1.8) or Ant-Contrib. This chapter also demonstrates how to use Koskela's maven-build-utils extension. Ant's JUnitReport task and Maven's maven-surefire-plugin are also demonstrated in this chapter.

Besides covering build and execution profiling tools and how to use them to identify the tests that really need to have their performance addressed, Koskela also provides several tactical approaches one can use to improve test execution efficiency. These are solid ideas that I don't consider premature optimization because they are, for the most part, simply good ideas in general that potentially improve performance without sacrificing readability or maintainability of the tests. Another thing I like about these tactical approaches is that many of them have been covered in a slightly different perspective earlier in the book. This ending chapter now brings those previously discussed principles back into the discussion for improving test performance while at the same time addressing test smells.

All of Koskela's tactical suggestions for better performing unit tests make sense to me, but I particularly liked his coverage on database access in unit tests because I so often see this violated at extreme cost. I like his emphasized statement: "friends don't let friends use a database in their unit tests." He explains (and I agree) that integration tests are more appropriate for testing of actual database access.
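One common way to keep a database out of unit tests is to hide persistence behind an interface and give the tests an in-memory fake. The sketch below is my own and is not taken from the book; UserRepository, InMemoryUserRepository, and Greeter are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical persistence abstraction; production code would back it with a database.
interface UserRepository {
    void save(String id, String name);

    Optional<String> findName(String id);
}

// In-memory fake: a working but simplified implementation that is fast
// and needs no database, schema, or network connection.
class InMemoryUserRepository implements UserRepository {
    private final Map<String, String> namesById = new HashMap<>();

    @Override
    public void save(String id, String name) {
        namesById.put(id, name);
    }

    @Override
    public Optional<String> findName(String id) {
        return Optional.ofNullable(namesById.get(id));
    }
}

// Hypothetical class under test: only its own logic is exercised.
class Greeter {
    private final UserRepository users;

    Greeter(UserRepository users) {
        this.users = users;
    }

    String greet(String id) {
        return users.findName(id)
                .map(name -> "Hello, " + name + "!")
                .orElse("Hello, stranger!");
    }
}

class GreeterSketch {
    public static void main(String[] args) {
        InMemoryUserRepository users = new InMemoryUserRepository();
        users.save("42", "Ada");
        Greeter greeter = new Greeter(users);

        System.out.println(greeter.greet("42")); // Hello, Ada!
        System.out.println(greeter.greet("7"));  // Hello, stranger!
    }
}
```

Whether the real, database-backed implementation maps rows correctly is then a question for an integration test rather than for the unit tests of Greeter.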

There is a table in Chapter 9 (Table 9-1) that summarizes approaches one might take to address the two primary external constraints on unit test execution performance (CPU and I/O). Koskela then moves on to detailing how to implement some of these mitigation approaches. Several tools are covered in this chapter including some Linux tools, but the two that were most interesting and new to me were Koskela's descriptions of how to use Amazon Web Services and GridGain to improve unit test building performance.

Appendices

Appendix A is a 7-page "JUnit Primer" that covers basics of JUnit with focus on assertions and use of Hamcrest matchers. Appendix B is a little over 6 pages on "Extending JUnit" with focus on runners and rules.

Test Smells

For me, the heart of Effective Unit Testing is Part 2 (plus Chapter 7 and Chapter 9 of Part 3 which I highlight next) that catalogs the test smells. This Part 2 is great at outlining issues I've run into with unit tests that can make unit testing more painful than it needs to be. It is also easy to see that there are trade-offs between the test smells such that the very approach that alleviates one test smell may increase risk of introducing another test smell. One example is the trade-off between duplication and readability in unit tests. The author talks about many of these trade-offs and provides rough guidelines of how to decide which way to lean without going too far in either direction. In the case of unit test readability at the cost of some duplication, I like stucampbell's take on this: "I'm more likely to refactor duplicated code for setting up state. But less likely to refactor the part of the test that actually exercises the code."

Testable Design and JUnit Performance

Like all three chapters in Part 2 on test smells, two of the chapters in Part 3 (Chapter 7 on testable design and Chapter 9 on unit test performance) are also chapters that I plan to re-read in the future. There is a lot packed into these chapters that directly addresses common unit testing issues and sparks ideas about other approaches that could be used to improve unit testing.

The Audience

As one would expect from a book with the title Effective Unit Testing, this book does indeed provide guidance on what separates effective unit tests from less effective unit tests. Not only does it cover practices one should use and should avoid, but it introduces terminology and cites well-known resources in the unit testing literature. As such, it is not only an appropriate book for someone with basic familiarity with unit testing who wants to improve their unit tests, but is also highly relevant to those new to unit testing in general and unit testing in Java in particular who want an overview of unit testing in Java. This book doesn't have enough details to be the only source of information for someone new to unit testing in Java, but does give the overall high-level view that can provide context for reading literature more specifically focused on frameworks and other unit testing tools.

Part 1 will be least interesting to those with significant experience with unit testing, but will be of high value to those who are new to unit testing or non-coding managers and leads who want an overview of why unit testing is important and what kinds of high-level things can make unit tests less effective or more effective. Although Part 1 was the least interesting to me, I still felt it was well-written and met the author's stated goal of providing a common context for the remainder of the book.

Part 2 and Part 3 are more technically detailed than Part 1. I like the overall descriptions of test smells and approaches to scrub these test smells in Part 2. I also really like the coverage in Part 3 of ideas for improving test build and execution performance and related to how to design code that is more testable.

What This Book is Not

As I have stated in this post and as the author reminds the reader of Effective Unit Testing, this book is not the book one should use to learn details of JUnit or other testing frameworks. There is just enough coverage of these to illustrate more general points, but the focus is on unit testing principles rather than on unit testing implementation libraries and frameworks. Although someone new to unit testing Java applications could find this book useful (particularly Part 1), some knowledge of JUnit would be beneficial (particularly for Part 2 and Part 3).

The Book Advantage

I have found that different mediums have different advantages when it comes to conveying information and learning. Although many of the principles found in this book are available online in various forms, the strength of the book is the author's organization and articulation of the ideas in a single coherent source. I think the book is well worth its price when I think of the time it would take to collect and organize these ideas from online sources. Furthermore, the author provides examples from his own and friends' experiences to add a "real life" feeling to it all. A well-written book's advantage over blog posts, forum threads, and the like is the ability to cohesively and coherently cover a topic with breadth and depth. This book does just that for the topic of unit testing in Java. To me this book may not offer a lot of new high-level concepts (although it does offer some new to me low-level details), but it articulates well the ideas and practices that seem to be emerging from collective experience writing unit tests for Java-based applications.

Breadth of References

Another thing I liked about Effective Unit Testing is the abundance of reference to sources with additional details on the ideas, concepts, tools, and frameworks referenced in the book. Some might argue that a downside of the book is that most of the concepts are not new. I actually argue the opposite: because these are practices for effective unit tests, one would expect them to be based on what more than one person has found to be effective through hard experience. When I am reading a book on "effective" anything, I'm not looking for something that is simply "new" or "different"; I am looking for things I should generally do and generally not do and why. By referencing others' work in unit testing as well as describing his own efforts in this area, Koskela increases the credibility of his book. I also liked the mixing of alternative languages, frameworks, and operating systems in the examples.

Breadth of Coverage

In 201 pages (not counting appendices, preface, etc.), Koskela articulates and demonstrates what has taken me years of experience to learn from writing and maintaining unit tests and from reading about unit testing. Even though the high-level concepts were not really new to me, I still learned several tactical approaches from this book. Besides learning some new tactics to employ to implement the concepts of unit testing in Java that I thought I already knew, this book has sparked additional ideas for improving my unit tests and has reinvigorated my interest in writing better unit tests.

Pragmatic Advice

I liked Koskela's pragmatism in this book. Some unit testing enthusiasts (or test-infected Java developers as he calls them in the book) can be overbearing in their enthusiasm and evangelism to the point where it's difficult to believe their claims. Koskela is obviously enthusiastic about unit testing, but seems to keep in mind one important truth: unit tests exist to benefit the quality of the design and code (the production code does not exist to benefit the unit tests). Koskela points out, for example, that there are cases where redundant unit test code might be easier to read and maintain than rigorously implementing DRY principles within test code at any cost.

Conclusion

I was a little apprehensive when I purchased Effective Unit Testing as part of Manning's MEAP, but I am glad that I did. This book delivered what I hoped for and I found Part 2 and Chapter 7 and Chapter 9 of Part 3 to be particularly useful for someone in my situation (relatively experienced Java developer looking for ideas to improve his or her unit testing). Although there wasn't a lot new to me at the highest level, there were a lot of interesting lower-level details that were new to me or were presented from a unique and interesting perspective. I also liked having these ideas I had from my own experiences laid out and articulated for me in printed form and I liked having these concepts being developed by the Java unit testing community all codified in a single book. The book packs a lot into 201 pages of regular text and the writing style is easy to read and understand. It is easy for me to recommend this book to Java developers who feel they have room to improve in writing of unit tests of Java applications. I also know many Java developers (including myself) who could benefit from reading this book.
