Don't Be Fooled By The Coverage Report
From NeoWiki
Revision as of 14:43, 5 March 2007
- Are your test coverage measurements leading you astray?
Andrew Glover, President, Stelligent Incorporated
31 Jan 2006
- Test coverage tools bring valuable depth to unit testing, but they're often misused. This month, Andrew Glover brings his considerable expertise in this area to his new series, In pursuit of code quality. This first installment takes a closer look at what the numbers on the coverage report really mean, as well as what they don't. He then suggests three ways you can use your coverage to ensure code quality early and often.
Do you remember what it was like before most developers jumped on the code quality bandwagon? In those days, a skillfully placed main() method was considered both agile and adequate for testing. Yikes! We've come a long way since then. I, for one, am immensely grateful that automated testing is now an essential facet of quality-centric code development. And that's not all I'm grateful for. Java™ developers have a plethora of tools for gauging code quality through code metrics, static analysis, and more. Heck, we've even managed to categorize refactoring into a handy set of patterns!
All these new tools make it easier than ever to ensure code quality, but you do have to know how to use them. In this series of articles, I'll focus on the sometimes arcane details of ensuring code quality. In addition to familiarizing you with the variety of tools and techniques available for code quality assurance, I'll show you how to:
- Define and effectively measure the aspects of your code that most impact quality.
- Set quality assurance goals and plan your development efforts accordingly.
- Decide which code quality tools and techniques actually meet your needs.
- Implement best practices (and weed out poor ones) so that ensuring code quality early and often becomes a painless and effective aspect of your development practice.
I'll start this month with a look at one of the most popular and easy additions to a Java developer's quality assurance toolkit: test coverage measurement.
Beware fool's gold
It's the morning following a nightly build and everyone's standing around the water cooler. Developers and management alike are exchanging bold NFL-style pats on the backside when they learn that a few particularly well-tested classes have coverage rates in the high 90s! The collective confidence of the team is at an all-time high. "Refactor with abandon!" can be heard in the distance as defects become a distant memory and the responsibility of the weak and inferior. But there is one small, dissenting voice that says:
Ladies and Gentlemen: Don't be fooled by the coverage report.
Now, don't get me wrong: There's nothing foolish about using test coverage tools. They're a great addition to the unit testing paradigm. What's important is how you synthesize the information once you've got it, and that's where some development teams make their first mistake.
High coverage rates simply mean that a lot of code was exercised. High coverage rates do not imply that code was exercised well. If you're focusing on code quality, you need to understand exactly how test coverage tools work, as well as how they don't; then you'll know how to use these tools to obtain valuable information, rather than just settling for high coverage goals, as many developers do.
Test coverage measurement
Test coverage tools are generally easy to add into an established unit testing process, and the results can be reassuring. Simply download one of the available tools, slightly modify your Ant or Maven build script, and you and your colleagues have a new kind of report to talk about around the water cooler: The Test Coverage Report. It can be a big comfort when packages like foo and bar show astonishingly high coverage, and it's tempting to rest easy when you believe that at least a portion of your code is certifiably "bug free." But to do so would be a mistake.
There are different types of coverage measurements, but most tools focus on line coverage, also known as statement coverage. In addition, some tools report branch coverage. A test coverage measurement is obtained by exercising a code base with a test harness and capturing data that corresponds to code having been "touched" throughout the lifetime of the test process. The data is then synthesized to produce a coverage report. In Java shops, the test harness is commonly JUnit and the coverage tool is usually something like Cobertura, Emma, or Clover, to name a few.
Line coverage simply indicates that a particular line of code was exercised. If a method is 10 lines long and 8 lines of the method were exercised in a test run, then the method has a line coverage of 80%. The process works at the aggregate level as well: If a class has 100 lines and 45 of them were touched, then the class has a line coverage of 45%. Likewise, if a code base comprises 10,000 non-commenting lines of code and 3,500 of them were exercised on a particular test run, then the code base's line coverage is 35%.
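To make the arithmetic concrete, here is a minimal sketch of how a line-coverage number could be produced by hand. The LineHits recorder and the Abs example are hypothetical names invented for this illustration; real tools such as Cobertura instrument the compiled code rather than asking you to add calls yourself, so treat this as a model of the idea and not of any tool's implementation.

```java
import java.util.BitSet;

/** Hypothetical hand-rolled recorder; an assumption for illustration only. */
class LineHits {
    private final BitSet touched = new BitSet();
    private final int executableLines;

    LineHits(int executableLines) {
        this.executableLines = executableLines;
    }

    /** Called by the "instrumented" code each time a numbered line runs. */
    void touch(int line) {
        touched.set(line);
    }

    /** Line coverage = distinct touched lines / executable lines, as a percentage. */
    double lineCoverage() {
        return 100.0 * touched.cardinality() / executableLines;
    }
}

class Abs {
    /** A method with three executable lines, numbered 1 to 3 for the recorder. */
    static int abs(int n, LineHits hits) {
        hits.touch(1);          // line 1: the if test
        if (n < 0) {
            hits.touch(2);      // line 2: the negative case
            return -n;
        }
        hits.touch(3);          // line 3: the non-negative case
        return n;
    }
}
```

Calling Abs.abs(5, hits) alone touches two of the three lines, so lineCoverage() reports roughly 66.7%; adding a call with a negative argument touches the remaining line and brings the figure to 100%. Note that the number only records that a line ran, not whether its result was checked.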
Tools that report branch coverage attempt to measure the coverage of decision points, such as conditional blocks containing logical ANDs or ORs. Just like line coverage, if there are two branches in a particular method and both were covered through tests, then you could say the method has 100% branch coverage.
The question is, how useful are these measurements? Clearly, all of this information is easy to obtain, but it's up to you to be discerning about how you synthesize it. Some examples clarify my point.
Code coverage in action
I've created a simple class in Listing 1 to embody the notion of a class hierarchy. A given class can have a succession of superclasses -- like Vector, whose parent is AbstractList, whose parent is AbstractCollection, whose parent is Object:
Listing 1. A class that represents a class hierarchy
 package com.vanward.adana.hierarchy;

 import java.util.ArrayList;
 import java.util.Collection;
 import java.util.Iterator;

 public class Hierarchy {
     private Collection classes;
     private Class baseClass;

     public Hierarchy() {
         super();
         this.classes = new ArrayList();
     }

     public void addClass(final Class clzz){
         this.classes.add(clzz);
     }

     /**
      * @return an array of class names as Strings
      */
     public String[] getHierarchyClassNames(){
         final String[] names = new String[this.classes.size()];
         int x = 0;
         for(Iterator iter = this.classes.iterator(); iter.hasNext();){
             Class clzz = (Class)iter.next();
             names[x++] = clzz.getName();
         }
         return names;
     }

     public Class getBaseClass() {
         return baseClass;
     }

     public void setBaseClass(final Class baseClass) {
         this.baseClass = baseClass;
     }
 }
As you can see, Listing 1's Hierarchy class holds a baseClass instance and its collection of superclasses. The HierarchyBuilder in Listing 2 creates the Hierarchy class through two overloaded static methods dubbed buildHierarchy:
Listing 2. A class hierarchy builder
 package com.vanward.adana.hierarchy;

 public class HierarchyBuilder {

     private HierarchyBuilder() {
         super();
     }

     public static Hierarchy buildHierarchy(final String clzzName)
             throws ClassNotFoundException{
         final Class clzz = Class.forName(clzzName, false,
                 HierarchyBuilder.class.getClassLoader());
         return buildHierarchy(clzz);
     }

     public static Hierarchy buildHierarchy(Class clzz){
         if(clzz == null){
             throw new RuntimeException("Class parameter can not be null");
         }

         final Hierarchy hier = new Hierarchy();
         hier.setBaseClass(clzz);

         final Class superclass = clzz.getSuperclass();

         if(superclass != null && superclass.getName().equals("java.lang.Object")){
             return hier;
         }else{
             while((clzz.getSuperclass() != null) &&
                     (!clzz.getSuperclass().getName().equals("java.lang.Object"))){
                 clzz = clzz.getSuperclass();
                 hier.addClass(clzz);
             }
             return hier;
         }
     }
 }
It's testing time!
What would an article about test coverage be without a test case? In Listing 3, I define a simple, sunny-day scenario JUnit test class with three test cases, which attempt to exercise both the Hierarchy and HierarchyBuilder classes:
Listing 3. Test that HierarchyBuilder!
 package test.com.vanward.adana.hierarchy;

 import com.vanward.adana.hierarchy.Hierarchy;
 import com.vanward.adana.hierarchy.HierarchyBuilder;
 import junit.framework.TestCase;

 public class HierarchyBuilderTest extends TestCase {

     public void testBuildHierarchyValueNotNull() {
         Hierarchy hier = HierarchyBuilder.buildHierarchy(HierarchyBuilderTest.class);
         assertNotNull("object was null", hier);
     }

     public void testBuildHierarchyName() {
         Hierarchy hier = HierarchyBuilder.buildHierarchy(HierarchyBuilderTest.class);
         assertEquals("should be junit.framework.Assert",
             "junit.framework.Assert",
             hier.getHierarchyClassNames()[1]);
     }

     public void testBuildHierarchyNameAgain() {
         Hierarchy hier = HierarchyBuilder.buildHierarchy(HierarchyBuilderTest.class);
         assertEquals("should be junit.framework.TestCase",
             "junit.framework.TestCase",
             hier.getHierarchyClassNames()[0]);
     }
 }
Because I'm an avid tester, I naturally want to run some coverage tests. Of the code coverage tools available to Java developers, I tend to stick with Cobertura because I like its friendly reports. Also, Cobertura is an open source project, which forked the pioneering JCoverage project.
The Cobertura report
Running a tool like Cobertura is as simple as running your JUnit tests, only with a middle step of instrumenting the code under test with specialized logic to report on coverage (which is all handled via the tool's Ant tasks or Maven's goals).
As you can see in Figure 1, the coverage report for HierarchyBuilder illustrates a few sections of code that weren't exercised. In fact, Cobertura claims that HierarchyBuilder has a 59% line coverage rate and a 75% branch coverage rate.
Figure 1. The Cobertura report
So, my first shot at coverage testing failed to test a number of things. First, the buildHierarchy() method, which takes a String as a parameter, wasn't tested at all. Second, two conditions in the other buildHierarchy() method weren't exercised either. Interestingly, it's the second unexercised if block that is a concern.
I'm not worried at this point because all I have to do is add a few more test cases. Once I've reached those areas of concern, I should be good to go. Note my logic here: I used the coverage report to understand what wasn't tested. Now I have the option to use this data to either enhance my testing or move on. In this case, I'm going to enhance my testing, because I've left a few important areas uncovered.
Cobertura: Round 2
Listing 4 is an updated JUnit test case that adds a handful of additional test cases in an attempt to fully exercise HierarchyBuilder:
Listing 4. An updated JUnit test case
 package test.com.vanward.adana.hierarchy;

 import com.vanward.adana.hierarchy.Hierarchy;
 import com.vanward.adana.hierarchy.HierarchyBuilder;
 import junit.framework.TestCase;

 public class HierarchyBuilderTest extends TestCase {

     public void testBuildHierarchyValueNotNull() {
         Hierarchy hier = HierarchyBuilder.buildHierarchy(HierarchyBuilderTest.class);
         assertNotNull("object was null", hier);
     }

     public void testBuildHierarchyName() {
         Hierarchy hier = HierarchyBuilder.buildHierarchy(HierarchyBuilderTest.class);
         assertEquals("should be junit.framework.Assert",
             "junit.framework.Assert",
             hier.getHierarchyClassNames()[1]);
     }

     public void testBuildHierarchyNameAgain() {
         Hierarchy hier = HierarchyBuilder.buildHierarchy(HierarchyBuilderTest.class);
         assertEquals("should be junit.framework.TestCase",
             "junit.framework.TestCase",
             hier.getHierarchyClassNames()[0]);
     }

     public void testBuildHierarchySize() {
         Hierarchy hier = HierarchyBuilder.buildHierarchy(HierarchyBuilderTest.class);
         assertEquals("should be 2", 2, hier.getHierarchyClassNames().length);
     }

     public void testBuildHierarchyStrNotNull() throws Exception{
         Hierarchy hier = HierarchyBuilder.
             buildHierarchy("test.com.vanward.adana.hierarchy.HierarchyBuilderTest");
         assertNotNull("object was null", hier);
     }

     public void testBuildHierarchyStrName() throws Exception{
         Hierarchy hier = HierarchyBuilder.
             buildHierarchy("test.com.vanward.adana.hierarchy.HierarchyBuilderTest");
         assertEquals("should be junit.framework.Assert",
             "junit.framework.Assert",
             hier.getHierarchyClassNames()[1]);
     }

     public void testBuildHierarchyStrNameAgain() throws Exception{
         Hierarchy hier = HierarchyBuilder.
             buildHierarchy("test.com.vanward.adana.hierarchy.HierarchyBuilderTest");
         assertEquals("should be junit.framework.TestCase",
             "junit.framework.TestCase",
             hier.getHierarchyClassNames()[0]);
     }

     public void testBuildHierarchyStrSize() throws Exception{
         Hierarchy hier = HierarchyBuilder.
             buildHierarchy("test.com.vanward.adana.hierarchy.HierarchyBuilderTest");
         assertEquals("should be 2", 2, hier.getHierarchyClassNames().length);
     }

     public void testBuildHierarchyWithNull() {
         try{
             Class clzz = null;
             HierarchyBuilder.buildHierarchy(clzz);
             fail("RuntimeException not thrown");
         }catch(RuntimeException e){}
     }
 }
When I run the test coverage process again with the new test cases, I get a much more complete report, as shown in Figure 2. I've now covered the untested buildHierarchy() method as well as hitting both if blocks in the other buildHierarchy() method. HierarchyBuilder's constructor is private, however, so I can't test it via my test class (nor do I care to); therefore, my line coverage still hovers at 88%.
Figure 2. Who says there are no second chances?
As you can see, using a code coverage tool can uncover important code that doesn't have a corresponding test case. The important thing is to exercise caution when viewing the reports (especially ones with high values), for they can hide nefarious subtleties. Let's look at a couple more examples of code issues that can hide behind high coverage rates.
The trouble with conditionals
As you may well know by now, many variables found in code can have more than one state; furthermore, the presence of conditionals creates multiple paths of execution. With these caveats in mind, I've defined an absurdly simple class with one method in Listing 5:
Listing 5. Do you see the defect below?
 package com.vanward.coverage.example01;

 public class PathCoverage {

     public String pathExample(boolean condition){
         String value = null;
         if(condition){
             value = " " + condition + " ";
         }
         return value.trim();
     }
 }
Listing 5 has an insidious defect in it -- do you see it? If not, no worries: I'll just write a test case to exercise the pathExample() method and ensure it works correctly in Listing 6:
Listing 6. JUnit to the rescue!
 package test.com.vanward.coverage.example01;

 import junit.framework.TestCase;
 import com.vanward.coverage.example01.PathCoverage;

 public class PathCoverageTest extends TestCase {

     public final void testPathExample() {
         PathCoverage clzzUnderTst = new PathCoverage();
         String value = clzzUnderTst.pathExample(true);
         assertEquals("should be true", "true", value);
     }
 }
My test case runs flawlessly and my handy-dandy code coverage report (below in Figure 3) makes me look like a superstar, with 100% test coverage!
Figure 3. Rock star coverage, baby!
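I'm thinking it's time to go hang out by the water cooler, but wait -- didn't I suspect a defect in that code? Closer examination of Listing 5 shows that the final statement, return value.trim(), will indeed throw a NullPointerException if condition is false, because value is never assigned on that path. My single test passed true, and since the if has no else, that one call touched every line. Yeesh, what happened here?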
It turns out that line coverage isn't such a great indicator of test effectiveness.
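The missing case is easy to demonstrate. The sketch below copies Listing 5's class (made package-private so everything fits in one file) and adds a hypothetical FalsePathCheck helper, a name invented here, that stands in for a second JUnit test method. It uses plain Java rather than JUnit only to stay self-contained.

```java
/** Copy of Listing 5's class, package-private so the sketch compiles as one file. */
class PathCoverage {
    String pathExample(boolean condition) {
        String value = null;
        if (condition) {
            value = " " + condition + " ";
        }
        return value.trim(); // value is still null when condition is false
    }
}

/** Hypothetical helper standing in for a second JUnit test method. */
class FalsePathCheck {
    /** Returns true if the path the original test skipped fails with a NullPointerException. */
    static boolean falsePathThrowsNpe() {
        try {
            new PathCoverage().pathExample(false);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }
}
```

Here falsePathThrowsNpe() returns true, while pathExample(true) still returns "true" as Listing 6 asserts. The line coverage figure is identical with or without this second check, which is exactly why the number alone says so little.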
The horror of paths
In Listing 7, I've defined another simple example with an indirect, yet flagrant defect. Note the second half of the if conditional found in the branchIt() method. (The HiddenObject class is defined in Listing 8.)
Listing 7. This code is simple enough
 package com.vanward.coverage.example02;

 import com.acme.someotherpackage.HiddenObject;

 public class AnotherBranchCoverage {

     public void branchIt(int value){
         if((value > 100) || (HiddenObject.doWork() == 0)){
             this.dontDoIt();
         }else{
             this.doIt();
         }
     }

     private void dontDoIt(){
         // don't do something...
     }

     private void doIt(){
         // do something!
     }
 }
Yikes! The HiddenObject in Listing 8 is evil. Calling the doWork() method as I did in Listing 7 yields a RuntimeException:
Listing 8. Uh oh!
 package com.acme.someotherpackage;

 public class HiddenObject {

     public static int doWork(){
         // return 1;
         throw new RuntimeException("surprise!");
     }
 }
But surely I can catch the exception with a nifty test! In Listing 9, I've written another sunny-day test in an attempt to win my way back to rock stardom:
Listing 9. Risk avoidance with JUnit
 package test.com.vanward.coverage.example02;

 import junit.framework.TestCase;
 import com.vanward.coverage.example02.AnotherBranchCoverage;

 public class AnotherBranchCoverageTest extends TestCase {

     public final void testBranchIt() {
         AnotherBranchCoverage clzzUnderTst = new AnotherBranchCoverage();
         clzzUnderTst.branchIt(101);
     }
 }
What do you think of this test case? You probably would have written a few more test cases than I did, but imagine if that dubious conditional in Listing 7 had more than one short-circuit operation. Imagine if the logic in the first half was a bit more cerebral than a simple int comparison -- how many test cases would you write before you were satisfied?
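One way to reason about the question is to ask which inputs force each operand of the || to be evaluated. The sketch below is an assumption-laden variant of Listings 7 and 8: the classes are package-private, branchIt() returns the name of the branch it took instead of calling private methods, and ShortCircuitCheck is a hypothetical helper invented for this illustration.

```java
/** Stand-in for Listing 8's HiddenObject, which always throws. */
class HiddenObject {
    static int doWork() {
        throw new RuntimeException("surprise!");
    }
}

/** Variant of Listing 7 that reports which branch ran so a caller can check it. */
class AnotherBranchCoverage {
    String branchIt(int value) {
        // The right-hand operand is evaluated only when value > 100 is false.
        if ((value > 100) || (HiddenObject.doWork() == 0)) {
            return "dontDoIt";
        } else {
            return "doIt";
        }
    }
}

/** Hypothetical helper that turns each call into a comparable outcome string. */
class ShortCircuitCheck {
    /** Returns the branch name, or a description of the exception that escaped. */
    static String outcome(int value) {
        try {
            return new AnotherBranchCoverage().branchIt(value);
        } catch (RuntimeException e) {
            return "threw: " + e.getMessage();
        }
    }
}
```

With 101, the first operand is true and doWork() is never called, so outcome(101) is "dontDoIt". With 100, the first operand is false, the second operand runs, and outcome(100) is "threw: surprise!". A test suite needs at least one input of the second kind before the conditional can be considered exercised.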
Just give me the numbers
The results of my test coverage analysis of Listings 7, 8, and 9 shouldn't be any surprise to you by now. The report in Figure 4 shows that I've achieved 75% line coverage and 100% branch coverage. Most important, I've exercised line 10!
Figure 4. The fool's reward
Boy, am I proud, at least on first reflection. But do you see what's misleading about this report? A cursory look could lead you to believe the code was well tested. Based on that, you would probably assume the risk of a defect to be quite low. The report does little to nothing to help you ascertain that the second half of the or short-circuit is a ticking time bomb!
Testing for quality
About the author
Andrew Glover is president of Stelligent Incorporated, a JNetDirect company. Stelligent Incorporated helps companies address software quality early with effective developer testing strategies and frameworks, software metric analysis and correlations, and continuous integration techniques that enable development teams and management to monitor code quality continuously. He is the co-author of Java Testing Patterns (Wiley, September 2004).