The Seven Laws of Developer Testing

The case for practicing developer testing is very compelling but, as I have already mentioned, starting and running a successful developer testing program is easier said than done. However, it can be done.

In the past three years, I have had the opportunity to observe and talk to dozens of software development organizations and individuals about their experiences with developer testing. On top of that, I have personally started and managed developer testing programs during my tenure at Google as Director of Engineering and, of course, at Agitar as VP of Engineering. From all these experiences, I have distilled seven basic laws that must be followed in order to have a successful, efficient and effective developer testing program. These are:

1. Law of Management Commitment;
2. Law of Team Buy-In;
3. Law of Metrics;
4. Law of Targets;
5. Law of Training and Coaching;
6. Law of Automation; and
7. Law of Failure.

Let us go through each of these laws in detail.

Law of Management Commitment

A successful developer testing practice requires initial and ongoing management support and commitment.

The practice of developer testing requires an initial dose of courage, conviction and investment to get started, and an ongoing dose of the same to keep it going successfully. I have seen developer testing programs succeed without the team’s management as the driving force, but I have not seen any of them succeed without at least a strong dose of management support. If management does not believe in the benefits of developer testing, they are not going to be very supportive of engineers wasting time writing tests when they should be coding more features. Management commitment to developer testing should be evident through the following:

· Clear, visible and tangible support of the developer testing effort: This can take many forms, such as having a kick-off meeting, working with the team to select the proper metrics and objectives for the effort, reviewing and publicizing the team’s progress against those targets, and celebrating with awards when major milestones are reached. The important thing is to let the entire team know that the organization is serious about developer testing and committed to making it work; and
· Willingness, courage and conviction to invest time and resources to ensure that the effort succeeds: Time invested in a well-planned developer testing effort will pay handsome dividends, but at the beginning it requires some faith and courage to take time away from the development schedule to train the team, set up the developer testing framework and go through the initial learning curve.

Management commitment is essential but not sufficient. The rest of the team must buy in, which leads us to the second law.

Law of Team Buy-In

A successful developer testing practice requires full team buy-in.

Even if the team’s management is 100% sold on the idea of developer testing, and willing to commit to the necessary investment, training and so on, the program will not succeed unless the rest of the team, or at least most of the team (and especially the thought leaders), is on board.

This law is obvious and does not require much explanation. However, one of the most common and major challenges is having the team’s management and leaders show commitment to developer testing while many developers are skeptical or downright hostile to the idea. One of my favorite quotes is: “As developers, we successfully avoided having to do testing for decades. I don’t see why we should start now.” A major culprit for this lack of motivation is that, historically, poor code quality has not affected developers as much as managers; I have never seen developers fired for poor quality software or a late delivery, but I have seen their managers fired for those reasons.

This is one situation where the management courage, conviction and commitment from the previous law come in. I don’t believe a developer testing program can be called truly successful if some developers are exempted just because they don’t like the idea of having to write unit tests.

The first step in dealing with developers who are skeptical of, or hostile to, the preposterous idea of developer testing is to ask them to give it a try for a couple of weeks, or on a specific small or medium development project. If a developer accepts the task and acts upon it in good faith and with an open mind, they will most likely see the benefits and, perhaps still with some reluctance, agree that it is the right thing to do going forward. When I started the developer testing programs at my current and previous companies, I was certain that a couple of developers on each team would stubbornly oppose it and find every possible excuse and rationalization to be excused or avoid the testing in one way or another. Since in each of these cases the developers were top contributors and valuable team members, I was not looking forward to the prospect of losing them over this. Much to my surprise, after some initial grumbling and resistance, each of those developers became top contributors in terms of unit test quality and quantity. As it turns out, the qualities and skills that make a person a great software developer can, with some encouragement, make them great at developer testing also.

Law of Metrics

A successful developer testing practice requires a carefully chosen set of metrics.

If you cannot measure, you cannot manage.
If you cannot manage, you cannot improve.

As we have seen, a developer testing program requires initial as well as ongoing investment. Since developer time is a very valuable resource, we need to direct and manage the effort to ensure that the investment is focused on the right objectives and is achieving the desired results. The best way to achieve and maintain the right focus is to select and apply an appropriate set of metrics.

Identifying the right metrics for developer testing can be tricky. All metrics have weaknesses and none of them is perfect. No major decisions should be based purely on metrics data without taking into account common sense or circumstances that might be hard to quantify. Having said that, I have identified a set of developer testing metrics that are relatively simple, are ideal for the initial phase of a developer testing practice and have proven very effective for us and for other teams that have adopted them. You can use these simple metrics to get going and add to, or refine, them over time to meet your specific requirements. The three metrics I recommend for getting started are:
· Total Test Points: Test points are scored by producing test artifacts (e.g. if you are using Java and JUnit, you can score a test point for each test class in your suite or, even better, for each test assertion in a test class). Total Test Points is a metric designed to grow quickly and continuously, giving your team an opportunity to see progress on a daily basis, feel good about it and celebrate milestones (e.g. we celebrated reaching our 10,000 test points milestone on Agitator by taking the entire team out for a nice lunch);
· Percentage of Classes with Test Points: Total test points measure the volume of tests that you have. This metric measures the distribution and breadth of those tests; and
· Percentage of Code Covered by Tests: Code coverage is the traditional metric for evaluating tests by measuring the lines of code (and/or branches, and/or conditions, etc.) that were executed by running the tests. Like all metrics, it’s not perfect; but as long as you are careful not to confuse code execution with testing (i.e. it’s possible to execute code without actually testing that it does the right thing), it’s a very useful measurement, especially if combined with the previous two.

Now that you have some metrics, you are ready to set some targets for the team, but setting the right developer testing targets is important enough and tricky enough that it deserves its own law.
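
To make the code coverage caveat in the metrics list concrete, here is a minimal sketch of the difference between executing code and testing it. It is written in Python purely for brevity (the same point applies to JUnit), and every name in it is hypothetical: two versions of a small function, one correct and one buggy, and two checks that both execute the function but only one of which verifies the result.

```python
def discount_ok(price, percent):
    """Hypothetical code under test: price after a percentage discount."""
    return price * (100 - percent) / 100


def discount_buggy(price, percent):
    """Same function with a bug: returns the discount, not the new price."""
    return price * percent / 100


def check_without_assertion(discount):
    # Executes the code (and so earns coverage) but verifies nothing.
    discount(200, 25)


def check_with_assertion(discount):
    # Executes the same code and also checks the result.
    assert discount(200, 25) == 150


def passes(check, discount):
    """Return True if the check runs without raising AssertionError."""
    try:
        check(discount)
        return True
    except AssertionError:
        return False
```

Both checks give the function the same coverage, yet passes(check_without_assertion, discount_buggy) is True while passes(check_with_assertion, discount_buggy) is False. This is one reason to count test points per assertion rather than per test class.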

Law of Targets

A successful developer testing practice has to have long-term objectives as well as frequently updated short-term targets.

The long-term objective of a developer testing program should be to make the creation and execution of unit tests by developers a routine and implicit part of development. Test creation should go hand-in-hand with code creation (code a little, test a little or the other way around), and test execution should happen as frequently as compilation (ideally, each compile should be automatically followed by a run of the unit tests). In an organization where the practice of developer testing has fully evolved and matured, the act of writing and executing tests should be so commonplace that questions such as “Did you write and run the unit tests for this code?” are as unnecessary as asking “Did you compile this code?” The existence of tests to accompany each unit of code should be taken for granted and the exception should be code that does not have unit tests rather than the other way around.

This is Developer Testing Utopia and it’s a great long-term objective. Unfortunately, you cannot reach this state overnight; it may take a year or more, depending on the history and legacies of your team and project, to even get close to it. Therefore, having a set of objective and achievable intermediate targets is critical to ensure consistent progress and provide feedback and encouragement.

When it comes to setting targets for a fledgling developer testing practice, the biggest danger is being too ambitious too soon. I recommend using the metrics described above and starting slowly, to give the team time to learn the ropes of developer testing and, more importantly, gain appreciation for its benefits.

Table 1 below provides some reasonable sample targets for a medium-sized team/project. Notice the easy start in the first quarter and the significant increase in each of the metrics in each successive quarter. These targets take into account that:
1. Scoring test points and getting good code coverage gets considerably easier/faster with experience;
2. The last 10-20% of code coverage is usually harder to achieve; and 
3. There may be some legitimate reasons for not testing 100% of your code.

You should change these targets based on your situation and not hesitate to change them again as you gain experience.

Table 1: Developer Testing Targets

Metric                                 Q1      Q2      Q3      Q4
Total test points                   1,000   4,000  10,000  20,000
% of classes w/ test points            5%     20%     50%     90%
% of code covered by unit tests        5%     20%     40%     80%
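
As a rough illustration of how the three metrics and the quarterly targets fit together, the sketch below computes the metrics from a per-class count of test assertions and reports where a team falls short of a given quarter. The target values are copied from Table 1; the class, function and field names are assumptions made for this example, and the code coverage percentage is assumed to come from whatever coverage tool the team already uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    test_points: int               # total test points (here: assertions)
    pct_classes_with_tests: float  # % of classes with at least one test point
    pct_code_covered: float        # % of code covered by unit tests


# Sample quarterly targets taken from Table 1.
TARGETS = {
    "Q1": Metrics(1_000, 5, 5),
    "Q2": Metrics(4_000, 20, 20),
    "Q3": Metrics(10_000, 50, 40),
    "Q4": Metrics(20_000, 90, 80),
}


def compute_metrics(assertions_per_class, pct_code_covered):
    """Build Metrics from a {class name: number of test assertions} mapping."""
    total = sum(assertions_per_class.values())
    tested = sum(1 for count in assertions_per_class.values() if count > 0)
    classes = len(assertions_per_class)
    pct_classes = 100 * tested / classes if classes else 0.0
    return Metrics(total, pct_classes, pct_code_covered)


def shortfalls(current, quarter):
    """Return {metric name: amount still missing} for the given quarter."""
    target = TARGETS[quarter]
    gaps = {}
    for name in ("test_points", "pct_classes_with_tests", "pct_code_covered"):
        missing = getattr(target, name) - getattr(current, name)
        if missing > 0:
            gaps[name] = missing
    return gaps
```

For example, a team with 1,200 test points spread over two of its four classes and 12% code coverage meets every Q1 target, while shortfalls(current, "Q2") shows it still needs 2,800 test points and 8 percentage points of coverage.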

Law of Training and Coaching

A successful developer testing practice requires initial training and ongoing coaching.

It’s unfortunate, but today most computer science education programs don’t include software testing in the curriculum. A few developers, driven by internal motivation or naturally inclined towards testing, can be very creative and effective at thinking of, and writing, unit tests; but the majority needs initial direction and ongoing coaching until they achieve a basic understanding of the core principles and good proficiency with the basic skills.

The good news is that most developers are quite smart and quick learners. If you have the budget, it’s hard to beat hiring a developer testing coach or trainer for a week or so, but you can take the do-it-yourself approach and use one of the many books now available on the subject of unit testing and test-driven development (coupled with the many resources available online) and groom one or two in-house gurus who can then train the rest of the team.

In either case, it’s a good idea to eventually develop a couple of in-house experts on developer testing and the associated tools and technology, because it’s unlikely that the initial training will cover all bases. Also, it is important to ensure that there is someone who can answer questions and provide high-level oversight as the practice grows and evolves. Fortunately, my experience is that in a group of 10 or 20 developers there are always a couple of people who are (or will become) test infected and will be more than happy to serve as the in-house developer testing gurus and evangelists.

Law of Automation

A successful developer testing practice must take advantage of automation in test creation, execution and reporting.

Many tasks associated with the developer testing cycle are combinatorial and repetitive in nature. In order to be considered adequate, for example, unit tests should cover a wide range of inputs and input conditions to ensure that all the possible code behaviors are exercised, but creating all the necessary inputs and accompanying test code by hand can be very tedious and inefficient. As a result, most manually written developer tests fall short in terms of coverage, tend to concentrate on a few positive test cases, and ignore most corner and exception/error cases. Our experience (based on analyzing hundreds of manually written unit tests from dozens of Java-based projects) is that the ratio of test code to code under test required to achieve at least 90% code coverage is between 2:1 and 4:1. This means that to thoroughly test a 100-line Java class, 200-400 lines of test code are required. This can be done, but it’s very inefficient; most developers don’t like the idea of spending so much time writing test code, and, as a result, most tests fail to achieve the desired/required coverage.
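
One modest way to reduce the tedium of covering corner and error cases by hand is to keep the inputs in a table and let a small loop do the repetitive work. The sketch below is only an illustration of that idea, not one of the test generation tools discussed in this section; the function under test, the table and the helper are all hypothetical.

```python
def parse_percentage(text):
    """Hypothetical code under test: parse '0'..'100' (optional '%') to an int."""
    value = int(text.strip().rstrip("%"))
    if not 0 <= value <= 100:
        raise ValueError(f"out of range: {value}")
    return value


# Each row is (input, expected result or expected exception type).
CASES = [
    ("50", 50),            # typical value
    ("0", 0),              # lower boundary
    ("100", 100),          # upper boundary
    (" 75% ", 75),         # surrounding whitespace and percent sign
    ("101", ValueError),   # just above the range
    ("-1", ValueError),    # just below the range
    ("abc", ValueError),   # not a number
    ("", ValueError),      # empty input
]


def run_cases(func, cases):
    """Return a list of (input, problem description) for every failing case."""
    failures = []
    for arg, expected in cases:
        expects_error = isinstance(expected, type) and issubclass(expected, Exception)
        try:
            actual = func(arg)
        except Exception as exc:
            if not (expects_error and isinstance(exc, expected)):
                failures.append((arg, f"unexpected {type(exc).__name__}"))
            continue
        if expects_error:
            failures.append((arg, f"expected {expected.__name__}, got {actual!r}"))
        elif actual != expected:
            failures.append((arg, f"expected {expected!r}, got {actual!r}"))
    return failures
```

With this layout, adding another boundary or error case costs one line in the table rather than a new hand-written test method, and run_cases(parse_percentage, CASES) returns an empty list when every case behaves as expected.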

Similarly, test execution, reporting and analysis must be as automated as possible. If the test execution is not automated (e.g. by automatically running the tests after each build, or at the very least nightly), the tests will not be run as frequently as necessary, greatly reducing their benefit. By the same token, if the results of the tests are not filtered and reported to the right people as soon as possible, enabling them to take appropriate action as soon as possible, they will lose a lot of their value. Fortunately, today there is a lot of technology available to automate all the major testing tasks. These include:
· Test Creation: There are tools available today that can automate the most tedious, laborious and repetitive parts associated with test creation. Modern languages like Java and C# make it possible for automated test generators to analyze existing code or interfaces and automatically create test templates which, with a minimum of input/tweaking from a developer, can achieve very impressive test coverage in a fraction of the time it would take for a fully manual effort;
· Test Execution: Automated/continuous code builds are becoming more commonplace; and tools like Ant make it easy to add automated test execution at the end of each build; and
· Test Reporting and Analysis: All testing frameworks provide some basic form of automated reporting (i.e. which tests pass or fail), but it pays to go a few steps further by including features such as code coverage reports, error reports/emails personalized for each developer, and charts with time trends for the major metrics (e.g. the trend in percentage of classes with unit tests over time).
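
The execution and reporting steps above can be sketched in a few lines. This is a toy runner under stated assumptions, not a replacement for a real test framework or build tool: each check is a plain function tagged with a hypothetical owner, the runner records passes and failures, and the failures are grouped per developer so that each person could be sent only the results that concern them.

```python
from collections import defaultdict


def run_checks(checks):
    """Run each (name, owner, function) check; record None or an error message."""
    results = []
    for name, owner, func in checks:
        try:
            func()
            results.append((name, owner, None))
        except Exception as exc:
            results.append((name, owner, f"{type(exc).__name__}: {exc}"))
    return results


def summarize(results):
    """Return overall pass/fail counts for one test run."""
    passed = sum(1 for _, _, error in results if error is None)
    return {"total": len(results), "passed": passed, "failed": len(results) - passed}


def failures_by_owner(results):
    """Group failure messages by the developer who owns the failing check."""
    report = defaultdict(list)
    for name, owner, error in results:
        if error is not None:
            report[owner].append(f"{name}: {error}")
    return dict(report)


# Two hypothetical checks: one passes, one fails on purpose.
def check_addition():
    assert 1 + 1 == 2


def check_ordering():
    assert sorted([3, 1, 2]) == [3, 2, 1], "expected descending order"
```

In practice the run would be triggered by the build (after each build, or at least nightly) rather than by hand, and the summary counts would be stored after every run to produce the time-trend charts mentioned above.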

One of the key lessons I have learned about software development in general, and developer testing in particular, is that the flexibility of software makes it easy to design and offer a huge variety of options, configurations, behaviors, etc. When it comes to managing and testing all the possible combinations, you must take advantage of automation. Computers created the problem and you need them to address it.

Law of Failure

A successful developer testing program must take into account that good tests fail.

This last law is a bit strange, but I had to include it because a successful developer testing program will result in a lot of test cases. When combined with frequent code changes/additions, as well as frequent builds and test executions, you will be faced with at least a few failures on most test runs – especially if the project is in the middle of a major new development or refactoring. I have noticed that this level of testing thoroughness, and the associated level of test failures, is new and disconcerting to many people and organizations, and that it requires some adjustments.

Test failures are a very good thing. They are evidence that the tests are thorough and are doing their job of detecting bugs and changes in the code behavior which, left unaddressed, could lead to further and more serious bugs. In terms of perception and day-to-day operations, however, test failures can be seen as a major challenge in a fledgling developer testing program. They are a constant reminder that the code has bugs or that the code and the tests are out of sync in one way or another. In most cases, fixing the bugs, or resynchronizing the code and the tests should take priority over adding new functionality (because you should not be building on a shaky foundation), but this can be hard to do, especially at the beginning.

Taking care of failing tests is hard to do at first because it will seem to slow you down and make you wonder if the extra work is really worthwhile. Instead of adding, say, 10 new features a week, you are now adding just seven and spending the rest of the time tracking down those pesky unit-level bugs and keeping the tests up to date. This is another time when commitment to developer testing and confidence in its benefits are required. The apparent reduction in the velocity of feature implementation is caused by a different interpretation of what it takes to be done with a feature: if you have only implemented the code for a feature or a change, you are not really done. In a successful developer testing program, a feature or change is considered done only if all of the following requirements are met:

· The feature/change is implemented;
· A thorough set of unit tests for the feature/change has been implemented;
· The unit tests for the feature/change pass; and
· The feature/change does not cause any other unit tests to fail.
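
The four requirements above amount to a simple conjunction, which a team could encode as a gate in its own tooling. The sketch below is one hypothetical way to do so; the field names are assumptions, and whether a set of unit tests is thorough remains a human judgment recorded as a flag.

```python
from dataclasses import dataclass


@dataclass
class ChangeStatus:
    implemented: bool        # the feature/change is implemented
    has_unit_tests: bool     # a thorough set of unit tests exists (human judgment)
    own_tests_pass: bool     # the unit tests for the feature/change pass
    other_tests_pass: bool   # no other unit tests fail because of the change


# Each requirement pairs a ChangeStatus field with a readable description.
REQUIREMENTS = [
    ("implemented", "feature/change is implemented"),
    ("has_unit_tests", "thorough unit tests are implemented"),
    ("own_tests_pass", "unit tests for the feature/change pass"),
    ("other_tests_pass", "no other unit tests fail"),
]


def unmet_requirements(change):
    """Return descriptions of the requirements the change does not yet meet."""
    return [text for field, text in REQUIREMENTS if not getattr(change, field)]


def is_done(change):
    """A change is done only when all four requirements are met."""
    return not unmet_requirements(change)
```

A change whose code is written but whose tests are missing reports two unmet requirements here, which makes the gap between coded and done visible.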

At the beginning of a project, adding features without taking care of writing or maintaining tests may seem faster, but the lack of testing will inevitably come back to haunt you and slow your progress down (as I have described in Part I). It’s easy to move fast at the beginning of a project, but as the body of code grows, it becomes harder and harder to add functionality and make changes without causing other components to break. This is where your investment in developer testing starts to pay off. Any sacrifice in initial project velocity will be repaid many times over as the body of code grows. The tests will repay you not only in terms of increased software quality and stability, but also in your ability to make changes quickly and with confidence.

Conclusion

The software industry is slowly waking up, and warming up, to the idea of developer testing as a way to improve software quality and software development economics. However, there is a lack of urgency and a serious underestimation of how deep an impact developer testing will have on software projects. The attitude of most software professionals is that developer testing is a promising and interesting approach worth investigating, eventually.

The arguments I have made in the first part hopefully convinced you that the practice of developer testing should not be an optional add-on, but a core and essential part of every professional software development organization worthy of its name. To create healthy software, you need unit testing. To keep your software healthy, you need the immune system that only thorough unit tests can provide. However, as we have also seen, this is easier said than done. Many fledgling developer testing efforts started with great enthusiasm and ended up with less than stellar results, partial adoption or total abandonment. This is why regular, ongoing developer testing practices are the exception rather than the rule, and why there is a developer testing paradox in the first place. Fortunately, we have learned valuable lessons from the hard-gained experiences (including both successes and failures) of many software development organizations who have gone down the developer testing path.

The idea and practice of developer testing have had an incredibly positive effect on my team’s ability to deliver high-quality software within an aggressive schedule. There is nothing like the feeling of confidence and protection that we get by having tens of thousands of test points spread across numerous tests that are run several times a day. System and integration tests are still necessary and useful, but they cannot match the fine granularity, resolution and assurance of having specialized tests for each unit of code. I can’t even imagine going back to the days when the only way to know the status of our code was to wait for QA to run a set of integration/system tests that might run into several days or weeks. I sincerely hope that the information, ideas and experiences shared herein give you the motivation, knowledge and resolution to make developer testing an integral part of your software organization.

Alberto Savoia is the co-founder and Vice President-Engineering of Agitar Software. He is a world-renowned expert in Object Oriented Testing Methodologies. The article is copyrighted by Agitar and has been reprinted with special permission.




Added on July 30, 2007