Regression testing is an expensive procedure performed on modified software to provide confidence that the modified code behaves correctly and that modifications have not adversely affected existing functionality. Most research on regression testing addresses one or both of two problems: how to select regression tests from an existing test suite (the test selection problem), and how to determine the portions of a modified program to retest (the coverage identification problem). Techniques for solving these problems are called selective retest techniques, because they reduce the cost of regression testing through selective reuse of tests, or selective retesting of program components.
Most selective retest techniques focus on coverage identification, and use information gained in pursuit of coverage to guide test selection. This research takes the opposite tack. We first define the regression test selection problem precisely, and investigate its theoretical difficulty. We then present a framework for comparing and evaluating test selection techniques, and apply that framework to existing methods to determine where additional work is needed. Our analysis demonstrates the need for a test selection technique with several qualities. First, the technique must be safe: it must select every test from the original test suite that could reveal an error in the modified program. Second, the technique must be sufficiently precise: it must identify enough unnecessary tests to offset its own expense. Third, the technique must be general: applicable to a wide class of programs and modifications, both intraprocedurally (on single procedures) and interprocedurally (on programs and subsystems). Finally, the technique must be efficient: its time and space requirements must be reasonable.
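The safety criterion above can be illustrated with a minimal sketch. Assuming each test records the program entities it covers (statements, branches, or procedures) and that the set of modified entities is known, a safe selection keeps every modification-traversing test; the names and data below are illustrative, not from the dissertation:

```python
# A minimal sketch of safe test selection, assuming per-test coverage
# traces and a known set of modified entities. All names are hypothetical.

def select_safe(test_coverage, modified_entities):
    """Select every test whose coverage touches a modified entity.

    A technique is safe if the selected set contains every test that
    could reveal an error in the modified program; a test that covers
    no modified entity cannot behave differently on the new version,
    so omitting it is safe.
    """
    return {test for test, covered in test_coverage.items()
            if covered & modified_entities}

coverage = {
    "t1": {"f", "g"},   # t1 exercises procedures f and g
    "t2": {"g"},
    "t3": {"h"},
}
# Suppose only procedure g was modified: t1 and t2 must be rerun,
# while t3 is safely omitted.
print(select_safe(coverage, {"g"}))
```

Precision, in this picture, is the converse concern: a maximally imprecise (but still safe) technique simply reruns everything, so a useful technique must eliminate enough tests like t3 to pay for its own analysis.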
Our analysis of existing techniques and our theoretical results inspired an approach to test selection that led to a family of safe test selection algorithms. From the most basic algorithm to the most complex, our algorithms provide progressively more precise test selection at progressively greater cost and implementation complexity. The algorithms facilitate regression testing of single procedures, and of entire programs and subsystems. A theoretical examination of our algorithms proves that, among safe algorithms, they provide the most precise test selection available to date. The investigation also shows that the algorithms are at least as general and efficient as, and in most cases more general and efficient than, existing regression test selection algorithms -- whether safe or not.
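One way to sketch the general idea behind such algorithms -- a simplified illustration under stated assumptions, not the dissertation's algorithms verbatim -- is a simultaneous walk of the control-flow graphs of the original and modified procedure: edges along which a change is first reached are marked "dangerous", and every test whose execution trace crosses a dangerous edge is selected. The graph representation and all names below are assumptions for illustration:

```python
# Sketch: simultaneous CFG walk for safe test selection. Each CFG maps
# node -> (code_text, successor_list); the entry node is assumed
# unchanged, and corresponding branches are assumed to align positionally.

def dangerous_edges(cfg_old, cfg_new, entry="entry"):
    """Return edges of cfg_old that lead to nodes whose code changed.

    The walk visits corresponding node pairs; when a successor's code
    differs between versions, the edge reaching it is dangerous and
    the walk does not continue past the change.
    """
    dangerous, visited = set(), set()
    stack = [(entry, entry)]
    while stack:
        old, new = stack.pop()
        if (old, new) in visited:
            continue
        visited.add((old, new))
        _, old_succs = cfg_old[old]
        _, new_succs = cfg_new[new]
        for o, n in zip(old_succs, new_succs):
            if cfg_old[o][0] != cfg_new[n][0]:
                dangerous.add((old, o))   # a change is reached along this edge
            else:
                stack.append((o, n))
    return dangerous

def select_tests(traces, dangerous):
    """Select tests whose recorded edge trace crosses a dangerous edge."""
    return {t for t, edges in traces.items() if edges & dangerous}

cfg_old = {
    "entry": ("def p():", ["s1"]),
    "s1": ("x = 1", ["s2", "s3"]),
    "s2": ("return x", []),
    "s3": ("return 0", []),
}
cfg_new = dict(cfg_old)
cfg_new["s2"] = ("return x + 1", [])   # the only modification
traces = {"t1": {("entry", "s1"), ("s1", "s2")},
          "t2": {("entry", "s1"), ("s1", "s3")}}
# Only t1 crosses the dangerous edge ("s1", "s2"), so only t1 is selected.
print(select_tests(traces, dangerous_edges(cfg_old, cfg_new)))
```

Trading analysis granularity (procedures, statements, individual edges) against cost is what yields a family of progressively more precise, progressively more expensive algorithms.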
We next discuss the implementations of our algorithms, and the empirical studies we performed to investigate their effectiveness in practice. These studies indicate that our algorithms can be efficient and effective in practice when applied interprocedurally. For example, one program we studied contained 50,000 lines of code and 766 procedures, and was tested by a suite of 1035 tests that required almost six hours to run. We ran our algorithm on five modified releases of this program. On average, for a given modified version, the algorithm required 25 minutes to select a safe test set, and reduced the set of regression tests required to test the version by over 95%: a savings of more than five hours of testing time. Our other studies support the efficacy of our algorithms for interprocedural test selection, but call into question the efficacy of test selection algorithms applied to single procedures -- an important result, given that most research to date on selective retest algorithms has been limited to test selection for single procedures.
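The reported figures can be checked with back-of-the-envelope arithmetic, assuming the full suite takes roughly six hours and the 95% reduction in test count translates proportionally into execution time:

```python
# Rough check of the reported savings, using the abstract's rounded
# figures; the proportional-time assumption is ours, not the study's.

full_run_min = 360        # ~six hours to run all 1035 tests
selection_min = 25        # average time to select a safe test set
reduction = 0.95          # fraction of tests the algorithm eliminates

saved_execution = full_run_min * reduction      # minutes of testing avoided
net_savings = saved_execution - selection_min   # minutes saved after analysis
print(net_savings / 60)   # a bit over five hours, as reported
```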
Finally, we examine the coverage identification problem, and present algorithms that, when integrated with our test selection algorithms, assist with coverage identification.