Scaling a Dataflow Testing Methodology to the Multiparadigm World of Commercial Spreadsheets
M. Fisher II, G. Rothermel, T. Creelan, and M. Burnett
Technical Report TR-UNL-CSE-2005-0003
Department of Computer Science and Engineering
University of Nebraska -- Lincoln
May, 2005.


Spreadsheet languages are widely used by end users to perform a broad range of important tasks. Evidence shows, however, that spreadsheets often contain faults. In prior work we therefore presented a dataflow testing methodology for spreadsheets that uses visual devices to provide feedback about the coverage of cells. Studies have shown that this methodology, which we call WYSIWYT (What You See Is What You Test), can be used cost-effectively by end-user programmers. To date, however, the methodology has been investigated only across a limited set of spreadsheet language features. Commercial spreadsheet environments are multiparadigm languages, utilizing features often associated with dataflow, functional, imperative, and database query languages, and these features are not accommodated by prior approaches. In addition, most spreadsheets contain large numbers of replicated formulas that differ only in the cells they reference, and these severely limit the efficiency of dataflow testing approaches. We show how to handle these two aspects of commercial spreadsheet environments through a new dataflow adequacy criterion and automated detection of areas of replicated formulas. We report results of a controlled experiment investigating several factors important to the feasibility of our approach.
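To make the notion of replicated formulas concrete, the following Python sketch groups cells whose formulas are identical once each reference is rewritten as an offset from the cell that holds it. It is an illustration only and not the detection algorithm of this report: the function names (`relative_form`, `group_replicated`), the A1-style input, and the R1C1-like output notation are all assumptions made for the example, and the sketch does not check that the grouped cells form a contiguous area.

```python
import re
from collections import defaultdict

# Matches A1-style references such as B7, $B7, B$7, $B$7 (uppercase only).
# Limitation of this sketch: a function name that ends in digits (for
# example LOG10) would be misread as a cell reference.
CELL_REF = re.compile(r"(\$?)([A-Z]+)(\$?)([0-9]+)")


def col_to_index(col):
    """Convert a column label to a 1-based index (A -> 1, Z -> 26, AA -> 27)."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def parse_address(addr):
    """Split an address such as 'C12' into (row, column index)."""
    m = CELL_REF.fullmatch(addr)
    if m is None:
        raise ValueError(f"not a cell address: {addr!r}")
    return int(m.group(4)), col_to_index(m.group(2))


def relative_form(formula, host):
    """Rewrite every reference in `formula` relative to the cell `host`.

    Relative parts become bracketed offsets, absolute ($) parts keep
    their fixed row or column number.
    """
    host_row, host_col = parse_address(host)

    def rewrite(m):
        col_abs, col, row_abs, row = m.groups()
        r = f"R{row}" if row_abs else f"R[{int(row) - host_row}]"
        c = (f"C{col_to_index(col)}" if col_abs
             else f"C[{col_to_index(col) - host_col}]")
        return r + c

    return CELL_REF.sub(rewrite, formula)


def group_replicated(formulas):
    """Return lists of cells whose formulas share one relative form.

    `formulas` maps a cell address to its formula text. Only groups with
    more than one cell are returned, each sorted by (row, column).
    """
    groups = defaultdict(list)
    for addr, text in formulas.items():
        groups[relative_form(text, addr)].append(addr)
    return [sorted(cells, key=parse_address)
            for cells in groups.values() if len(cells) > 1]


sheet = {
    "C1": "=A1+B1",
    "C2": "=A2+B2",
    "C3": "=A3+B3",
    "D1": "=SUM(C1:C3)",
}
replicated = group_replicated(sheet)
```

Here the three formulas in column C differ only in the cells they reference, so they fall into a single group, while the formula in D1 stands alone. A testing tool could in principle treat such a group as one unit rather than analyzing each copy separately, which is the kind of saving that motivates detecting replicated regions.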