SSBSE 2016 Supplementary Data

In recent research on software product line engineering, several frameworks have been developed to reverse engineer feature models using a genetic algorithm. In these frameworks, the starting point for the search is either a set of Boolean equations or a set of valid products representing the software product line. The most successful fitness functions aim to maximize the number of matched products while reducing the number of missing and/or additional products. To calculate this fitness, each valid product defined by the model is enumerated using a SAT solver. In attempting to reproduce the results of these frameworks, we have discovered that once the size of the feature models increases beyond 27 features, the enumeration step becomes a bottleneck and the search either runs out of memory or times out. In this paper we propose a new fitness function, SATff, that simulates validity by computing the tautological implication of the sets of constraints in each model against the original, estimating the distance between the two. In an empirical study on 101 feature models we evaluate the quality of the new fitness function compared with two existing fitness functions that use the enumeration technique. We see that SATff shows a significant improvement over one of these across all models, and no significant difference with the other one, suggesting a similar effectiveness. However, SATff requires only 7% of the runtime on average, scaling to feature models with as many as 97 features.
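To make the contrast between the two kinds of fitness concrete, the following is a minimal sketch and not the SATff implementation. It assumes a feature model can be represented as a list of Boolean predicates over a feature assignment, and it replaces the SAT solver with a brute-force truth table so that the example stays self-contained. All function names are hypothetical, and the final score (the fraction of constraints implied by the other model) is only an illustrative stand-in for the distance estimate described above.

```python
from itertools import product


def products(features, constraints):
    """Enumerate every valid product by brute force (stand-in for a SAT solver)."""
    valid = set()
    for bits in product([False, True], repeat=len(features)):
        assignment = dict(zip(features, bits))
        if all(c(assignment) for c in constraints):
            valid.add(frozenset(f for f in features if assignment[f]))
    return valid


def enumeration_counts(features, original, candidate):
    """Enumeration-style fitness inputs: (matched, missing, additional) products."""
    target = products(features, original)
    found = products(features, candidate)
    return len(target & found), len(target - found), len(found - target)


def implies(features, premises, conclusion):
    """True if the conjunction of premises tautologically implies conclusion."""
    for bits in product([False, True], repeat=len(features)):
        assignment = dict(zip(features, bits))
        if all(p(assignment) for p in premises) and not conclusion(assignment):
            return False  # counterexample found
    return True


def implication_score(features, original, candidate):
    """Hypothetical score: share of constraints on each side implied by the other model."""
    checks = [implies(features, candidate, c) for c in original]
    checks += [implies(features, original, c) for c in candidate]
    return sum(checks) / len(checks)


# Toy model (assumed for illustration): a mandatory root and two exclusive options.
features = ["root", "a", "b"]
original = [lambda s: s["root"], lambda s: not (s["a"] and s["b"])]
candidate = [lambda s: s["root"]]  # the search lost the exclusion constraint

counts = enumeration_counts(features, original, candidate)
score = implication_score(features, original, candidate)
```

In this toy case the candidate matches all three original products and admits one additional product, while two of the three implication checks hold. The brute-force loops are exponential in the number of features; a real implementation would hand each implication check to a SAT solver, which avoids materializing the full product set.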

We used 101 subjects for our experiments, categorized into several groups as follows.

This research was supported in part by the National Science Foundation award CCF-1161767. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or the Air Force Office of Scientific Research.