After a distinguished career at UNL, Katherine Stolee successfully defended her dissertation and will join Iowa State University as a faculty member in Fall 2013. Best of luck, Katie!
The abstract of her dissertation on “Solving the Search for Source Code” follows:
Programmers frequently search for source code to reuse using keyword searches. When effective and efficient, a code search can boost programmer productivity; however,
the search effectiveness depends on the programmer’s ability to specify a query that captures how the desired code may have been implemented. Further, the results often
include many irrelevant matches that must be filtered manually. More semantic search approaches could address these limitations, yet existing approaches either do not scale, are not flexible enough to find approximate matches, or require complex specifications.
We propose a novel approach to semantic search that addresses some of these limitations and is designed for queries that can be described using an example. In
this approach, programmers write lightweight specifications as example inputs and expected outputs that describe the behavior of the desired code. Using these specifications, an SMT
solver identifies source code from a repository that matches the specifications. The repository is composed of program snippets encoded as constraints that approximate
the semantics of the code.
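
To make the idea of an input/output query concrete, here is a minimal sketch in Python. It is not Satsy and it does not use an SMT solver: it swaps in direct execution of each candidate snippet as a stand-in for the constraint-based check described above. The repository contents, the snippet names, and the `search` function are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical mini-repository. In the approach described above, snippets
# are encoded as constraints; here they are plain callables (an assumption
# made only to keep the sketch runnable without a solver).
REPOSITORY: Dict[str, Callable[[str], object]] = {
    "to_upper": lambda s: s.upper(),
    "trim": lambda s: s.strip(),
    "length": lambda s: len(s),
    "file_extension": lambda s: s.rsplit(".", 1)[-1],
}

# A query is a list of (input, expected output) pairs.
Example = Tuple[str, object]


def search(examples: List[Example]) -> List[str]:
    """Return the names of snippets consistent with every example."""
    matches = []
    for name, snippet in REPOSITORY.items():
        # Direct execution stands in for asking a solver whether the
        # snippet's constraints are satisfiable together with the examples.
        if all(snippet(inp) == out for inp, out in examples):
            matches.append(name)
    return matches


if __name__ == "__main__":
    query = [("report.pdf", "pdf"), ("archive.tar.gz", "gz")]
    print(search(query))  # ['file_extension']
```

The query says nothing about how the code is written (no keywords such as "split" or "extension"), only how it should behave, which is the distinction the abstract draws between semantic and keyword search.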
This research contributes the first work toward using SMT solvers to search for existing source code. In this dissertation, we motivate the study of code search and the
utility of a more semantic approach to code search. We introduce and illustrate the generality of our approach using subsets of three languages: Java, Yahoo! Pipes, and
SQL. Our approach is implemented in a tool, Satsy, for Yahoo! Pipes and Java. The evaluation covers various aspects of the approach, and the results indicate that this
approach is effective at finding relevant code. Even with a small repository, our search is competitive with state-of-the-practice syntactic searches when searching for Java
code. Further, this approach is flexible and can be used on its own, or in conjunction with a syntactic search. Finally, we show that this approach is adaptable to finding
approximate matches when exact matches do not exist, and that programmers are capable of composing input/output queries with reasonable speed and accuracy. These
results are promising and lead to several open research questions that we are only beginning to explore.