Robert Dyer

Assistant Professor
School of Computing
University of Nebraska–Lincoln


FourD: Do Developers Discuss Design? Revisited

Published: November 13, 2016
in Proceedings of the 2nd International Workshop on Software Analytics

Software repositories contain a variety of information that can be mined and used to improve software engineering processes. Patterns in repository metadata can reveal aspects of a project that may not be obvious to developers. One such aspect is the role of software design in a project. The message attached to each commit notes not only what changes were made to project files, but potentially also whether those changes affected the design of the software.

In this paper, commit messages sampled from a random sample of projects on GitHub and SourceForge are manually classified as “design” or “non-design” based on a survey. The resulting data is then used to train multiple machine learning algorithms to determine whether it is possible to predict if a single commit discusses software design. Our results show that the Random Forest classifier performed best on our combined data set, with a G-mean of 75.01.
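
For readers unfamiliar with the metric, the sketch below shows how a G-mean can be computed for a binary "design" / "non-design" classifier. It assumes the common definition (the geometric mean of sensitivity and specificity), which the abstract does not spell out; the function name `g_mean` and the toy labels are hypothetical and are not taken from the paper's data.

```python
import math


def g_mean(y_true, y_pred, positive="design"):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp = fn = tn = fp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive:
            if predicted == positive:
                tp += 1  # design commit correctly flagged
            else:
                fn += 1  # design commit missed
        else:
            if predicted == positive:
                fp += 1  # non-design commit wrongly flagged
            else:
                tn += 1  # non-design commit correctly ignored
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy labels for illustration only (not the paper's data).
actual = ["design", "design", "non-design", "non-design", "non-design", "non-design"]
predicted = ["design", "non-design", "non-design", "non-design", "non-design", "design"]
score = g_mean(actual, predicted)
```

Because the score is a product of per-class recall values, a classifier that ignores the minority class scores zero. That makes it a reasonable choice when one label, such as design-related commits, may be much rarer than the other.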
