Robert Dyer

Assistant Professor
School of Computing
University of Nebraska–Lincoln

Performing Large-Scale Mining Studies: From Start to Finish (Tutorial)

Robert Dyer, Samuel W. Flint
Published: November 18, 2022
in Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Modern software engineering research often relies on mining open-source software repositories, either to motivate a research problem, to evaluate a proposed approach, or both. Mining ultra-large-scale software repositories is still a difficult task, requiring substantial expertise and access to significant hardware. Tools such as Boa can help researchers mine large numbers of open-source repositories more easily. There has also recently been more of a push toward open science, with an emphasis on making replication packages available, but building such packages adds to researchers’ workload. In this tutorial, we teach how to use the Boa infrastructure for mining software repository data. We use Boa’s VS Code IDE extension to write and submit Boa queries, and Boa’s study template to show how researchers can more easily analyze the output from Boa and automatically produce a suitable replication package that is published on Zenodo.
