Mining software repositories for accurate authorship

Publikation: Beitrag in Buch/Konferenzbericht/Sammelband/GutachtenBeitrag in KonferenzbandBeigetragenBegutachtung

Beitragende

Abstract

Code authorship information is important for analyzing software quality, performing software forensics, and improving software maintenance. However, current tools assume that the last developer to change a line of code is its author regardless of all earlier changes. This approximation loses important information. We present two new line-level authorship models to overcome this limitation. We first define the repository graph as a graph abstraction for a code repository, in which nodes are the commits and edges represent the development dependencies. Then for each line of code, structural authorship is defined as a sub graph of the repository graph recording all commits that changed the line and the development dependencies between the commits, weighted authorship is defined as a vector of author contribution weights derived from the structural authorship of the line and based on a code change measure between commits, for example, best edit distance. We have implemented our two authorship models as a new git built-in tool git-author. We evaluated git-author in an empirical study and a comparison study. In the empirical study, we ran git-author on five open source projects and found that git-author can recover more information than a current tool (git-blame) for about 10% of lines. In the comparison study, we used git-author to build a line-level model for bug prediction. We compared our line-level model with an existing file-level model. The results show that our line-level model performs consistently better than the file-level model when evaluated on our data sets produced from the Apache HTTP server project.

Details

OriginalspracheEnglisch
TitelIEEE International Conference on Software Maintenance, ICSM
Seiten 250 - 259
Seitenumfang10
PublikationsstatusVeröffentlicht - 2013
Peer-Review-StatusJa

Externe IDs

Scopus 84891681964