Mining software repositories for accurate authorship
Publikation: Beitrag in Buch/Konferenzbericht/Sammelband/Gutachten › Beitrag in Konferenzband › Beigetragen › Begutachtung
Beitragende
Abstract
Code authorship information is important for analyzing software quality, performing software forensics, and improving software maintenance. However, current tools assume that the last developer to change a line of code is its author regardless of all earlier changes. This approximation loses important information. We present two new line-level authorship models to overcome this limitation. We first define the repository graph as a graph abstraction for a code repository, in which nodes are the commits and edges represent the development dependencies. Then for each line of code, structural authorship is defined as a sub graph of the repository graph recording all commits that changed the line and the development dependencies between the commits, weighted authorship is defined as a vector of author contribution weights derived from the structural authorship of the line and based on a code change measure between commits, for example, best edit distance. We have implemented our two authorship models as a new git built-in tool git-author. We evaluated git-author in an empirical study and a comparison study. In the empirical study, we ran git-author on five open source projects and found that git-author can recover more information than a current tool (git-blame) for about 10% of lines. In the comparison study, we used git-author to build a line-level model for bug prediction. We compared our line-level model with an existing file-level model. The results show that our line-level model performs consistently better than the file-level model when evaluated on our data sets produced from the Apache HTTP server project.
Details
Originalsprache | Englisch |
---|---|
Titel | IEEE International Conference on Software Maintenance, ICSM |
Seiten | 250 - 259 |
Seitenumfang | 10 |
Publikationsstatus | Veröffentlicht - 2013 |
Peer-Review-Status | Ja |
Externe IDs
Scopus | 84891681964 |
---|