Clone detection using abstract syntax trees

I. Baxter, A. Yahin, L. Moura, M. Sant'Anna, and L. Bier. Proceedings of the International Conference on Software Maintenance, 1998, pages 368-377 (November 1998).
DOI: 10.1109/ICSM.1998.738528

Abstract

Existing research suggests that a considerable fraction (5-10%) of the source code of large-scale computer programs is duplicate code (“clones”). Detection and removal of such clones promises decreased software maintenance costs of possibly the same magnitude. Previous work was limited to detection of either near misses differing only in single lexemes, or near misses only between complete functions. The paper presents simple and practical methods for detecting exact and near-miss clones over arbitrary program fragments in program source code by using abstract syntax trees. Previous work also did not suggest practical means for removing detected clones. Since our methods operate in terms of the program structure, clones could be removed by mechanical methods producing in-lined procedures or standard preprocessor macros. A tool using these techniques is applied to a C production software system of some 400 K source lines, and the results confirm the levels of duplication found by previous work. The tool produces macro bodies needed for clone removal, and macro invocations to replace the clones. The tool uses a variation of the well-known compiler method for detecting common subexpressions. This method determines exact tree matches; a number of adjustments are needed to detect equivalent statement sequences, commutative operands, and nearly exact matches. We additionally suggest that clone detection could also be useful in producing more structured code, and in reverse engineering to discover domain concepts and their implementations.
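
The abstract describes exact-match detection as a variation of common-subexpression detection over syntax trees. The following is a minimal sketch of that general idea only, not the authors' tool: it works on Python source through the standard `ast` module rather than on C, it buckets subtrees by their structural dump, and it makes no attempt at statement sequences, commutative operands, or near-miss matching. The names `find_exact_clones`, `min_nodes`, and `SAMPLE` are hypothetical and chosen for illustration.

```python
import ast
from collections import defaultdict


def subtree_size(node):
    """Count the AST nodes in the subtree rooted at `node`."""
    return sum(1 for _ in ast.walk(node))


def find_exact_clones(source, min_nodes=8):
    """Group structurally identical subtrees (assumed helper, not the paper's tool).

    Returns a list of (node type name, sorted line numbers) for every
    statement or expression subtree that occurs more than once and has
    at least `min_nodes` nodes.
    """
    tree = ast.parse(source)
    buckets = defaultdict(list)
    for node in ast.walk(tree):
        # Only statements and expressions carry a line number.
        if not isinstance(node, (ast.stmt, ast.expr)):
            continue
        # Skip tiny subtrees; they match everywhere and are not useful clones.
        if subtree_size(node) < min_nodes:
            continue
        # ast.dump omits line/column attributes by default, so two subtrees
        # with the same structure and identifiers get the same key.
        key = (type(node).__name__, ast.dump(node))
        buckets[key].append(node.lineno)
    return [
        (type_name, sorted(lines))
        for (type_name, _), lines in buckets.items()
        if len(lines) > 1
    ]


SAMPLE = """
def f(a, b):
    total = a * b + compute(a, b)
    return total

def g(a, b):
    total = a * b + compute(a, b)
    return total * 2
"""

for type_name, lines in find_exact_clones(SAMPLE):
    print(type_name, "duplicated at lines", lines)
```

Note that this sketch reports a duplicated subtree and its duplicated children as separate groups; a practical detector would presumably suppress matches that are contained in a larger match.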

Description

IEEE Xplore Abstract - Clone detection using abstract syntax trees

Links and resources

BibTeX key: baxter1998clone
entry type: inproceedings
booktitle: Software Maintenance, 1998. Proceedings., International Conference on
year: 1998
month: nov
pages: 368-377
issn: 1063-6773
DOI: 10.1109/ICSM.1998.738528
url: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=738528&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D738528


Cite this publication

%0 Conference Paper
%1 baxter1998clone
%A Baxter, I.D.
%A Yahin, A.
%A Moura, L.
%A Sant'Anna, M.
%A Bier, L.
%B Software Maintenance, 1998. Proceedings., International Conference on
%D 1998
%K 2013 abstract clone detection graph syntax trees
%P 368-377
%R 10.1109/ICSM.1998.738528
%T Clone detection using abstract syntax trees
%U http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=738528&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D738528
%X Existing research suggests that a considerable fraction (5-10%) of the source code of large-scale computer programs is duplicate code (“clones”). Detection and removal of such clones promises decreased software maintenance costs of possibly the same magnitude. Previous work was limited to detection of either near misses differing only in single lexemes, or near misses only between complete functions. The paper presents simple and practical methods for detecting exact and near-miss clones over arbitrary program fragments in program source code by using abstract syntax trees. Previous work also did not suggest practical means for removing detected clones. Since our methods operate in terms of the program structure, clones could be removed by mechanical methods producing in-lined procedures or standard preprocessor macros. A tool using these techniques is applied to a C production software system of some 400 K source lines, and the results confirm the levels of duplication found by previous work. The tool produces macro bodies needed for clone removal, and macro invocations to replace the clones. The tool uses a variation of the well-known compiler method for detecting common subexpressions. This method determines exact tree matches; a number of adjustments are needed to detect equivalent statement sequences, commutative operands, and nearly exact matches. We additionally suggest that clone detection could also be useful in producing more structured code, and in reverse engineering to discover domain concepts and their implementations.

@inproceedings{baxter1998clone,
  abstract = {Existing research suggests that a considerable fraction (5-10%) of the source code of large-scale computer programs is duplicate code (“clones”). Detection and removal of such clones promises decreased software maintenance costs of possibly the same magnitude. Previous work was limited to detection of either near misses differing only in single lexemes, or near misses only between complete functions. The paper presents simple and practical methods for detecting exact and near-miss clones over arbitrary program fragments in program source code by using abstract syntax trees. Previous work also did not suggest practical means for removing detected clones. Since our methods operate in terms of the program structure, clones could be removed by mechanical methods producing in-lined procedures or standard preprocessor macros. A tool using these techniques is applied to a C production software system of some 400 K source lines, and the results confirm the levels of duplication found by previous work. The tool produces macro bodies needed for clone removal, and macro invocations to replace the clones. The tool uses a variation of the well-known compiler method for detecting common subexpressions. This method determines exact tree matches; a number of adjustments are needed to detect equivalent statement sequences, commutative operands, and nearly exact matches. We additionally suggest that clone detection could also be useful in producing more structured code, and in reverse engineering to discover domain concepts and their implementations.},
  added-at = {2014-02-28T13:44:08.000+0100},
  author = {Baxter, I.D. and Yahin, A. and Moura, L. and Sant'Anna, M. and Bier, L.},
  biburl = {https://www.bibsonomy.org/bibtex/2a4dd42d530e00608f8c370ba287e8aa9/s_nkeha},
  booktitle = {Software Maintenance, 1998. Proceedings., International Conference on},
  description = {IEEE Xplore Abstract - Clone detection using abstract syntax trees},
  doi = {10.1109/ICSM.1998.738528},
  interhash = {2fc4a3b69c8f263d095cbed17e37e29e},
  intrahash = {a4dd42d530e00608f8c370ba287e8aa9},
  issn = {1063-6773},
  keywords = {2013 abstract clone detection graph syntax trees},
  month = nov,
  pages = {368-377},
  timestamp = {2014-02-28T13:44:08.000+0100},
  title = {Clone detection using abstract syntax trees},
  url = {http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=738528&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D738528},
  year = 1998
}
