Publication: Feature comparison for automatic bug report classification
Issued Date
2020
Language
en
ISBN
978-3-030-19861-9
Journal Title
Recent Advances in Information and Communication Technology 2019
Start Page
69
End Page
78
Title
Feature comparison for automatic bug report classification
Abstract
Nowadays, various bug tracking systems (BTS) such as Jira, Trace, and Bugzilla have been developed to gather issues from users worldwide. These issues, called bug reports, contain significant information for software quality maintenance and improvement. However, many poor-quality bug reports may be submitted to a BTS. In general, reported bugs are first analyzed and filtered by bug triagers, but with the increasing number of bug reports, manually classifying them is time-consuming. To address this problem, automatically distinguishing bugs from non-bugs is necessary. To the best of our knowledge, this task remains difficult, because bug reports are still misclassified to date. The problem may arise from the use of inappropriate or confusing features. Therefore, this work aims to identify the most suitable features for binary bug report classification. The study compares seven feature sets: unigram, bigram, camel case, unigram+bigram, unigram+camel case, bigram+camel case, and all features together. The experimental results suggest that unigram+camel case is the most suitable feature set for distinguishing bug reports from non-bug reports, especially when used with the logistic regression algorithm.
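To illustrate the seven feature sets named in the abstract, the following is a minimal sketch of how unigram, bigram, and camel-case tokens might be extracted and combined. The regular expressions, the function names, and the sample report text are all assumptions made for illustration; the abstract does not describe the authors' actual preprocessing, and the classifier step (logistic regression) is omitted here.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed tokenizers; the paper's exact preprocessing is not stated in the abstract.
WORD_RE = re.compile(r"[A-Za-z]+")
# Assumed definition of a camel-case token: a leading lowercase/digit run
# followed by one or more capitalized humps, e.g. getUserName.
CAMEL_RE = re.compile(r"\b[A-Za-z][a-z0-9]+(?:[A-Z][a-z0-9]+)+\b")

def unigrams(text):
    """Lowercased single words."""
    return [w.lower() for w in WORD_RE.findall(text)]

def bigrams(text):
    """Pairs of adjacent lowercased words."""
    words = unigrams(text)
    return [f"{a} {b}" for a, b in zip(words, words[1:])]

def camel_case(text):
    """Identifiers written in camel case, kept with their original casing."""
    return CAMEL_RE.findall(text)

EXTRACTORS = {"unigram": unigrams, "bigram": bigrams, "camel": camel_case}

# The three base features and every combination of them give seven feature sets.
FEATURE_SETS = [c for r in (1, 2, 3) for c in combinations(EXTRACTORS, r)]

def features(text, kinds):
    """Count tokens for the chosen feature kinds, prefixing each by its kind."""
    counts = Counter()
    for kind in kinds:
        for token in EXTRACTORS[kind](text):
            counts[f"{kind}:{token}"] += 1
    return counts

# Hypothetical bug report summary, not taken from the paper's data.
report = "NullPointerException thrown when calling getUserName on empty session"
unigram_camel = features(report, ("unigram", "camel"))
```

The resulting counts could then be passed to any bag-of-features classifier; prefixing each token with its feature kind keeps the camel-case form of an identifier distinct from its lowercased unigram.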