Publication: Enhancing the performance of association rule models by filtering instances in colorectal cancer patients
Issued Date
2017
Language
en
ISSN
2539-6218
Journal Title
Engineering and Applied Science Research
Volume
44
Issue
2
Start Page
76
End Page
83
Rights
Copyright (c) 2017 Engineering and Applied Science Research
Title
Enhancing the performance of association rule models by filtering instances in colorectal cancer patients
Abstract
Colorectal cancer data from the SEER program are analyzed with the aim of using filtering techniques to improve the performance of association rule models. This paper proposes improving the quality of the dataset by removing its outliers using the Hidden Naïve Bayes (HNB), Naïve Bayes Tree (NBTree), and Reduced Error Pruning Decision Tree (REPTree) algorithms. The Apriori and HotSpot algorithms are applied to mine the association rules between the 13 selected attributes and average survival. Experimental results show that the HNB algorithm can improve the accuracy of the Apriori algorithm's performance by up to 100% and the support threshold up to 45%. It can also improve the accuracy of the HotSpot algorithm's performance up to 93.38% and the support threshold up to 80%. Therefore, the HotSpot rules with a minimum support of 80% are selected for explanation. The HotSpot algorithm shows that colorectal cancer patients who died from colon cancer and were not receiving radiation therapy were associated with survival of less than 22 months. Our study shows that filtering techniques in the preprocessing stage are a useful approach to enhancing the quality of the data set. This finding could help researchers build models for better prediction and performance analysis. Although heuristic, such analysis can be very useful for identifying the factors affecting survival. It can also aid medical practitioners in helping patients understand the risks involved in a particular treatment procedure.
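The effect described in the abstract, where removing inconsistent instances raises the support and confidence of a rule, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy records are not SEER data, the attribute names are hypothetical, and `filter_misclassified` is a simple majority-vote stand-in rather than the HNB, NBTree, or REPTree filters used in the paper.

```python
from collections import Counter, defaultdict

# Toy records with hypothetical attributes (not SEER data).
records = (
    [{"radiation": "no", "cause": "colon", "survival": "short"}] * 4
    + [{"radiation": "no", "cause": "colon", "survival": "long"}] * 1
    + [{"radiation": "yes", "cause": "colon", "survival": "long"}] * 2
    + [{"radiation": "yes", "cause": "colon", "survival": "short"}] * 1
)


def rule_stats(rows, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    with_antecedent = [
        r for r in rows if all(r[k] == v for k, v in antecedent.items())
    ]
    with_both = [
        r for r in with_antecedent
        if all(r[k] == v for k, v in consequent.items())
    ]
    support = len(with_both) / len(rows) if rows else 0.0
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


def group_key(row, target):
    """Key a row by every attribute except the target."""
    return tuple(sorted((k, v) for k, v in row.items() if k != target))


def filter_misclassified(rows, target):
    """Stand-in filter (assumption): keep only rows whose target value matches
    the majority target among rows sharing the same other attributes."""
    votes = defaultdict(Counter)
    for r in rows:
        votes[group_key(r, target)][r[target]] += 1
    return [
        r for r in rows
        if r[target] == votes[group_key(r, target)].most_common(1)[0][0]
    ]


antecedent = {"radiation": "no", "cause": "colon"}
consequent = {"survival": "short"}

before = rule_stats(records, antecedent, consequent)
filtered = filter_misclassified(records, "survival")
after = rule_stats(filtered, antecedent, consequent)

print("before filtering:", before)
print("after filtering: ", after)
```

On this toy data the filter drops the two rows that disagree with their group's majority, so the rule's confidence rises from 0.8 to 1.0 and its support from 0.5 to about 0.67. A real pipeline would replace the stand-in with a trained classifier and mine rules with Apriori or HotSpot.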