Publication: Feature Distillation of High Dimension Datasets: Dimension Contraction and Component Dilation
Issued Date
2020
Language
en
Journal Title
2020 17th International Joint Conference on Computer Science and Software Engineering (JCSSE)
Start Page
18
End Page
23
Title
Feature Distillation of High Dimension Datasets: Dimension Contraction and Component Dilation
Abstract
Handling high-dimensional datasets is a persistent challenge in machine learning. Operations on datasets with very large numbers of features are time-consuming and computationally expensive. In an even more challenging scenario, a data scientist may be given a dataset without any information about the target feature and must determine which features can be predicted from the dataset without loss of interpretability. To meet this requirement, we introduce a new technique, “Feature Distillation”, together with a new algorithm, Dimension Contraction and Component Dilation (DCCD). The idea is to return the features that can be predicted with high accuracy from a dataset, using a combination of information measurement, discretization, dimensionality reduction, and supervised and unsupervised techniques. The DCCD algorithm is tested on various synthetic datasets and shown to be faster than traditional methods.
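The abstract does not spell out the steps of DCCD, so the following is not the authors' algorithm. It is only a naive sketch of the underlying problem: given a dataset with no designated target, find the features that other features can predict. It combines two of the ingredients the abstract names, discretization and an information measure. The function names, the equal-width binning, the single-feature comparison, and the 0.9 threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def discretize(values, bins=4):
    """Equal-width binning of a numeric column into integer bin labels."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """H(target | given) for two aligned sequences of discrete labels."""
    n = len(target)
    groups = {}
    for t, g in zip(target, given):
        groups.setdefault(g, []).append(t)
    return sum(len(ts) / n * entropy(ts) for ts in groups.values())


def predictable_features(columns, bins=4, threshold=0.9):
    """Return the names of features whose uncertainty is mostly removed
    by some other single feature (hypothetical criterion, not DCCD)."""
    disc = {name: discretize(vals, bins) for name, vals in columns.items()}
    result = []
    for name, target in disc.items():
        h = entropy(target)
        if h == 0:
            continue  # constant column: nothing to predict
        # Fraction of the target's entropy explained by the best other feature.
        best = max(
            (1 - conditional_entropy(target, other) / h
             for other_name, other in disc.items() if other_name != name),
            default=0.0,
        )
        if best >= threshold:
            result.append(name)
    return result


if __name__ == "__main__":
    data = {
        "a": [0, 1, 2, 3, 4, 5, 6, 7],
        "b": [0, 2, 4, 6, 8, 10, 12, 14],  # exact multiple of "a"
        "c": [3, 1, 3, 1, 3, 1, 3, 1],     # unrelated to "a" and "b" after binning
    }
    print(predictable_features(data))
```

This sketch compares every pair of features, so its cost grows quadratically with the number of features. That exhaustive cost is the kind of expense the abstract attributes to traditional approaches on high-dimensional data.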