Hongjun Wang
Research Associate
Supervisor of Master's Candidates
- Master Tutor
- Education Level:PhD graduate
- Degree:Doctor of Engineering
- Business Address:Room 31529, Teaching Building 3, Xipu Campus
- Professional Title:Research Associate
- Alma Mater:Sichuan University
- School/Department:School of Computing and Artificial Intelligence
- Discipline:Electronic Information; Software Engineering; Computer Application Technology
Contact Information
- PostalAddress:
- Email:
- Paper Publications
Hyper-ellipsoidal clustering technique for evolving data stream
- Impact Factor:8.139
- DOI number:10.1016/j.knosys.2013.11.022
- Affiliation of Author(s):Southwest Jiaotong University
- Journal:KNOWLEDGE-BASED SYSTEMS
- Place of Publication:NETHERLANDS
- Key Words:Data mining Decision support systems Hyper-ellipsoidal clustering Evolving data stream Data clustering
- Abstract:Data mining has become a key ingredient in establishing intelligent decision support systems. As one of the main branches of data mining, data stream clustering has received much attention over the past decade. Most existing data stream clustering techniques rely on the Euclidean distance metric to find similar objects and hence produce spherical clusters, which are not always suitable for representing the data. Moreover, most real-world problems involve data of varying density, which cannot be handled by density-based clustering techniques. In this paper, we introduce a new clustering technique called Hyper-Ellipsoidal Clustering for Evolving data Stream (HECES), based on the recently proposed HyCARCE algorithm. In HECES, a few modifications are made to the HyCARCE algorithm to handle the stream clustering problem: a sliding window model is used to handle the incoming stream of data and minimize the impact of obsolete information on recent clustering results; a shrinkage technique is used to avoid the singularity issue in estimating the covariance of correlated data; and a novel technique for merging the initial ellipsoids is used to obtain the final clusters instead of a computationally intensive process of expansion and adjustment. HECES relies on the Mahalanobis distance metric to cluster the data points and hence produces ellipsoidal clusters. It can successfully handle data of varying density. Experiments on various synthetic and real datasets for clustering streaming data provide a comparative validation of our approach.
- Co-author:Yan Yang, Hongjun Wang
- First Author:Muhammad Zia-ur Rehman
- Indexed by:Academic papers
- Correspondence Author:Tianrui Li
- Document Code:20143600059596
- Discipline:Engineering
- First-Level Discipline:Computer Science and Technology
- Volume:70
- Issue:November 2014
- Page Number:3-14
- ISSN No.:0950-7051
- Translation or Not:no
- Date of Publication:2013-11-08
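The abstract above names three ingredients: a sliding window, a shrinkage step for the covariance, and the Mahalanobis distance. The following is a minimal sketch of how those pieces fit together for 2-D points. It is not the HECES algorithm itself. The function names, the window size, the weight `alpha`, and the choice of a scaled identity as the shrinkage target are all assumptions made for illustration; the record above does not specify them.

```python
from collections import deque


def shrunk_covariance_2d(points, alpha=0.1):
    """Mean and shrunk covariance of 2-D points.

    The sample covariance is blended with a scaled identity matrix.
    alpha is a hypothetical shrinkage weight in [0, 1]; any alpha > 0 keeps
    the matrix invertible even when the points are perfectly correlated.
    Returns (mean, (a, b, d)) for the symmetric matrix [[a, b], [b, d]].
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    mu = (sxx + syy) / 2.0  # average variance: scale of the identity target
    a = (1 - alpha) * sxx + alpha * mu
    d = (1 - alpha) * syy + alpha * mu
    b = (1 - alpha) * sxy
    return (mx, my), (a, b, d)


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of x from mean under covariance [[a, b], [b, d]]."""
    a, b, d = cov
    det = a * d - b * b
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Closed-form inverse of a 2x2 symmetric matrix: (1/det) * [[d, -b], [-b, a]]
    q = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return q ** 0.5


# Sliding window: only the 4 most recent points are kept, so the stale
# point (100, 100) is dropped once newer data arrives.
window = deque(maxlen=4)
for p in [(100.0, 100.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]:
    window.append(p)

mean, cov = shrunk_covariance_2d(list(window), alpha=0.1)

# Two query points at the same Euclidean distance from the window mean:
along = mahalanobis_2d((3.5, 3.5), mean, cov)    # along the data's direction
across = mahalanobis_2d((3.5, -0.5), mean, cov)  # perpendicular to it
print(along < across)
```

The windowed points lie exactly on a line, so their unshrunk covariance is singular and cannot be inverted; the shrinkage step is what makes the distance computable here. The two query points are equally far from the mean in Euclidean terms, yet the Mahalanobis distance treats the one lying along the data's direction as closer, which is the behaviour that yields ellipsoidal rather than spherical clusters.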