Title:Optimization of Multimedia Data Clustering Method Under Spark Cloud Computing Platform
VOLUME: 9 ISSUE: 2
Author(s):Yingjun Tan, Cuixia Li and Madison Murphy
Affiliation:Department of Information Engineering, Henan Polytechnic College, 518172, Zhengzhou, China.
Keywords:Clustering, data, multimedia, Spark cloud computing platform.
Abstract:Multimedia data on the Spark cloud computing platform are composed of
many different types of entities, and the attributes of these entities are not exactly the same. The
traditional multimedia data clustering process based on the spectral clustering algorithm
assumes that multimedia data are composed of independent, unrelated entities of the same
type, so the clustering results it obtains contain significant error. A
multimedia data clustering method based on semi-supervised K-means is proposed. The
K-means clustering algorithm is introduced and, on this basis, the optimal initial clustering
centers are determined by an iterative graph-theoretic method. Following an idea similar
to the tree clustering algorithm, the clustering centers are expanded based on the maximum distance θ
of the tree clustering algorithm; the similarity is computed between two data objects and between
an object and a cluster, and the clustering of multimedia data is realized on the Spark cloud
platform. Simulation results show that the proposed method achieves a better speedup ratio, expansion rate and
clustering accuracy, and that it can be applied to multimedia data clustering on the Spark cloud computing platform.
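
The abstract does not give the algorithm in detail, so the following is only a minimal single-machine sketch of the general idea of expanding initial centers with a maximum-distance threshold θ and then running K-means. It assumes Euclidean distance and uses a farthest-first heuristic as a stand-in for the paper's graph-theoretic initialization. The semi-supervised component and the Spark parallelization are not shown. The function names `expand_centers` and `kmeans` and the parameter `theta` are hypothetical and are not taken from the paper.

```python
import math


def expand_centers(points, theta):
    """Farthest-first expansion (an assumed stand-in for the paper's method).

    Starting from the first point, repeatedly add the point that is farthest
    from all current centers, until no point lies farther than theta from
    its nearest center.
    """
    if theta < 0:
        raise ValueError("theta must be non-negative")
    centers = [points[0]]
    while True:
        # For every point, compute the distance to its nearest center,
        # then pick the point for which that distance is largest.
        far_point, far_dist = max(
            ((p, min(math.dist(p, c) for c in centers)) for p in points),
            key=lambda pair: pair[1],
        )
        if far_dist <= theta:
            return centers
        centers.append(far_point)


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for each point.
        labels = [
            min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in points
        ]
        # Update step: mean of each cluster (keep the old center if empty).
        new_centers = []
        for i, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centers.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members))
                )
            else:
                new_centers.append(old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Toy data (illustrative only, not from the paper): two separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
initial = expand_centers(points, theta=5.0)
centers, labels = kmeans(points, initial)
```

In this sketch θ controls the number of clusters indirectly: a smaller θ yields more initial centers. Whether the paper fixes the number of clusters or derives it from θ is not stated in the abstract.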