Study of Optimized Window Aggregate Function for Big Data Analytics

Author(s): Shailender Kumar*, Preetam Kumar, Aman Mittal.

Journal Name: Recent Patents on Engineering

Volume 13, Issue 2, 2019

Background: Window aggregate functions are a class of functions that have emerged as an important tool for big data analytics, supporting analysis and decision-making applications. For each tuple, a window aggregate function applies an aggregate over a limited set of tuples (the window) defined relative to the current tuple and returns the result. We reviewed patents related to window aggregate functions and their optimization. The cost of big data analytics, especially of processing window functions, is one of the major limiting factors. However, a number of optimization techniques have now evolved for both single and multiple window aggregate functions.
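
To make the definition concrete, the following is a minimal sketch of a window aggregate evaluated the straightforward way: for every tuple, the aggregate is recomputed over the tuples in its window. The function name `window_sum` and the `preceding`/`following` frame parameters are illustrative assumptions, not taken from the paper or from any particular database system.

```python
def window_sum(values, preceding, following):
    """Naive window SUM: for each position i, add up the values from
    i - preceding through i + following, clipped to the list bounds.
    Hypothetical helper for illustration only."""
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - preceding)          # first tuple in the window
        hi = min(n, i + following + 1)      # one past the last tuple
        out.append(sum(values[lo:hi]))      # recompute from scratch
    return out


# A frame of one preceding and one following tuple.
result = window_sum([4, 1, 7, 3, 5], preceding=1, following=1)
print(result)  # [5, 12, 11, 15, 8]
```

Each output costs time proportional to the window size, which is the kind of per-tuple cost that the optimization techniques discussed here aim to reduce.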

Methods: This paper discusses various optimization techniques and summarizes the latest techniques developed through intensive research in this area. It compares the techniques on parameters such as degree of parallelism, support for multiple window functions, and execution time.

Results: Of the techniques analyzed, the segment tree data structure appears to be the better technique, as it outperforms the others on grounds such as efficiency, memory overhead, execution speed, and degree of parallelism.
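
As an illustration of the general idea, the sketch below builds a segment tree once over a column and then answers each window as a range query in logarithmic time. It is a generic textbook-style segment tree under stated assumptions (a commutative aggregate such as MIN, and the hypothetical names `SegmentTree` and `window_min`), not the specific implementation evaluated in the paper.

```python
class SegmentTree:
    """Array-based segment tree for a commutative aggregate (assumed: MIN)."""

    def __init__(self, values, combine=min, identity=float("inf")):
        self.n = len(values)
        self.combine = combine
        self.identity = identity
        # Leaves live at indices n..2n-1; internal nodes at 1..n-1.
        self.tree = [identity] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = combine(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, lo, hi):
        """Aggregate over values[lo:hi] (half-open) in O(log n)."""
        result = self.identity
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:                      # lo is a right child: take it
                result = self.combine(result, self.tree[lo])
                lo += 1
            if hi & 1:                      # hi is exclusive: step left, take it
                hi -= 1
                result = self.combine(result, self.tree[hi])
            lo >>= 1
            hi >>= 1
        return result


def window_min(values, preceding, following):
    """Window MIN per tuple, using one shared tree for all windows."""
    tree = SegmentTree(values)
    n = len(values)
    return [
        tree.query(max(0, i - preceding), min(n, i + following + 1))
        for i in range(n)
    ]


print(window_min([4, 1, 7, 3, 5], preceding=1, following=1))  # [1, 1, 1, 3, 3]
```

Because the tree is read-only after construction, the per-tuple queries are independent of one another, which is one plausible reason such a structure lends itself to parallel evaluation.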

Conclusion: For optimizing window aggregate functions, the segment tree data structure is the better technique; it can improve the processing of window aggregate functions, specifically in big data analytics.

Keywords: Window aggregate function, optimization, parallel processing, query processing, big data analytics, segment tree data structure.



Article Details

Year: 2019
Page: [101 - 107]
Pages: 7
DOI: 10.2174/1872212112666180330162741