skip to main content
survey

A Comprehensive Survey on Parallelization and Elasticity in Stream Processing

Published: 30 April 2019 Publication History

Abstract

Stream Processing (SP) has evolved as the leading paradigm to process and gain value from the high volume of streaming data produced, e.g., in the domain of the Internet of Things. An SP system is a middleware that deploys a network of operators between data sources, such as sensors, and the consuming applications. SP systems typically face intense and highly dynamic data streams. Parallelization and elasticity enable SP systems to process these streams with continuous high quality of service. The current research landscape provides a broad spectrum of methods for parallelization and elasticity in SP. Each method makes specific assumptions and focuses on particular aspects. However, the literature lacks a comprehensive overview and categorization of the state of the art in SP parallelization and elasticity, which is necessary to consolidate the state of the research and to plan future research directions on this basis. Therefore, in this survey, we study the literature and develop a classification of current methods for both parallelization and elasticity in SP systems.

References

[1]
Daniel J. Abadi, Don Carney, Ugur Çetintemel, Mitch Cherniack, Christian Convey, Sangdon Lee, Michael Stonebraker, Nesime Tatbul, and Stan Zdonik. 2003. Aurora: A new model and architecture for data stream management. VLDB J. 12, 2 (Aug. 2003), 120--139.
[2]
Asaf Adi and Opher Etzion. 2004. Amit—The situation manager. VLDB J. 13, 2 (May 2004), 177--203.
[3]
Tyler Akidau, Alex Balikov, Kaya Bekiroğlu, Slava Chernyak, Josh Haberman, Reuven Lax, Sam McVeety, Daniel Mills, Paul Nordstrom, and Sam Whittle. 2013. MillWheel: Fault-tolerant stream processing at internet scale. Proc. VLDB Endow. 6, 11 (Aug. 2013), 1033--1044.
[4]
Elias Alevizos, Alexander Artikis, and George Paliouras. 2017. Event forecasting with pattern Markov chains. In Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems (DEBS’17). ACM, New York, NY, 146--157.
[5]
L. Amini, N. Jain, A. Sehgal, J. Silber, and O. Verscheure. 2006. Adaptive control of extreme-scale stream-processing systems. In Proceedings of the 26th IEEE International Conference on Distributed Computing Systems (ICDCS’06). 71--71.
[6]
H. Andrade, B. Gedik, K.-L. Wu, and P.S. Yu. 2011. Processing high data rate streams in System S. J. Parallel Distrib. Comput. 71, 2 (Feb. 2011), 145--156.
[7]
Leonardo Aniello, Roberto Baldoni, and Leonardo Querzoni. 2013. Adaptive online scheduling in storm. In Proceedings of the 7th ACM International Conference on Distributed Event-based Systems (DEBS’13). ACM, New York, NY, 207--218.
[8]
Arvind Arasu, Shivnath Babu, and Jennifer Widom. 2006. The CQL continuous query language: Semantic foundations and query execution. VLDB J. 15, 2 (June 2006), 121--142.
[9]
Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia. 2010. A view of cloud computing. Commun. ACM 53, 4 (Apr. 2010), 50--58.
[10]
Yossi Azar, Andrei Z. Broder, Anna R. Karlin, and Eli Upfal. 1999. Balanced allocations. SIAM J. Comput. 29, 1 (Sept. 1999), 180--200.
[11]
Nathan Backman, Rodrigo Fonseca, and Uǧur Çetintemel. 2012. Managing parallelism for stream processing in the cloud. In Proceedings of the 1st International Workshop on Hot Topics in Cloud Data Processing (HotCDP’12). ACM, New York, NY, 1:1--1:5.
[12]
Cagri Balkesen, Nihal Dindar, Matthias Wetter, and Nesime Tatbul. 2013. Rip: Run-based intra-query parallelism for scalable complex event processing. In Proceedings of the 7th ACM International Conference on Distributed Event-based Systems. ACM, 3--14. https://dl.acm.org/citation.cfm?id=2488257
[13]
Cagri Balkesen and Nesime Tatbul. 2011. Scalable data partitioning techniques for parallel sliding window processing over data streams. In Proceedings of the International Workshop on Data Management for Sensor Networks (DMSN’11). Retrieved from https://www.softnet.tuc.gr/dmsn11/papers/paper03.pdf.
[14]
Cagri Balkesen, Nesime Tatbul, and M. Tamer Özsu. 2013. Adaptive input admission and management for parallel stream processing. In Proceedings of the 7th ACM International Conference on Distributed Event-based Systems. ACM, 15--26.
[15]
P. Basanta-Val, N. Fernández-García, L. Sánchez-Fernández, and J. Arias-Fisteus. 2017. Patterns for distributed real-time stream processing. IEEE Trans. Parallel Distrib. Syst. 28, 11 (Nov. 2017), 3243--3257.
[16]
Flavio Bonomi, Rodolfo Milito, Jiang Zhu, and Sateesh Addepalli. 2012. Fog computing and its role in the internet of things. In Proceedings of the 1st Edition of the MCC Workshop on Mobile Cloud Computing (MCC’12). ACM, New York, NY, 13--16.
[17]
Michael Borkowski, Christoph Hochreiner, and Stefan Schulte. 2018. Moderated resource elasticity for stream processing applications. In Proceedings of the Parallel Processing Workshops. Lecture Notes in Computer Science (Euro-Par’17), Dora B. Heras et al. (Ed.), Vol. 10569. Springer, Cham, 5--16.
[18]
Lars Brenna, Johannes Gehrke, Mingsheng Hong, and Dag Johansen. 2009. Distributed event stream processing with non-deterministic finite automata. In Proceedings of the 3rd ACM International Conference on Distributed Event-Based Systems (DEBS’09). ACM, New York, NY, 3:1--3:12.
[19]
A. Brito, A. Martin, T. Knauth, S. Creutz, D. Becker, S. Weigert, and C. Fetzer. 2011. Scalable and low-latency data processing with stream MapReduce. In Proceedings of the IEEE 3rd International Conference on Cloud Computing Technology and Science. 48--58.
[20]
Paris Carbone, Asterios Katsifodimos, Stephan Ewen, Volker Markl, Seif Haridi, and Kostas Tzoumas. 2015. Apache flink: Stream and batch processing in a single engine. IEEE Data Eng. Bull. 38 (2015), 28--38.
[21]
Valeria Cardellini, Vincenzo Grassi, Francesco Lo Presti, and Matteo Nardelli. 2016. Optimal operator placement for distributed stream processing applications. In Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems (DEBS’16). ACM, New York, NY, 69--80.
[22]
Valeria Cardellini, Vincenzo Grassi, Francesco Lo Presti, and Matteo Nardelli. 2017. Optimal operator replication and placement for distributed stream processing systems. SIGMETRICS Perform. Eval. Rev. 44, 4 (May 2017), 11--22.
[23]
Valeria Cardellini, Francesco Lo Presti, Matteo Nardelli, and Gabriele Russo Russo. 2018. Decentralized self-adaptation for elastic data stream processing. Future Generation Computer Systems 87 (Oct. 2018), 171--185.
[24]
Sharma Chakravarthy and Deepak Mishra. 1994. Snoop: An expressive event specification language for active databases. Data Knowl. Eng. 14, 1 (1994), 1--26. Retrieved from https://www.sciencedirect.com/science/article/pii/0169023X9490006X.
[25]
Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong, Sailesh Krishnamurthy, Samuel R. Madden, Fred Reiss, and Mehul A. Shah. 2003. TelegraphCQ: Continuous dataflow processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’03). ACM, New York, NY, 668--668.
[26]
Mitch Cherniack, Hari Balakrishnan, Magdalena Balazinska, Donald Carney, Ugur Cetintemel, Ying Xing, and Stanley B. Zdonik. 2003. Scalable distributed stream processing. In Proceedings of the Conference on Innovative Data Systems Research (CIDR’03), Vol. 3. 257--268. Retrieved from https://nms.csail.mit.edu/papers/CIDR_CRC.pdf.
[27]
Tyson Condie, Neil Conway, Peter Alvaro, Joseph M. Hellerstein, Khaled Elmeleegy, and Russell Sears. 2010. MapReduce online. In Proceedings of the USENIX Conference on Networked Systems Design and Implementation (NSDI’10), Vol. 10. 20. Retrieved from https://static.usenix.org/events/nsdi10/tech/full_papers/condie.pdf.
[28]
Tyson Condie, Neil Conway, Peter Alvaro, Joseph M. Hellerstein, John Gerth, Justin Talbot, Khaled Elmeleegy, and Russell Sears. 2010. Online aggregation and continuous query support in MapReduce. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD’10). ACM, New York, NY, 1115--1118.
[29]
Gianpaolo Cugola and Alessandro Margara. 2010. TESLA: A formally defined event specification language. In Proceedings of the Fourth ACM International Conference on Distributed Event-Based Systems. ACM, 50--61. Retrieved from https://dl.acm.org/citation.cfm?id=1827427.
[30]
Gianpaolo Cugola and Alessandro Margara. 2012. Low latency complex event processing on parallel hardware. J. Parallel Distrib. Comput. 72, 2 (Feb. 2012), 205--218.
[31]
Gianpaolo Cugola and Alessandro Margara. 2012. Processing flows of information: From data stream to complex event processing. Comput. Surveys 44, 3 (June 2012), 1--62.
[32]
Marco Danelutto, Peter Kilpatrick, Gabriele Mencagli, and Massimo Torquati. 0. State access patterns in stream parallel computations. Int. J. High Performance Comput. Appl. 0, 0 (0), 1--12. arXiv:https://doi.org/10.1177/1094342017694134
[33]
Marcos Dias de Assuncao, Alexandre da Silva Veith, and Rajkumar Buyya. 2017. Resource elasticity for distributed data stream processing: A survey and future directions. arXiv preprint arXiv:1709.01363 (2017).
[34]
Tiziano De Matteis and Gabriele Mencagli. 2016. Keep calm and react with foresight: Strategies for low-latency and energy-efficient elastic data stream processing. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’16). ACM, New York, NY.
[35]
Tiziano De Matteis and Gabriele Mencagli. 2017. Elastic scaling for distributed latency-sensitive data stream operators. In Proceedings of the 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP’17). 61--68.
[36]
Tiziano De Matteis and Gabriele Mencagli. 2017. Parallel patterns for window-based stateful operators on data streams: An algorithmic skeleton approach. Int. J. Parallel Program. 45, 2 (Apr. 2017), 382--401.
[37]
Tiziano De Matteis and Gabriele Mencagli. 2017. Proactive elasticity and energy awareness in data stream processing. Journal of Systems and Software 127 (2017), 302--319.
[38]
Jeffrey Dean and Sanjay Ghemawat. 2004. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th Conference on Operating Systems Design 8 Implementation—Volume 6 (OSDI’04). USENIX Association, Berkeley, CA, 10--10. https://dl.acm.org/citation.cfm?id=1251254.1251264
[39]
Alan J. Demers, Johannes Gehrke, Biswanath Panda, Mirek Riedewald, Varun Sharma, Walker M. White, et al. 2007. Cayuga: A general purpose event monitoring system. In Proceedings of the Conference on Innovative Data Systems Research (CIDR’07), Vol. 7. 412--422. Retrieved from https://www.ccis.northeastern.edu/home/mirek/papers/2007-CIDR-CayugaImp.pdf.
[40]
Y. Drougas and V. Kalogeraki. 2009. Accommodating bursts in distributed stream processing systems. In Proceedings of the IEEE International Symposium on Parallel Distributed Processing. 1--11.
[41]
Esper. 2019. Esper. Retrieved from https://www.espertech.com/.
[42]
Patrick Th. Eugster, Pascal A. Felber, Rachid Guerraoui, and Anne-Marie Kermarrec. 2003. The many faces of publish/subscribe. ACM Comput. Surv. 35, 2 (June 2003), 114--131.
[43]
R. C. Fernandez, P. Garefalakis, and P. Pietzuch. 2016. Java2SDG: Stateful big data processing for the masses. In Proceedings of the IEEE 32nd International Conference on Data Engineering (ICDE’16). 1390--1393.
[44]
Raul Castro Fernandez, Matteo Migliavacca, Evangelia Kalyvianaki, and Peter Pietzuch. 2013. Integrating scale out and fault tolerance in stream processing using operator state management. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’13). ACM, New York, NY, 725--736.
[45]
Raul Castro Fernandez, Matteo Migliavacca, Evangelia Kalyvianaki, and Peter Pietzuch. 2014. Making state explicit for imperative big data processing. In Proceedings of the USENIX Annual Technical Conference (USENIX-ATC’14). USENIX Association, Berkeley, CA, 49--60. https://dl.acm.org/citation.cfm?id=2643634.2643640.
[46]
L. Fischer and A. Bernstein. 2015. Workload scheduling in distributed stream processors using graph partitioning. In Proceedings of the IEEE International Conference on Big Data (BigData’15). 124--133.
[47]
Ioannis Flouris, Nikos Giatrakos, Antonios Deligiannakis, Minos Garofalakis, Michael Kamp, and Michael Mock. 2017. Issues in complex event processing: Status and prospects in the Big Data era. J. Syst. Softw. 127, Supplement C (2017), 217--236.
[48]
Apache Software Foundation. 2015. Apache Storm Project Website. Retrieved from https://storm.apache.org.
[49]
The Apache Software Foundation. 2019. Apache Flink Project Website. Retrieved from https://flink.apache.org/.
[50]
Buğra Gedik. 2014. Generic windowing support for extensible stream processing systems. Software: Pract. Exper. 44, 9 (2014), 1105--1128.
[51]
Buğra Gedik. 2014. Partitioning functions for stateful data parallelism in stream processing. VLDB J. 23, 4 (Aug. 2014), 517--539.
[52]
B. Gedik, S. Schneider, M. Hirzel, and K. L. Wu. 2014. Elastic scaling for data stream processing. IEEE Trans. Parallel Distrib. Syst. 25, 6 (June 2014), 1447--1463.
[53]
B. Gedik, H. G. Özsema, and Ö. Öztürk. 2016. Pipelined fission for stream programs with dynamic selectivity and partitioned state. J. Parallel Distrib. Comput. 96 (2016), 106--120.
[54]
Michael I. Gordon, William Thies, and Saman Amarasinghe. 2006. Exploiting coarse-grained task, data, and pipeline parallelism in stream programs. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XII). ACM, New York, NY, 151--162.
[55]
Michael I. Gordon, William Thies, Michal Karczmarek, Jasper Lin, Ali S. Meli, Andrew A. Lamb, Chris Leger, Jeremy Wong, Henry Hoffmann, David Maze, and Saman Amarasinghe. 2002. A stream compiler for communication-exposed architectures. In Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS X). ACM, New York, NY, 291--303.
[56]
Michael Grossniklaus, David Maier, James Miller, Sharmadha Moorthy, and Kristin Tufte. 2016. Frames: Data-driven Windows. In Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems (DEBS’16). ACM, New York, NY, 13--24.
[57]
V. Gulisano, R. Jiménez-Peris, M. Patiño-Martínez, C. Soriente, and P. Valduriez. 2012. StreamCloud: An elastic and scalable data streaming system. IEEE Trans. Parallel Distrib. Syst. 23, 12 (Dec. 2012), 2351--2365.
[58]
Thomas Heinze, Leonardo Aniello, Leonardo Querzoni, and Zbigniew Jerzak. 2014. Cloud-based data stream processing. In Proceedings of the 8th ACM International Conference on Distributed Event-Based Systems (DEBS’14). ACM, New York, NY, 238--245.
[59]
Thomas Heinze, Zbigniew Jerzak, Gregor Hackenbroich, and Christof Fetzer. 2014. Latency-aware elastic scaling for distributed data stream processing systems. In Proceedings of the 8th ACM International Conference on Distributed Event-Based Systems. ACM, 13--22.
[60]
Thomas Heinze, Yuanzhen Ji, Yinying Pan, Franz Josef Grueneberger, Zbigniew Jerzak, and Christof Fetzer. 2013. Elastic complex event processing under varying query load. In Proceedings of the 1st International Workshop on Big Dynamic Distributed Data (BD3’13). 25.
[61]
Thomas Heinze, Valerio Pappalardo, Zbigniew Jerzak, and Christof Fetzer. 2014. Auto-scaling techniques for elastic data stream processing. In Proceedings of the IEEE 30th International Conference on Data Engineering Workshops (ICDEW’14). IEEE, 296--302.
[62]
Thomas Heinze, Lars Roediger, Andreas Meister, Yuanzhen Ji, Zbigniew Jerzak, and Christof Fetzer. 2015. Online parameter optimization for elastic data stream processing. In Proceedings of the 6th ACM Symposium on Cloud Computing (SoCC’15). ACM, New York, NY, 276--287.
[63]
Nicolas Hidalgo, Daniel Wladdimiro, and Erika Rosas. 2017. Self-adaptive processing graph with operator fission for elastic stream processing. J. Syst. Softw. 127 (2017), 205--216.
[64]
Martin Hirzel. 2012. Partition and compose: Parallel complex event processing. In Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems (DEBS’12). ACM, New York, NY, 191--200.
[65]
Martin Hirzel, Scott Schneider, and Bugra Gedik. 2014. SPL: An Extensible Language for Distributed Stream Processing. Technical Report. Research Report RC25486, IBM. Retrieved from https://hirzels.com/martin/papers/tr14-rc25486-spl.pdf.
[66]
Martin Hirzel, Scott Schneider, and Kanat Tangwongsan. 2017. Sliding-window aggregation algorithms: Tutorial. In Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems (DEBS’17). ACM, New York, NY, 11--14.
[67]
Martin Hirzel, Robert Soulé, Scott Schneider, Buğra Gedik, and Robert Grimm. 2014. A catalog of stream processing optimizations. ACM Comput. Surv. 46, 4 (Mar. 2014), 46:1--46:34.
[68]
C. Hochreiner, M. Vogler, P. Waibel, and S. Dustdar. 2016. VISP: An ecosystem for elastic data stream processing for the internet of things. In Proceedings of the IEEE 20th International Enterprise Distributed Object Computing Conference.1--11.
[69]
C. Hochreiner, M. Vögler, S. Schulte, and S. Dustdar. 2016. Elastic stream processing for the internet of things. In Proceedings of the IEEE 9th International Conference on Cloud Computing (CLOUD’16). 100--107.
[70]
Waldemar Hummer, Benjamin Satzger, and Schahram Dustdar. 2013. Elastic stream processing in the cloud. Wiley Interdisc. Rev.: Data Min. Knowl. Discov. 3, 5 (Sept. 2013), 333--345.
[71]
E. Kalyvianaki, W. Wiesemann, Q. H. Vu, D. Kuhn, and P. Pietzuch. 2011. SQPR: Stream query planning with reuse. In Proceedings of the IEEE 27th International Conference on Data Engineering. 840--851.
[72]
David Karger, Eric Lehman, Tom Leighton, Rina Panigrahy, Matthew Levine, and Daniel Lewin. 1997. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the world wide web. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing (STOC’97). ACM, New York, NY, 654--663.
[73]
Nikos R. Katsipoulakis, Alexandros Labrinidis, and Panos K. Chrysanthis. 2017. A holistic view of stream partitioning costs. Proc. VLDB Endow. 10, 11 (Aug. 2017), 1286--1297.
[74]
J. O. Kephart and D. M. Chess. 2003. The vision of autonomic computing. Computer 36, 1 (Jan. 2003), 41--50.
[75]
Rohit Khandekar, Kirsten Hildrum, Sujay Parekh, Deepak Rajan, Joel Wolf, Kun-Lung Wu, Henrique Andrade, and Bugra Gedik. 2009. COLA: Optimizing stream processing applications via graph partitioning. In Proceedings of the 10th ACM/IFIP/USENIX International Conference on Middleware (Middleware’09). Springer-Verlag New York, Inc., New York, NY. https://dl.acm.org/citation.cfm?id=1656980.1657002
[76]
J. F. C. Kingman. 1961. The single server queue in heavy traffic. In Mathematical Proceedings of the Cambridge Philosophical Society, Vol. 57. Cambridge University Press, 902--904.
[77]
Thomas Kohler, Ruben Mayer, Frank Dürr, Marius Maaß, Sukanya Bhowmik, and Kurt Rothermel. 2018. P4CEP: Towards in-network complex event processing. In Proceedings of the 2018 Morning Workshop on In-Network Computing (NetCompute’18). ACM, New York, NY, 33--38.
[78]
Alexandros Koliousis, Matthias Weidlich, Raul Castro Fernandez, Alexander L. Wolf, Paolo Costa, and Peter Pietzuch. 2016. SABER: Window-based hybrid stream processing for heterogeneous architectures. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD’16). ACM, New York, NY, 555--569.
[79]
Roland Kotto Kombi, Nicolas Lumineau, and Philippe Lamarre. 2017. A preventive auto-parallelization approach for elastic stream processing. In Proceedings of the IEEE 37th International Conference on Distributed Computing Systems (ICDCS’17). IEEE, 1532--1542.
[80]
Sanjeev Kulkarni, Nikunj Bhagat, Maosong Fu, Vikas Kedigehalli, Christopher Kellogg, Sailesh Mittal, Jignesh M. Patel, Karthik Ramasamy, and Siddarth Taneja. 2015. Twitter Heron: Stream processing at scale. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’15). ACM, New York, NY, 239--250.
[81]
A. Kumbhare et al. 2015. Fault-tolerant and elastic streaming MapReduce with decentralized coordination. In Proceedings of the IEEE 35th International Conference on Distributed Computing Systems. 328--338.
[82]
A. G. Kumbhare, Y. Simmhan, and V. K. Prasanna. 2014. PLAStiCC: Predictive look-ahead scheduling for continuous dataflows on clouds. In Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. 344--353.
[83]
A. G. Kumbhare, Y. Simmhan, and V. K. Prasanna. 2014. PLAStiCC: Predictive look-ahead scheduling for continuous dataflows on clouds. In Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid’14), Vol. 00. 344--353.
[84]
Wang Lam, Lu Liu, Sts Prasad, Anand Rajaraman, Zoheb Vacheri, and AnHai Doan. 2012. Muppet: MapReduce-style processing of fast data. Proc. VLDB Endow. 5, 12 (Aug. 2012), 1814--1825.
[85]
Danh Le-Phuoc, Hoan Nguyen Mau Quoc, Chan Le Van, and Manfred Hauswirth. 2013. Elastic and scalable processing of linked stream data in the cloud. In Proceedings of the 12th International Semantic Web Conference—Part I (ISWC’13). Springer-Verlag, New York, NY, 280--297.
[86]
Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, and Peter A. Tucker. 2005. No pane, no gain: Efficient evaluation of sliding-window aggregates over data streams. ACM SIGMOD Rec. 34, 1 (2005), 39--44. https://dl.acm.org/citation.cfm?id=1058158
[87]
Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, and Peter A. Tucker. 2005. Semantics and evaluation techniques for window aggregates in data streams. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’05). ACM, New York, NY, 311--322.
[88]
Xunyun Liu and Rajkumar Buyya. 2017. D-storm: Dynamic resource-efficient scheduling of stream processing applications. In Procedings of the IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS’17). IEEE, 485--492.
[89]
Xunyun Liu and Rajkumar Buyya. 2017. Performance-oriented deployment of streaming applications on cloud. IEEE Trans. Big Data 4, 1 (2017), 46--51.
[90]
B. Lohrmann, P. Janacik, and O. Kao. 2015. Elastic stream processing with latency guarantees. In Proceedings of the IEEE 35th International Conference on Distributed Computing Systems. 399--410.
[91]
Björn Lohrmann, Daniel Warneke, and Odej Kao. 2014. Nephele streaming: Stream processing under QoS constraints at scale. Cluster Comput. 17, 1 (Mar. 2014), 61--78.
[92]
Federico Lombardi, Leonardo Aniello, Silvia Bonomi, and Leonardo Querzoni. 2017. Elastic symbiotic scaling of operators and resources in stream processing systems. IEEE Trans. Parallel Distrib. Syst. 29, 3 (2017), 572--583.
[93]
Kasper Grud Skat Madsen and Yongluan Zhou. 2015. Dynamic resource management in a massively parallel stream processing engine. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM’15). ACM, New York, NY, 13--22.
[94]
K. G. S. Madsen, Y. Zhou, and J. Cao. 2017. Integrative dynamic reconfiguration in a parallel stream processing engine. In Proceedings of the IEEE 33rd International Conference on Data Engineering (ICDE’17). IEEE, 227--230.
[95]
C. Mayer, M. A. Tariq, R. Mayer, and K. Rothermel. 2018. GrapH: Traffic-aware graph processing. IEEE Trans. Parallel Distrib. Syst. 99 (2018), 1--1.
[96]
Ruben Mayer, Boris Koldehofe, and Kurt Rothermel. 2015. Predictable low-latency event detection with parallel complex event processing. IEEE Internet Things J. 2, 4 (Aug. 2015), 274--286.
[97]
Ruben Mayer, Christian Mayer, Muhammad Adnan Tariq, and Kurt Rothermel. 2016. GraphCEP: Real-time data analytics using parallel complex event and graph processing. In Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems (DEBS’16). ACM, New York, NY, 309--316.
[98]
Ruben Mayer, Ahmad Slo, Muhammad Adnan Tariq, Kurt Rothermel, Manuel Gräber, and Umakishore Ramachandran. 2017. SPECTRE: Supporting consumption policies in window-based parallel complex event processing. In Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference (Middleware’17). ACM, New York, NY, 161--173.
[99]
Ruben Mayer, Muhammad Adnan Tariq, and Kurt Rothermel. 2017. Minimizing communication overhead in window-based parallel complex event processing. In Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems (DEBS’17). ACM, New York, NY, 54--65.
[100]
Yuan Mei and Samuel Madden. 2009. ZStream: A cost-based query processor for adaptively detecting composite events. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD’09). ACM, New York, NY, 193--206.
[101]
Gabriele Mencagli. 2016. Adaptive model predictive control of autonomic distributed parallel computations with variable horizons and switching costs. Concurr. Comput.: Pract. Exper. 28, 7 (2016), 2187--2212.
[102]
Gabriele Mencagli. 2016. A game-theoretic approach for elastic distributed data stream processing. ACM Trans. Auton. Adapt. Syst. 11, 2, Article 13 (June 2016).
[103]
Gabriele Mencagli, Massimo Torquati, and Marco Danelutto. 2018. Elastic-PPQ: A two-level autonomic system for spatial preference query processing over dynamic data streams. Future Gen. Comput. Syst. 79 (2018), 862--877.
[104]
G. Mencagli, M. Torquati, M. Danelutto, and T. De Matteis. 2017. Parallel continuous preference queries over out-of-order and bursty data streams. IEEE Trans. Parallel Distrib. Syst. 28, 9 (Sept. 2017), 2608--2624.
[105]
Gabriele Mencagli, Massimo Torquati, Fabio Lucattini, Salvatore Cuomo, and Marco Aldinucci. 2018. Harnessing sliding-window execution semantics for parallel stream processing. J. Parallel Distrib. Comput. 116 (2018), 74--88.
[106]
Gabriele Mencagli, Marco Vanneschi, and Emanuele Vespa. 2014. A cooperative predictive control approach to improve the reconfiguration stability of adaptive distributed parallel applications. ACM Trans. Auton. Adapt. Syst. 9, 1, Article 2 (Mar. 2014).
[107]
Y. Nakamura, H. Suwa, Y. Arakawa, H. Yamaguchi, and K. Yasumoto. 2016. Design and implementation of middleware for IoT devices toward real-time flow processing. In Proceedings of the IEEE 36th International Conference on Distributed Computing Systems Workshops (ICDCSW’16). 162--167.
[108]
M. A. U. Nasir, G. De Francisci Morales, D. García-Soriano, N. Kourtellis, and M. Serafini. 2015. The power of both choices: Practical load balancing for distributed stream processing engines. In Proceedings of the IEEE 31st International Conference on Data Engineering. 137--148.
[109]
M. A. U. Nasir, G. D. F. Morales, N. Kourtellis, and M. Serafini. 2016. When two choices are not enough: Balancing at scale in Distributed Stream Processing. In Proceedings of the IEEE 32nd International Conference on Data Engineering (ICDE’16). 589--600.
[110]
L. Neumeyer, B. Robbins, A. Nair, and A. Kesari. 2010. S4: Distributed stream computing platform. In Proceedings of the IEEE International Conference on Data Mining Workshops. 170--177.
[111]
Beate Ottenwälder, Boris Koldehofe, Kurt Rothermel, Kirak Hong, and Umakishore Ramachandran. 2014. RECEP: Selection-based reuse for distributed complex event processing. In Proceedings of the 8th ACM International Conference on Distributed Event-Based Systems (DEBS’14). ACM, New York, NY, 59--70.
[112]
Beate Ottenwälder, Boris Koldehofe, Kurt Rothermel, and Umakishore Ramachandran. 2013. MigCEP: Operator migration for mobility driven distributed complex event processing. In Proceedings of the 7th ACM International Conference on Distributed Event-based Systems (DEBS’13). ACM, New York, NY, 183--194.
[113]
P. Pietzuch, J. Ledlie, J. Shneidman, M. Roussopoulos, M. Welsh, and M. Seltzer. 2006. Network-aware operator placement for stream-processing systems. In Proceedings of the 22nd International Conference on Data Engineering (ICDE’06). 49--49.
[114]
Olga Poppe, Chuan Lei, Salah Ahmed, and Elke A. Rundensteiner. 2017. Complete event trend detection in high-rate event streams. In Proceedings of the ACM International Conference on Management of Data (SIGMOD’17). ACM, New York, NY, 109--124.
[115]
Do Le Quoc, Ruichuan Chen, Pramod Bhatotia, Christof Fetzer, Volker Hilt, and Thorsten Strufe. 2017. StreamApprox: Approximate computing for stream analytics. In Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference (Middleware’17). ACM, New York, NY, 185--197.
[116]
Medhabi Ray, Chuan Lei, and Elke A. Rundensteiner. 2016. Scalable pattern sharing on event streams. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD’16). ACM, New York, NY, 495--510.
[117]
Nicoló Rivetti, Emmanuelle Anceaume, Yann Busnel, Leonardo Querzoni, and Bruno Sericola. 2016. Online scheduling for shuffle grouping in distributed stream processing systems. In Proceedings of the 17th International Middleware Conference (Middleware’16). ACM, New York, NY.
[118]
Nicoló Rivetti, Leonardo Querzoni, Emmanuelle Anceaume, Yann Busnel, and Bruno Sericola. 2015. Efficient key grouping for near-optimal load balancing in stream processing systems. In Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems (DEBS’15). ACM, New York, NY, 80--91.
[119]
S. Rizou, F. Dürr, and K. Rothermel. 2010. Solving the multi-operator placement problem in large-scale operator networks. In 2010 Proceedings of the 19th International Conference on Computer Communications and Networks. 1--6.
[120]
Omran Saleh, Heiko Betz, and Kai-Uwe Sattler. 2015. Partitioning for scalable complex event processing on data streams. In New Trends in Database and Information Systems II. Springer, Cham, 185--197. https://link.springer.com/chapter/10.1007/978-3-319-10518-5_15
[121]
Omran Saleh and Kai-Uwe Sattler. 2015. The pipeflow approach: Write once, run in different stream-processing engines. In Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems (DEBS’15). ACM, New York, NY, 368--371.
[122]
B. Satzger, W. Hummer, P. Leitner, and S. Dustdar. 2011. Esc: Towards an elastic stream computing platform for the cloud. In Proceedings of the IEEE 4th Conf.International Conference on Cloud Computing. 348--355.
[123]
S. Schneider, H. Andrade, B. Gedik, A. Biem, and K. Wu. 2009. Elastic scaling of data parallel operators in stream processing. In Proceedings of the IEEE International Symposium on Parallel Distributed Processing. 1--12.
[124]
Scott Schneider, Martin Hirzel, Bugra Gedik, and Kun-Lung Wu. 2012. Auto-parallelizing stateful distributed streaming applications. In Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques (PACT’12). ACM, New York, NY, 53--64.
[125]
S. Schneider, M. Hirzel, B. Gedik, and K. L. Wu. 2015. Safe data parallelism for general streaming. IEEE Trans. Comput. 64, 2 (Feb. 2015), 504--517.
[126]
Scott Schneider, Joel Wolf, Kirsten Hildrum, Rohit Khandekar, and Kun-Lung Wu. 2016. Dynamic load balancing for ordered data-parallel regions in distributed streaming systems. In Proceedings of the 17th International Middleware Conference (Middleware’16). ACM, New York, NY, 21:1--21:14.
[127]
M. A. Shah, J. M. Hellerstein, Sirish Chandrasekaran, and M. J. Franklin. 2003. Flux: An adaptive partitioning operator for continuous query systems. In Proceedings of the 19th International Conference on Data Engineering (Cat. No. 03CH37405). 25--36.
[128]
Anatoli Shein, Panos Chrysanthis, and Alexandros Labrinidis. 2018. SlickDeque: High Throughput and Low Latency Incremental Sliding-Window Aggregation. In Proceedings of the int. conf. on Extending Data Base Technology (EDBT), Vienna, Austria, March 26--29 (2018), 397--408, openprocessdings.org.
[129]
S. Shevtsov, M. Berekmeri, D. Weyns, and M. Maggio. 2018. Control-theoretical software adaptation: A systematic literature review. IEEE Trans. Softw. Eng. 44, 8 (Aug. 2018), 784--810.
[130]
A. Shukla and Y. Simmhan. 2018. Toward reliable and rapid elasticity for streaming dataflows on clouds. In Proceedings of the IEEE 38th International Conference on Distributed Computing Systems (ICDCS’18). 1096--1106.
[131]
Dawei Sun, Guangyan Zhang, Songlin Yang, Weimin Zheng, Samee U. Khan, and Keqin Li. 2015. Re-stream: Real-time and energy-efficient resource scheduling in big data stream computing environments. Info. Sci. 319 (2015), 92--112.
[132]
SystemS 2019. IBM System S Project website. Retrieved from https://researcher.watson.ibm.com/researcher/view_group_subpage.php?id=2534/.
[133]
Yuzhe Tang and Bugra Gedik. 2013. Autopipelining for data stream processing. IEEE Trans. Parallel Distrib. Syst. 24, 12 (2013), 2344--2354.
[134]
Kanat Tangwongsan, Martin Hirzel, and Scott Schneider. 2017. Low-latency sliding-window aggregation in worst-case constant time. In Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems (DEBS’17). ACM, New York, NY, 66--77.
[135]
Kanat Tangwongsan, Martin Hirzel, Scott Schneider, and Kun-Lung Wu. 2015. General incremental sliding-window aggregation. Proc. VLDB Endow. 8, 7 (Feb. 2015), 702--713.
[136]
William Thies, Michal Karczmarek, and Saman Amarasinghe. 2002. StreamIt: A language for streaming applications. In Compiler Construction. Springer, Berlin, 179--196.
[137]
J. Urbani, A. Margara, C. Jacobs, S. Voulgaris, and H. Bal. 2014. AJIRA: A lightweight distributed middleware for mapreduce and stream processing. In Proceedings of the IEEE 34th International Conference on Distributed Computing Systems. 545--554.
[138]
J. S. v. d. Veen, B. v. d. Waaij, E. Lazovik, W. Wijbrandi, and R. J. Meijer. 2015. Dynamically scaling apache storm for the analysis of streaming data. In Proceedings of the IEEE First International Conference on Big Data Computing Service and Applications. 154--161.
[139]
Uri Verner, Assaf Schuster, and Mark Silberstein. 2011. Processing data streams with hard real-time constraints on heterogeneous systems. In Proceedings of the International Conference on Supercomputing (ICS’11). ACM, New York, NY, 120--129.
[140]
Y. H. Wang, K. Cao, and X. M. Zhang. 2013. Complex event processing over distributed probabilistic event streams. Comput. Math. Appl. 66, 10 (Dec. 2013), 1808--1821.
[141]
D. Warneke and O. Kao. 2011. Exploiting dynamic resource allocation for efficient parallel data processing in the cloud. IEEE Trans. Parallel Distrib. Syst. 22, 6 (June 2011), 985--997.
[142]
Thomas Weigold, Marco Aldinucci, Marco Danelutto, and Vladimir Getov. 2012. Process-driven biometric identification by means of autonomic grid components. Int. J. Auton. Adapt. Commun. Syst. 5, 3 (July 2012), 274--291.
[143]
Joel Wolf, Nikhil Bansal, Kirsten Hildrum, Sujay Parekh, Deepak Rajan, Rohit Wagle, Kun-Lung Wu, and Lisa Fleischer. 2008. SODA: An optimizing scheduler for large-scale stream-based distributed computer systems. In Proceedings of the 9th ACM/IFIP/USENIX International Conference on Middleware (Middleware’08). Springer-Verlag New York, Inc., New York, NY, 306--325. https://dl.acm.org/citation.cfm?id=1496950.1496970
[144]
Louis Woods, Jens Teubner, and Gustavo Alonso. 2010. Complex event detection at wire speed with FPGAs. Proc. VLDB Endow. 3, 1-2 (Sept. 2010), 660--669.
[145]
Eugene Wu, Yanlei Diao, and Shariq Rizvi. 2006. High-performance complex event processing over streams. In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data (SIGMOD’06). ACM, New York, NY, 407--418.
[146]
Sai Wu, Vibhore Kumar, Kun-Lung Wu, and Beng Chin Ooi. 2012. Parallelizing stateful operators in a distributed stream processing system: How, should you and how much? In Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems. ACM, 278--289. https://dl.acm.org/citation.cfm?id=2335515.
[147]
Y. Wu and K. L. Tan. 2015. ChronoStream: Elastic stateful stream computation in the cloud. In Proceedings of the IEEE 31st International Conference on Data Engineering. 723--734.
[148]
Ying Xing, Jeong-Hyon Hwang, Uǧur Çetintemel, and Stan Zdonik. 2006. Providing resiliency to load variations in distributed stream processing. In Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB’06). VLDB Endowment, 775--786. https://dl.acm.org/citation.cfm?id=1182635.1164194
[149]
Jielong Xu, Zhenhua Chen, Jian Tang, and Sen Su. 2014. T-storm: Traffic-aware online scheduling in storm. In Proceedings of the IEEE 34th International Conference on Distributed Computing Systems. 535--544.
[150]
L. Xu, B. Peng, and I. Gupta. 2016. Stela: Enabling stream processing systems to scale-in and scale-out on-demand. In Proceedings of the IEEE International Conference on Cloud Engineering (IC2E’16). 22--31.
[151]
Keiichi Yasumoto, Hirozumi Yamaguchi, and Hiroshi Shigeno. 2016. Survey of real-time processing technologies of iot data streams. J. Info. Process. 24, 2 (2016), 195--202.
[152]
Nikos Zacheilas, Vana Kalogeraki, Nikolas Zygouras, Nikolaos Panagiotou, and Dimitrios Gunopulos. 2015. Elastic complex event processing exploiting prediction. In Proceedings of the IEEE International Conference on Big Data (BigData’15). IEEE, 213--222.
[153]
Nikos Zacheilas, Nikolas Zygouras, Nikolaos Panagiotou, Vana Kalogeraki, and Dimitrios Gunopulos. 2016. Dynamic load balancing techniques for distributed complex event processing systems. In Distributed Applications and Interoperable Systems. Springer, Cham, 174--188. Retrieved from https://link.springer.com/chapter/10.1007/978-3-319-39577-7_14.
[154]
Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, and Ion Stoica. 2013. Discretized streams: Fault-tolerant streaming computation at scale. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP’13). ACM, New York, NY, 423--438.
[155]
E. Zeitler and Tore Risch. 2011. Massive scale-out of expensive continuous queries. In Proceedings of the VLDB Endowment, Vol. 4, No. 11.
[156]
Liang Zheng, Carlee Joe-Wong, Chee Wei Tan, Mung Chiang, and Xinyu Wang. 2015. How to bid the cloud. In Proceedings of the ACM Conference on Special Interest Group on Data Communication (SIGCOMM’15). ACM, New York, NY, 71--84.
[157]
D. Zimmer and R. Unland. 1999. On the semantics of complex events in active database management systems. In Proceedings of the 15th International Conference on Data Engineering. 392--399.
[158]
Nikolas Zygouras, Nikos Zacheilas, Vana Kalogeraki, Dermot Kinane, and Dimitrios Gunopulos. 2015. In Proceedings of the International Conference on Extending Database Technology (EDBT’15). Retrieved from OpenProceedings.org.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Computing Surveys
ACM Computing Surveys  Volume 52, Issue 2
March 2020
770 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3320149
  • Editor:
  • Sartaj Sahni
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 April 2019
Accepted: 01 January 2019
Revised: 01 December 2018
Received: 01 February 2018
Published in CSUR Volume 52, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Stream processing
  2. complex event processing
  3. data stream management system
  4. elasticity
  5. parallelization

Qualifiers

  • Survey
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)151
  • Downloads (Last 6 weeks)20
Reflects downloads up to 26 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)In-Network Management of Parallel Data Streams over Programmable Data Planes2024 IFIP Networking Conference (IFIP Networking)10.23919/IFIPNetworking62109.2024.10619812(50-58)Online publication date: 3-Jun-2024
  • (2024)FlexSP:(1 + β)-Choice based Flexible Stream Partitioning for Stateful OperatorsProceedings of the 53rd International Conference on Parallel Processing10.1145/3673038.3673157(732-741)Online publication date: 12-Aug-2024
  • (2024)DecoPa: Query Decomposition for Parallel Complex Event ProcessingProceedings of the ACM on Management of Data10.1145/36549352:3(1-26)Online publication date: 30-May-2024
  • (2024)StreamBed: Capacity Planning for Stream ProcessingProceedings of the 18th ACM International Conference on Distributed and Event-based Systems10.1145/3629104.3666034(90-102)Online publication date: 24-Jun-2024
  • (2024) Rule based complex event processing for IoT applications: Review, classification and challenges Expert Systems10.1111/exsy.13597Online publication date: 30-Mar-2024
  • (2024)Nona: A Framework for Elastic Stream Provenance2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS)10.1109/ICDCS60910.2024.00071(703-714)Online publication date: 23-Jul-2024
  • (2024)To Migrate or Not to Migrate: An Analysis of Operator Migration in Distributed Stream ProcessingIEEE Communications Surveys & Tutorials10.1109/COMST.2023.333095326:1(670-705)Online publication date: 1-Jan-2024
  • (2024)A computational framework based on the dynamic pipeline approachJournal of Logical and Algebraic Methods in Programming10.1016/j.jlamp.2024.100966139(100966)Online publication date: Jun-2024
  • (2024)Orchestrating scheduling, grouping and parallelism to enhance the performance of distributed stream computing systemExpert Systems with Applications10.1016/j.eswa.2024.124346(124346)Online publication date: Jun-2024
  • (2024)Boosting general-purpose stream processing with reconfigurable hardwareThe Journal of Supercomputing10.1007/s11227-024-05894-480:9(13048-13078)Online publication date: 1-Jun-2024
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media