DOI: 10.1145/2988336.2988347

Online Scheduling for Shuffle Grouping in Distributed Stream Processing Systems

Published: 28 November 2016

Abstract

Shuffle grouping is a technique used by stream processing frameworks to share input load among parallel instances of stateless operators. With shuffle grouping, each tuple of a stream can be assigned to any available operator instance, independently of any previous assignment. A common way to implement shuffle grouping is to adopt a Round-Robin policy, a simple solution that fares well as long as the execution time is almost the same for all tuples. However, this assumption rarely holds in real cases, where execution time strongly depends on tuple content. As a consequence, parallel stateless operators within stream processing applications may experience unpredictable load imbalance that, in the end, causes an undesirable increase in tuple completion times. In this paper we propose Online Shuffle Grouping (OSG), a novel approach to shuffle grouping aimed at reducing the overall tuple completion time. OSG estimates the execution time of each tuple, enabling proactive, online scheduling of input load to the target operator instances. Sketches are used to efficiently store the otherwise large amount of information required to schedule incoming load. We provide a probabilistic analysis of OSG and illustrate, through both simulations and a running prototype, its impact on stream processing applications.
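
The contrast drawn in the abstract between Round-Robin and estimate-driven scheduling can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the OSG algorithm from the paper: the class `ExecTimeSketch`, the functions `round_robin` and `schedule_with_estimates`, the greedy "least estimated load" rule, the default estimate of 1.0 for unseen keys, and the two-key workload are all hypothetical choices made for illustration. The sketch only borrows the general idea of a count-min-style summary that records how long tuples with a given content took to execute.

```python
import hashlib


class ExecTimeSketch:
    """Count-min-style summary of execution time per tuple key (illustrative)."""

    def __init__(self, rows: int = 4, cols: int = 64) -> None:
        self.rows, self.cols = rows, cols
        self.count = [[0] * cols for _ in range(rows)]    # tuples seen per cell
        self.total = [[0.0] * cols for _ in range(rows)]  # summed time per cell

    def _col(self, row: int, key: str) -> int:
        # One hash function per row, derived by salting the key with the row index.
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.cols

    def update(self, key: str, exec_time: float) -> None:
        for r in range(self.rows):
            c = self._col(r, key)
            self.count[r][c] += 1
            self.total[r][c] += exec_time

    def estimate(self, key: str, default: float = 1.0) -> float:
        # Use the row whose cell has the fewest entries (least polluted by
        # collisions) and return the mean execution time stored there.
        best_count, best_total = None, 0.0
        for r in range(self.rows):
            c = self._col(r, key)
            n = self.count[r][c]
            if n == 0:
                return default  # key never seen: fall back to an assumed default
            if best_count is None or n < best_count:
                best_count, best_total = n, self.total[r][c]
        return best_total / best_count


def round_robin(stream, costs, k):
    """Return the actual work accumulated per instance under Round-Robin."""
    loads = [0.0] * k
    for i, key in enumerate(stream):
        loads[i % k] += costs[key]
    return loads


def schedule_with_estimates(stream, costs, k, sketch):
    """Greedy rule: send each tuple to the instance with the least estimated load."""
    loads = [0.0] * k      # actual work per instance
    estimated = [0.0] * k  # the scheduler's view, built from sketch estimates
    for key in stream:
        target = min(range(k), key=estimated.__getitem__)
        estimated[target] += sketch.estimate(key)
        loads[target] += costs[key]
        sketch.update(key, costs[key])  # feedback once the true time is known
    return loads


# Hypothetical workload: heavy and light tuples strictly alternate, so
# Round-Robin with two instances sends every heavy tuple to the same instance.
costs = {"heavy": 10.0, "light": 1.0}
stream = ["heavy", "light"] * 50
rr = round_robin(stream, costs, 2)
est = schedule_with_estimates(stream, costs, 2, ExecTimeSketch())
print("max load, round-robin:", max(rr))
print("max load, estimate-driven:", max(est))
```

In this simulation the sketch is updated immediately after each assignment, which stands in for feedback that a real system would receive from operator instances with some delay.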

Published In

Middleware '16: Proceedings of the 17th International Middleware Conference
November 2016
280 pages
ISBN: 9781450343008
DOI: 10.1145/2988336

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Data Streaming
  2. Shuffle Grouping
  3. Stream Processing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

Middleware '16
Sponsor:
  • ACM
  • USENIX Assoc

Acceptance Rates

Overall Acceptance Rate 203 of 948 submissions, 21%
