TurBO: A cost-efficient configuration-based auto-tuning approach for cluster-based big data frameworks

Dou et al., 2023

Document ID
7283273284763743357
Author
Dou H
Zhang L
Zhang Y
Chen P
Zheng Z
Publication year
2023
Publication venue
Journal of Parallel and Distributed Computing

Snippet

Big data processing frameworks such as Spark usually provide a large number of performance-related configuration parameters. How to auto-tune these parameters for better performance has been a hot topic in both academia and industry for years. Through …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386 Retrieval requests
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30289 Database design, administration or maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G06F17/5009 Computer-aided design using simulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformations of program code
    • G06F8/41 Compilation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
    • G06Q10/063 Operations research or analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/02 Knowledge representation
    • G06N5/022 Knowledge engineering, knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computer systems based on specific mathematical models
    • G06N7/005 Probabilistic networks

Similar Documents

Publication Publication Date Title
Salloum et al. Big data analytics on Apache Spark
Marathe et al. Performance modeling under resource constraints using deep transfer learning
Siegmund et al. SPL Conqueror: Toward optimization of non-functional properties in software product lines
Chen et al. Temporal dependency-based checkpoint selection for dynamic verification of temporal constraints in scientific workflow systems
Cheng et al. Efficient performance prediction for apache spark
US10146531B2 (en) Method and apparatus for generating a refactored code
Bei et al. Configuring in-memory cluster computing using random forest
Laptev et al. Very fast estimation for result and accuracy of big data analytics: The EARL system
Prats et al. You only run once: spark auto-tuning from a single run
Kaur et al. An empirical study of software entropy based bug prediction using machine learning
Fekry et al. Tuneful: An online significance-aware configuration tuner for big data analytics
WO2021130040A1 (en) Controlling a quantum computing device based on predicted operation time
US11829842B2 (en) Enhanced quantum circuit execution in a quantum service
Li et al. Towards general and efficient online tuning for spark
Dou et al. TurBO: A cost-efficient configuration-based auto-tuning approach for cluster-based big data frameworks
Rodrigues et al. Screening hardware and volume factors in distributed machine learning algorithms on spark: A design of experiments (doe) based approach
Qin et al. Auxiliary Gibbs Sampling for Inference in Piecewise-Constant Conditional Intensity Models.
Rejitha et al. Energy prediction of CUDA application instances using dynamic regression models
Hao et al. Predicting QoS of virtual machines via Bayesian network with XGboost-induced classes
Chen et al. SimCost: cost-effective resource provision prediction and recommendation for spark workloads
Sindhu et al. Workload characterization and synthesis for cloud using generative stochastic processes
Choi et al. Optimizing Numerical Weather Prediction Model Performance using Machine Learning Techniques
Sagaama et al. Automatic parameter tuning for big data pipelines with deep reinforcement learning
Sharma et al. An empirical evaluation of cross project priority prediction
Aseman-Manzar et al. Cost-Aware Resource Recommendation for DAG-Based Big Data Workflows: An Apache Spark Case Study