Improve determinism by preserving split order #22475

gaurav8297 · 2024-06-21T17:47:48Z

Description

This PR improves query execution determinism by preserving split order when scheduling in pipeline execution mode.

This is needed so that splits coming out of CacheSplitSource keep the same order on the coordinator and on the workers. This prevents scheduling two splits with the same CacheSplitId, so that the cache can be reused within a query.
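
As a rough illustration of the idea (not the change made in SplitAssignment.java), the sketch below assumes splits are collected into a set before being handed to a worker. `SplitOrderSketch`, `SketchSplit`, and `collectPreservingOrder` are hypothetical names used only here. An insertion-ordered `LinkedHashSet` keeps the order in which the source produced the splits, whereas a plain `HashSet` gives no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; none of these types are Trino classes.
final class SplitOrderSketch
{
    // Hypothetical stand-in for a scheduled split carrying a cache split id.
    record SketchSplit(long sequenceId, String cacheSplitId) {}

    // Collects splits into an insertion-ordered set, so the iteration order
    // matches the order in which the (assumed) split source produced them.
    static List<SketchSplit> collectPreservingOrder(List<SketchSplit> produced)
    {
        Set<SketchSplit> splits = new LinkedHashSet<>(produced);
        return new ArrayList<>(splits);
    }

    public static void main(String[] args)
    {
        List<SketchSplit> produced = List.of(
                new SketchSplit(2, "b"),
                new SketchSplit(0, "a"),
                new SketchSplit(1, "a"));
        // The consumer sees the splits in the producer's order.
        System.out.println(collectPreservingOrder(produced).equals(produced));
    }
}
```

If the receiving side iterates the same ordered collection, both sides observe the same sequence, which is the property the description above relies on.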

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

starburstdata-automation · 2024-06-21T20:40:12Z

Started benchmark workflow for this PR with test type = iceberg/sf10000_parquet_part.

Building Trino finished with status: success
Benchmark finished with status: failure
Status message:

sopel39

LGTM. Let's see benchmark results

core/trino-main/src/main/java/io/trino/execution/SplitAssignment.java

starburstdata-automation · 2024-06-25T19:05:08Z

Started benchmark workflow for this PR with test type = hive/sf1000_parquet_part.

Building Trino finished with status: success
Benchmark finished with status: success
Status message: No baseline found.
Benchmark Comparison Report

gaurav8297 · 2024-06-26T05:43:33Z

CI issue: #18697 (comment)

starburstdata-automation · 2024-06-28T01:30:37Z

Started benchmark workflow for this PR with test type = iceberg/sf1000_parquet_part.

Building Trino finished with status: success
Benchmark finished with status: success
Status message: NO Regression found.
Benchmark Comparison Report

sopel39 · 2024-07-01T10:52:09Z

core/trino-main/src/main/java/io/trino/execution/SplitAssignment.java

@@ -22,6 +22,8 @@



nit: commit message references stuff not yet landed

cla-bot bot added the cla-signed label Jun 21, 2024

gaurav8297 requested a review from sopel39 June 21, 2024 17:48

gaurav8297 force-pushed the preserve_split_ordering branch from e2567d6 to fc17c6a Compare June 21, 2024 20:37

sopel39 approved these changes Jun 25, 2024

gaurav8297 force-pushed the preserve_split_ordering branch from fc17c6a to 5aff432 Compare June 25, 2024 19:14

gaurav8297 requested a review from sopel39 June 26, 2024 05:50

sopel39 approved these changes Jul 1, 2024

sopel39 reviewed Jul 1, 2024

sopel39 merged commit 452c627 into trinodb:master Jul 1, 2024
95 checks passed

sopel39 added the no-release-notes label (This pull request does not require release notes entry) Jul 1, 2024

github-actions bot added this to the 452 milestone Jul 1, 2024

colebow mentioned this pull request Jul 3, 2024

Add Trino 452 release notes #22573

Open
