Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-21485][sql-client] Simplify the ExecutionContext #15006

Merged
merged 4 commits into from
Mar 12, 2021

Conversation

fsk119
Copy link
Member

@fsk119 fsk119 commented Feb 24, 2021

What is the purpose of the change

This is pr is mainly to refactor the ExecutionContext and SessionContext. We think Executor is a service to execute the commands from different session, SessionContext maintains the users configuration and ExecutionContext is the wrapper of the TableEnvironment. When execute SET or RESET commands, the executor will request the SessionContext to modify the config and rebuild a new TableEnvironment. However, sometimes we need to rollback the origin configuration. Therefore, we introduce a new context named DefaultContext to keep the command line option, YAML environment.

Brief change log

  • Introduce the DefaultContext
  • Refactor ExecutionContext

Verifying this change

This change is already covered by existing tests, such as (please describe tests).

  • Introduce the SessionContextTest to verify the set and reset.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Copy link
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit f6a9d79 (Wed Feb 24 13:59:50 UTC 2021)

Warnings:

  • 1 pom.xml files were touched: Check for build and licensing issues.
  • This pull request references an unassigned Jira ticket. According to the code contribution guide, tickets need to be assigned before starting with the implementation work.

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Feb 24, 2021

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@fsk119 fsk119 force-pushed the FLINK-21485 branch 2 times, most recently from 4cd5563 to 4cc39b3 Compare February 25, 2021 03:48
@fsk119 fsk119 changed the title [Flink-21485][sql-client] Simplify the ExecutionContext [FLINK-21485][sql-client] Simplify the ExecutionContext Feb 25, 2021
@fsk119 fsk119 force-pushed the FLINK-21485 branch 4 times, most recently from 840b4cf to f5721f1 Compare March 4, 2021 12:55
Copy link
Contributor

@godfreyhe godfreyhe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @fsk119 , I left some comments.

import static org.apache.flink.table.api.Expressions.$;

/** Utils to initialize {@link TableEnvironment} from {@link Environment}. */
public class LegacyTableEnvironmentInitializer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will we introduce TableEnvironmentInitializer for ddl ? if that, it's better we could introduce an interface for LegacyTableEnvironmentInitializer and TableEnvironmentInitializer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will not introduce TableEnvironmentInitializer for ddl right now. When useing sql file to init, it will just work like put commands in the terminal.

Comment on lines +103 to +116
for (String key : sessionConfiguration.toMap().keySet()) {
// Don't care the type of the option
ConfigOption<String> keyToDelete = ConfigOptions.key(key).stringType().noDefaultValue();
sessionConfiguration.removeConfig(keyToDelete);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just re-initialize the sessionConfiguration based on defaultContext ?

sessionConfiguration = new Configuration(defaultContext.getFlinkConfig())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reasons is the CatalogManager, FunctionManager use sessionConfiguration . If we rebuild a new Configuration, we can 't modify the values of the option for CatalogManager, FunctionManager .

}

/** Create a new {@link ExecutionContext} without initialization. */
public ExecutionContext(Environment environment, ExecutionContext context) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to remove this method, because the above method is more generic and the initializeTableEnvironmentFromEnvironment method will be called in SessionContext#create.
Actually, we can rename initializeTableEnvironmentFromEnvironment to initializeSessionState, because TableEnvironment instance will change with ExecutionContext

Copy link
Member

@wuchong wuchong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fsk119 , I left some comments.

* multiple queries as long as the session context does not change. This must be thread-safe as it
* might be reused across different query submissions.
*/
public class ExecutionContext {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering ExecutionContext is also not needed anymore. The only thing in ExecutionContext been used by other components is the classloader and the tableEnv. However, IMO, classloader is a more session variable which should be get from SessionContext (e.g. we may support ADD JAR to add jar dependency to the session in the future). Then ExecutionContext would be just a wrapper of TableEnvironment.

If this is the case, what do you think about createing a new TableEnvironment instead of ExecutionContext in SessionContext.?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a TODO for future work.

sessionContext.reset();

Assert.assertEquals(
"100",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the influence scope of reset ? Should it reset all configuration to default value, or rollback to initial value defined in sql-client-defaults.yaml or init.sql?

If it reset to default value, then the expected value of max-table-result-rows should be 1000000.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After offline discussion,

  • reset should reset value to the value set in the yaml.
  • it should reset to the default value if yaml doesn't set the value.
  • it should clear the value if it doesn't have default value.

@fsk119 fsk119 force-pushed the FLINK-21485 branch 2 times, most recently from 64b620a to 83446f3 Compare March 11, 2021 04:18
// TODO: use `table.planner` and `execution.runtime-mode` to configure the TableEnvironment
config.addConfiguration(settings.toConfiguration());

if (environment.getExecution().isStreamingPlanner()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the new introduced table.planner and execution.mode option doesn't take effect here. Currently, the implementation only depends on the SQL Client configuration.

I think we should use table.planner and execution.mode to decide how to create TableEnv.

Besides, please add tests for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. But I think we can leave the work to FLINK-21462. It will do the work about Configuration and Yaml compatibility.

Copy link
Member

@wuchong wuchong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@fsk119
Copy link
Member Author

fsk119 commented Mar 12, 2021

The failed test is because of the network and it passes the test in the test before.

@wuchong wuchong merged commit c998ba4 into apache:master Mar 12, 2021
@fsk119 fsk119 deleted the FLINK-21485 branch November 22, 2021 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants