Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-21300][docs] Update module-related documentation #15230

Merged
merged 3 commits into from
Mar 23, 2021

Conversation

LadyForest
Copy link
Contributor

@LadyForest LadyForest commented Mar 16, 2021

What is the purpose of the change

This pull request is a subtask of FLINK-21045 to update the docs on module-related changes.

Brief change log

  • Update the overview.md, add links to LOAD and UNLOAD statement, update the reserved keywords(add MODULES, the reason is [FLINK-21298][table] Support 'USE MODULES' syntax both in SQL parser, TableEnvironment and SQL CLI #15005 (comment)).
  • Add load.md and unload.md to describe LOAD MODULE and UNLOAD MODULE usage.
  • Update the use.md (add description of USE MODULES) and show.md(add description of SHOW MODULES/SHOW FULL MODULES)
  • Update the modules.md, add the description on module lifecycle, resolution order change etc.
  • Update Python doctable_envirronment.md, add links to use_modules and list_full_modules
  • Update related Chinese doc(without translation because I think it's better to make it a separate task to ease review)

Verifying this change

This change can be verified after rebuilding docs. I've updated myself to adapt FLINK-21774 Please note that at this moment, the output of SHOW [FULL] MODULES stills keeps table schema even if the output is empty, this should be updated together with FLINK-21774.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Copy link
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 569104d (Tue Mar 16 08:09:29 UTC 2021)

✅no warnings

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@LadyForest
Copy link
Contributor Author

I'm also fine with updating myself if #15213 gets merged firstly.

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 16, 2021

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@LadyForest
Copy link
Contributor Author

Hi @wuchong, could you mind taking a look at this PR?

@LadyForest
Copy link
Contributor Author

I'm also fine with updating myself if #15213 gets merged firstly.

done

Comment on lines 121 to 126
-- +-------------+
-- | module name |
-- +-------------+
-- | core |
-- | hive |
-- +-------------+
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no -- prefix printed in SQL Client.


The following grammar gives an overview of the available syntax:
```sql
LOAD MODULE `module_name` [WITH ('key1' = 'val1', 'key2' = 'val2', ...)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we explain what's the key1 val1 properties?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides, users don't need always escape the module_name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we explain what's the key1 val1 properties?

Yes, I think we'd better add a description to let users know how to design their own module and module factory impl.

[INFO] Unload module succeeded!

Flink SQL> SHOW MODULES;
-- Empty set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.


The following grammar gives an overview of the available syntax:
```sql
UNLOAD MODULE `module_name`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users don't need always escape the module_name. UNLOAD MODULE module_name is fine.

@@ -215,6 +251,11 @@ USE CATALOG catalog_name

设置当前的 catalog。所有后续命令未显式指定 catalog 的将使用此 catalog。如果指定的的 catalog 不存在,则抛出异常。默认的当前 catalog 是 `default_catalog`。

## USE MODULES
```sql
USE MODULES `module_name1`[, `module_name2`, ...]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.


展示当前环境激活的所有 module。

## SHOW FULL MODULES
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to merge SHOW FULL MODUELS into SHOW MODULES, just like what we do for SHOW [USER] FUNCTIONS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @LadyForest for the detailed documentation. It already looks in a good shape. I only left some minor comments.

Many thanks to @wuchong for the thorough review. I'll address your comments right now.

@@ -38,6 +38,8 @@ SHOW 语句用于列出所有的 catalog,或者列出当前 catalog 中所有
- SHOW TABLES
- SHOW VIEWS
- SHOW FUNCTIONS
- SHOW MODULES
- SHOW FULL MODULES
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

Comment on lines 121 to 126
-- +-------------+
-- | module name |
-- +-------------+
-- | core |
-- | hive |
-- +-------------+
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.


The following grammar gives an overview of the available syntax:
```sql
LOAD MODULE `module_name` [WITH ('key1' = 'val1', 'key2' = 'val2', ...)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.


Objects provided by modules are considered part of Flink's system (built-in) objects; thus, they don't have any namespaces.
A module can be loaded, enabled, disabled and unloaded. When TableEnvironment loads a module initially, it enables the module by default. Flink supports multiple modules and keeps track of the loading order to resolve metadata. *E.g.*, when there are two functions of the same name residing in two modules, Flink always resolves the function reference to the one in the 1st enabled module.
Copy link
Member

@wuchong wuchong Mar 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to the one in the 1st enabled module.

This sounds we always resolve function by only looking up the first module which is incorrect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to the one in the 1st enabled module.

This sounds we always resolve function by only looking up the first module which is incorrect.

Yes, it's somewhat misleading. While this is the original statement before I made this change. I'll do the rewrite to make it more accurate.

@wuchong
Copy link
Member

wuchong commented Mar 20, 2021

Thanks @LadyForest for the detailed documentation. It already looks in a good shape. I only left some minor comments.

@LadyForest
Copy link
Contributor Author

Hi @wuchong, I updated this PR according to your suggestions. Would you mind taking another look when you're free?

Copy link
Member

@wuchong wuchong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@wuchong wuchong merged commit 15b5697 into apache:master Mar 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants