-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[VL] Make jni debug workspace configurable #6228
Conversation
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/apache/incubator-gluten/issues Then could you also rename commit message and pull request title in the following format?
See also: |
Run Gluten Clickhouse CI |
@@ -75,14 +75,14 @@ private static JniWorkspace createDefault() { | |||
} | |||
} | |||
|
|||
public static void enableDebug() { | |||
public static void enableDebug(String debugDir) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if debugDir
changed by user in session?
We either handle the case or just raise an error (by making the option static or adding assertion in code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Yohahaha Let's still add an assertion to make sure the function doesn't get call twice with different debugDir
values. In case the method is used in test code or somewhere. Thanks.
That may involve adding another global variable but it could be worth it I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so you think DEBUG_INSTANCE
will be initialized twice in one process? if yes, we should refactor jni initialization first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so you think DEBUG_INSTANCE will be initialized twice in one process
Nope. Just a protection in this method assuming unpredictable caller code. E.g., One could call enableDebug(...)
for test purpose. We'd throw something otherwise the second call will be silently swallowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this functionality should only be called once by design from my understanding, if someone want call it twice, I think they dont know that they did. We dont really need protect anything.
Related conf already be static and enableDebug
has ensure only initialize DEBUG_INSTANCE
once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they dont know that they did. We dont really need protect anything.
I think anti cases are appearing easier than we think it could be.
For example, if someone in test code, calling JniWorkspace.enableDebug("folder1")
for debug purpose, however in somewhere a Spark config spark.gluten.sql.debug.keepJniWorkspaceDir=folder2
was already set for some reason.
Then he could go mad because the files were not found in folder1 and the console doesn't tell him anything about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a debug conf, I feel that we don't need to provide excessive protection for the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enableDebug("folder1")
should just enable debug in folder folder1
. If there's potential that it doesn't use folder1
, please give user/caller a note.
I don't think it's excessive protection. Next time please close comments like this one before merging a PR.
val DEBUG_KEEP_JNI_WORKSPACE_DIR = | ||
buildConf(GLUTEN_DEBUG_KEEP_JNI_WORKSPACE_DIR) | ||
.internal() | ||
.stringConf | ||
.createWithDefault("/tmp") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can make the conf static, as well as DEBUG_KEEP_JNI_WORKSPACE
. Thanks.
Run Gluten Clickhouse CI |
1 similar comment
Run Gluten Clickhouse CI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you for adding this configuration, it is very helpful for troubleshooting.
===== Performance report for TPCDS SF2000 with Velox backend, for reference only ====
|
What changes were proposed in this pull request?
Add new config
spark.gluten.sql.debug.keepJniWorkspaceDir
to store libraries, and decouple the functionality fromspark.gluten.sql.debug