-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-9964] Setting --workerCacheMB for Streaming Pipeline #11710
Conversation
…ateCache constructor. Right now, this is hardcoded at 100MB
@@ -130,7 +133,8 @@ private static StateNamespace triggerNamespace(long start, int triggerIdx) { | |||
|
|||
@Before | |||
public void setUp() { | |||
cache = new WindmillStateCache(); | |||
options = PipelineOptionsFactory.as(DataflowWorkerHarnessOptions.class); | |||
cache = new WindmillStateCache(options.getWorkerCacheMb()); | |||
assertEquals(0, cache.getWeight()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a check to this test to make sure that the maximumWeight
of the cache is the 100 MB? (perhaps use a number different than 100 to be sure).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this by adding a new Test method in WindmillStateCacheTest class. I created a new getter in the WindmillStateCache to retrieve the size of max weight on bytes, and compared it to the initial value set
ahh this is great. We've been running a similar patch in our fork forever. |
Feel free to submit patches upstream |
heh, I've been choosing my battles ;) |
These shouldn't need to be battles but more like:
|
heh, I feel bad clogging up this PR with unrelated conversations. If that process you described was how it worked in real life, that'd be great. Feel free to ping me on the ASF slack (at-steve) if you want to chat more about this. |
…er pabloem's comment
retest this please |
New changes passed ./gradlew -p runners/google-cloud-dataflow-java check on my computer |
retest this please |
1 similar comment
retest this please |
Run Java PostCommit |
1 similar comment
Run Java PostCommit |
Seems like java precommits are broken on master - but this change LGTM. I'll wait for precommits to be fixed if possible. Thanks @omarismail94 ! |
Run Java PreCommit |
2 similar comments
Run Java PreCommit |
Run Java PreCommit |
thanks @omarismail94 ! |
hm, I was just rebasing my work against this commit and realized something. I had moved the flag to This means that you can't actually set this flag from a job submission, eg:
|
R:@pabloem
Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to make it allowable to change the cache value in Streaming when setting -workerCacheMB.
Passed ./gradlew -p runners/google-cloud-dataflow-java check on my computer
[1]
.maximumWeight(100000000 /* 100 MB */)
beam/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java
Line 73 in 5e659bb
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).[BEAM-XXX] Fixes bug in ApproximateQuantiles
, where you replaceBEAM-XXX
with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
Post-Commit Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.