[New Scheduler] Run scheduler #5194
Conversation
I am trying to deploy this version to a distributed environment and add a how-to-deploy document.
receive-buffer-size = 3151796b
maximum-frame-size = 3151796b
remote {
  artery {
Since the Akka version was upgraded, remoting now uses Artery rather than the deprecated Netty-based classic transport.
https://doc.akka.io/docs/akka/current/remoting.html#classic-remoting-deprecated-
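For context, here is a minimal sketch of what enabling Artery looks like when the ActorSystem is built; the hostname, port, and system name are illustrative assumptions, not values from this PR:

```scala
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// Sketch only: Artery replaces the deprecated Netty-based classic remoting.
// The canonical hostname/port must match the address peers use to reach this node.
object ArterySketch extends App {
  val config = ConfigFactory.parseString("""
    akka {
      actor.provider = cluster
      remote.artery {
        canonical.hostname = "127.0.0.1" // illustrative value
        canonical.port = 25520           // Artery's default port
      }
    }
  """).withFallback(ConfigFactory.load())

  val system = ActorSystem("scheduler-actor-system", config)
}
```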
BasicHttpService.startHttpService(FPCSchedulerServer.instance(scheduler).route, port, httpsConfig)(actorSystem)
Http()
  .newServerAt("0.0.0.0", port = rpcPort)
  .bind(scheduler.serviceHandlers)
This binds the gRPC port, which invokers use to communicate with schedulers to fetch activations.
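As a condensed sketch of that binding (assuming akka-http 10.2+; the stub handler below stands in for the akka-grpc generated ActivationService handlers and is an assumption for illustration):

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpRequest, HttpResponse, StatusCodes}
import scala.concurrent.Future

object GrpcBindSketch extends App {
  implicit val system: ActorSystem = ActorSystem("scheduler")

  val rpcPort = 13121 // illustrative; the real port comes from configuration

  // Stub in place of the akka-grpc generated handlers that invokers call
  // to fetch activations (assumption for illustration).
  val serviceHandlers: HttpRequest => Future[HttpResponse] =
    _ => Future.successful(HttpResponse(StatusCodes.NotFound))

  // gRPC runs over HTTP/2, so akka-http's newServerAt can bind it directly,
  // on a separate port from the REST API.
  Http().newServerAt("0.0.0.0", port = rpcPort).bind(serviceHandlers)
}
```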
vars:
  akka_env:
    "CONFIG_akka_actor_provider": "{{ scheduler.akka.provider }}"
    "CONFIG_akka_remote_artery_canonical_hostname":
These are required to specify the Artery configuration.
"JMX_REMOTE": "{{ jmx.enabled }}" | ||
"PORT": "8080" | ||
|
||
"WHISK_SCHEDULER_ENDPOINTS_HOST": "{{ ansible_host }}" |
These three values are used to put a scheduler endpoint into etcd so that other components can refer to it.
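For illustration, a sketch of that registration assuming a jetcd-style client; the key layout and JSON fields below are assumptions, not the scheduler's actual schema:

```scala
import io.etcd.jetcd.{ByteSequence, Client}
import java.nio.charset.StandardCharsets.UTF_8

object EndpointRegistrationSketch extends App {
  // Hypothetical key/value layout; the real keys live in the scheduler's etcd code.
  val client = Client.builder().endpoints("http://etcd-host:2379").build()
  val key    = ByteSequence.from("whisk/schedulers/scheduler0", UTF_8)
  val value  = ByteSequence.from("""{"host":"10.0.0.5","rpcPort":13121,"akkaPort":25520}""", UTF_8)

  // Other components (controllers, invokers) read this entry to discover the scheduler.
  client.getKVClient.put(key, value).get()
}
```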
ansible/group_vars/all
@@ -435,8 +442,9 @@ metrics:
user_events: "{{ user_events_enabled | default(false) | lower }}"
durationChecker:
  timeWindow: "{{ duration_checker_time_window | default('1 d') }}"
zerodowntimeDeployment: |
This configuration is used to deploy all components without any downtime in our downstream.
Since there are no corresponding ansible steps on the controller and invoker sides, it would not work yet, but I kept it as is.
We can add the required changes in the future.
Suggested change:
- zerodowntimeDeployment:
+ zeroDowntimeDeployment:
How complicated are the required changes? And how exactly does the zero-downtime deployment work?
I just want to focus on running the new scheduler in this PR.
I believe it's not that difficult.
The procedure is like this:
1. Disable some invokers.
2. Check if any activations are running on them.
3. If yes, wait until they complete. If not, redeploy the invokers.
4. Disable some schedulers.
5. Check if any activations are in the schedulers.
6. If yes, wait until they are forwarded to other schedulers. If not, redeploy the schedulers.
7. Remove some controllers from Nginx.
8. Disable the controllers.
9. Wait until all activations finish.
10. Redeploy the controllers.
11. Repeat steps 1-10 until most of the components are redeployed.
12. For the last controller, scheduler, and invoker, change the order to controller -> scheduler -> invoker.
@@ -40,6 +40,10 @@ dependencies {
  compile "com.typesafe.akka:akka-actor_${gradle.scala.depVersion}:${gradle.akka.version}"
  compile "com.typesafe.akka:akka-stream_${gradle.scala.depVersion}:${gradle.akka.version}"
  compile "com.typesafe.akka:akka-slf4j_${gradle.scala.depVersion}:${gradle.akka.version}"
+ compile "com.typesafe.akka:akka-cluster_${gradle.scala.depVersion}:${gradle.akka.version}"
One thing to clarify here: in our (downstream) production system, we are still sticking to Akka 2.5.26 and the Netty-based transport for akka-remote, so there can be some differences.
queueManager:
  maxSchedulingTime: "{{ scheduler_maxSchedulingTime | default('20 second') }}"
  maxRetriesToGetQueue: "{{ scheduler_maxRetriesToGetQueue | default(13) }}"
queue:
  # the queue's Running-state timeout: if no activation arrives while the queue is Running,
  # its state changes from Running to Idle and the decision-algorithm actor is deleted
  idleGrace: "{{ scheduler_queue_idleGrace | default('20 seconds') }}"
I thought you said the default was to remove an idle queue after 24 hours?
I think 24 hours does not fit all cases.
I mentioned it because we are using 24 hours.
I am OK with changing the default value, but I believe each downstream will choose a proper timeout other than the default anyway.
The default configuration here means a queue becomes idle after 20 seconds and is terminated after another 20 seconds.
In the idle state, the queue does not run any decision-making to add containers, and it goes back to running whenever a new activation arrives.
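A toy model of that lifecycle: idleGrace comes from the diff above, while stopGrace is an assumed name for the second grace period; the actual implementation is an actor-based state machine in the scheduler, so this is only a sketch:

```scala
import scala.concurrent.duration._

object QueueLifecycleSketch {
  sealed trait QueueState
  case object Running extends QueueState // decision-making active, may add containers
  case object Idle    extends QueueState // no decision-making; revived by a new activation
  case object Removed extends QueueState

  // With the defaults discussed above: Idle after 20s without activations,
  // Removed after roughly another 20s of staying idle.
  val idleGrace: FiniteDuration = 20.seconds
  val stopGrace: FiniteDuration = 20.seconds

  def next(state: QueueState, sinceLastActivation: FiniteDuration): QueueState =
    state match {
      case Running if sinceLastActivation >= idleGrace          => Idle
      case Idle if sinceLastActivation >= idleGrace + stopGrace => Removed
      case other                                                => other
    }
}
```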
"LIMITS_ACTIONS_SEQUENCE_MAXLENGTH": "{{ limits.sequenceMaxLength }}" | ||
|
||
"CONFIG_whisk_couchdb_protocol": "{{ db.protocol }}" | ||
"CONFIG_whisk_couchdb_host": "{{ db.host }}" |
I can't remember what the scheduler needs to communicate with CouchDB for, if anything?
Going through the code, I see the scheduler has to get the concurrentInvocations limit to track throttling. Also, the scheduler sometimes fails an activation while it is still queued, in which case it writes the activation record and acks to the controller before ever sending it to an invoker.
"{{ kafka.ssl.keystore.password }}" | ||
"ZOOKEEPER_HOSTS": "{{ zookeeper_connect_string }}" | ||
|
||
"LIMITS_ACTIONS_INVOKES_PERMINUTE": "{{ limits.invocationsPerMinute }}" |
Do these throttling limits need to be configured for the scheduler component?
- name: populate environment variables for scheduler
  set_fact:
    env:
I feel like there are a lot of configs in here not used by the scheduler, but I could be wrong. It's just confusing to tell what's actually used by the scheduler and what is copy-paste from other components that doesn't get used.
I will review all configurations again.
I have reviewed all configurations and removed unnecessary things.
One thing I am uncertain about is this one.
c2e57c1#diff-168a0313e0b46601cf796cd58db3a422a426015b91de821de1951bc88c62db7cR186
It is also being used in a controller, but I couldn't find any reference.
https://github.com/apache/openwhisk/blob/master/ansible/roles/controller/tasks/deploy.yml#L234
@bdoyle0182 Do you have any idea?
Thanks for taking the time to update this; it will make my life easier digesting what we need to set up in config, since we don't use ansible.
@@ -197,6 +209,11 @@ whisk {
    retention-bytes = 1073741824
    retention-ms = 3600000
  }
  actions {
What produces to and consumes from this topic? And what is produced?
Edit: looking at Scheduler.scala on line 332, it seems like this config is just used as the configuration for the scheduler[idx] topics, not for actually created topics called actions. With that said, should this config be renamed?
Yes, this configuration is for the schedulerN topics, and these topics are used to forward activations from controllers to schedulers.
I feel it would be better to rename it to something like activations or schedulers.
@@ -46,7 +46,8 @@ class FPCPoolBalancer(config: WhiskConfig,
  extends LoadBalancer {

  private implicit val executionContext: ExecutionContext = actorSystem.dispatcher
- private implicit val requestTimeout: Timeout = Timeout(5.seconds)
+ // This value is given according to the total waiting time at QueueManager for a new queue to be created.
+ private implicit val requestTimeout: Timeout = Timeout(8.seconds)
Was this raised because it took longer than 5 seconds to create for you? If so, should it be raised quite a bit higher?
I just increased this to align it with the retry timeout at the queue manager.
https://github.com/apache/openwhisk/blob/master/core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/QueueManager.scala#L421
In the default configuration, the queue manager waits around 8 seconds for a new queue to be created.
This value should not be too high, because if a target scheduler becomes unresponsive, the controller should try to create a queue on another healthy scheduler.
This change might need to be reverted, because controllers would retry creating queues after 8 seconds, but the activations would already have been dropped by the queue manager.
I quickly ran benchmarks with the new scheduler and got the following results.

Test environments:

I invoked a different number of actions as follows: 100 actions and 1 action. There are some differences between the two. In our downstream version, I could observe around 14,000 TPS in the same environment for both cases (1 action / 100 actions). In the upstream version, as you can see, it shows more TPS in the 100-actions case while it shows poor performance in the 1-action case. I feel there is still room for improvement, especially in terms of performance. I will also run the same benchmark with the old scheduler and compare the performance.
Force-pushed from b7db9fa to 55e5f73.
Force-pushed from 55e5f73 to c2e57c1.
@jiangpengcheng @ningyougang
Could you cross-check the unnecessary configurations?
I could run the scheduler without those configurations.
It's ready for review.
I'm not sure about the config you asked me about. As far as I know, we don't use it anywhere. My guess was that it was a prefix for finding runtimes in your registry based on what's defined in your manifest. So maybe it's needed in both? Or it's an outdated config that isn't even used anymore.
**common/scala/src/main/resources**
```
whisk.spi {
  ArtifactStoreProvider = org.apache.openwhisk.core.database.CouchDbStoreProvider
```
MongoDB is technically supported now too for this, right? It's just a documentation example, so not a big deal.
Yes, I think so; there is no specific dependency on CouchDB.
But I haven't tried with MongoDB.
**common/scala/src/main/resources**
```
whisk.spi {
```
Shouldn't a provider for the scheduler DB be required here, i.e. an ETCDProvider?
Do we have an ETCDProvider?
If we need to use a key-value store other than etcd, we can introduce that layer too.
But for now, I believe the code is sticking with etcd.
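To make that concrete, here is a hypothetical sketch of what such a layer could look like; the trait name, method signature, and provider class path are assumptions for illustration, not an existing OpenWhisk API:

```scala
import org.apache.openwhisk.spi.Spi

// Hypothetical: a pluggable key-value store SPI for the scheduler. Today the
// code talks to etcd directly, so no such provider exists yet.
trait KeyValueStoreProvider extends Spi {
  // Signature is an assumption; it would return a client for the configured store.
  def newClient(endpoints: Seq[String]): AutoCloseable
}

// It would then be wired up like other SPIs, e.g. in reference.conf:
//   whisk.spi {
//     KeyValueStoreProvider = org.apache.openwhisk.core.etcd.EtcdStoreProvider
//   }
```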
You can enable the new scheduler of OpenWhisk.
It will run one more component called "scheduler" and ETCD.
#### Configure service providers for the scheduler |
Will there be any guide on setting up etcd?
If you enable the scheduler by configuring enable_scheduler: true, etcd will be automatically deployed by this:
https://github.com/apache/openwhisk/pull/5194/files#diff-2356bb62c87e471ef37b7973eb51e82282ef1131ee7ab4b62d909102de96967cR23
Codecov Report
@@ Coverage Diff @@
## master #5194 +/- ##
===========================================
+ Coverage 44.49% 72.59% +28.10%
===========================================
Files 237 238 +1
Lines 13922 13937 +15
Branches 593 581 -12
===========================================
+ Hits 6194 10117 +3923
+ Misses 7728 3820 -3908
Continue to review full report at Codecov.
I checked that
timeWindow: "{{ duration_checker_time_window | default('1 d') }}"
zeroDowntimeDeployment:
  enabled: "{{ zerodowntime_deployment_switch | default(true) }}"
  solution: "{{ zerodowntime_deployment_solution | default('apicall') }}"
It seems apicall is not readable here; how about using scroll or another readable word? Anyway, it means "scroll to deploy".
Another deployment solution is half, which means a small blue/green deployment solution.
Currently, this configuration does not take effect due to some missing parts.
I believe we can address what you mentioned in a subsequent PR when we introduce the feature for zero-downtime deployment.
Got it.
I will merge this PR in 24 hours.
elastic_index_pattern: <your elasticsearch index pattern>
elastic_base_volume: <your elasticsearch volume directory>
elastic_username: <your elasticsearch username>
elastic_password: <your elasticsearch password>
Can we use an external Elasticsearch?
Yes.
It's exactly the same as the way CouchDB works with OW.
LGTM, just with 2 small questions.
Description
This PR includes changes to run the new scheduler.
I could run the scheduler on macOS with these changes.
My changes affect the following components
Types of changes
Checklist: