
[New Scheduler] Run scheduler #5194

Merged
merged 12 commits into from Feb 11, 2022

Conversation

style95
Member

@style95 style95 commented Jan 18, 2022

Description

This PR includes the changes needed to run the new scheduler.
With these changes, I was able to run the scheduler on macOS.

My changes affect the following components:

  • API
  • Controller
  • Message Bus (e.g., Kafka)
  • Loadbalancer
  • Scheduler
  • Invoker
  • Intrinsic actions (e.g., sequences, conductors)
  • Data stores (e.g., CouchDB)
  • Tests
  • Deployment
  • CLI
  • General tooling
  • Documentation

Types of changes

  • Bug fix (generally a non-breaking change which closes an issue).
  • Enhancement or new feature (adds new functionality).
  • Breaking change (a bug fix or enhancement which changes existing behavior).

Checklist:

  • I signed an Apache CLA.
  • I reviewed the style guides and followed the recommendations (Travis CI will check :).
  • I added tests to cover my changes.
  • My changes require further changes to the documentation.
  • I updated the documentation where necessary.

@style95 style95 changed the title Run scheduler [New Scheduler] Run scheduler Jan 18, 2022
@style95 style95 added the wip label Jan 18, 2022
@style95
Member Author

style95 commented Jan 18, 2022

I am trying to deploy this version to a distributed environment and add a how-to-deploy document.

remote {
  artery {
    maximum-frame-size = 3151796b
    receive-buffer-size = 3151796b
Member Author

Since the Akka version was upgraded, remoting now uses Artery rather than the deprecated Netty-based classic transport.
https://doc.akka.io/docs/akka/current/remoting.html#classic-remoting-deprecated-
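
For anyone following along, here is a minimal, self-contained sketch of turning Artery on (assuming Akka 2.6; only the frame size mirrors the value above, everything else is illustrative):

```scala
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

object ArterySketch extends App {
  // Illustrative Artery settings; only maximum-frame-size mirrors the PR's value.
  val config = ConfigFactory.parseString("""
    akka {
      actor.provider = cluster
      remote.artery {
        enabled = on
        transport = tcp
        canonical.hostname = "127.0.0.1"
        canonical.port = 25520
        advanced.maximum-frame-size = 3151796b
      }
    }
  """).withFallback(ConfigFactory.load())

  val system = ActorSystem("artery-sketch", config)
}
```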

BasicHttpService.startHttpService(FPCSchedulerServer.instance(scheduler).route, port, httpsConfig)(actorSystem)
Http()
  .newServerAt("0.0.0.0", port = rpcPort)
  .bind(scheduler.serviceHandlers)
Member Author

This binds the gRPC port that invokers use to communicate with schedulers to fetch activations.
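
The shape of that binding, as a self-contained sketch — the real code binds the Akka gRPC generated serviceHandlers, so the hand-written handler and the port below are stand-ins:

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpRequest, HttpResponse}
import scala.concurrent.Future

object RpcBindSketch extends App {
  implicit val system: ActorSystem = ActorSystem("rpc-bind-sketch")

  // Stand-in for scheduler.serviceHandlers (the generated gRPC handlers).
  val handler: HttpRequest => Future[HttpResponse] =
    _ => Future.successful(HttpResponse(entity = "ActivationService placeholder"))

  Http().newServerAt("0.0.0.0", port = 13001).bind(handler)
}
```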

vars:
  akka_env:
    "CONFIG_akka_actor_provider": "{{ scheduler.akka.provider }}"
    "CONFIG_akka_remote_artery_canonical_hostname":
Member Author

These are required to specify the Artery configuration.

"JMX_REMOTE": "{{ jmx.enabled }}"
"PORT": "8080"

"WHISK_SCHEDULER_ENDPOINTS_HOST": "{{ ansible_host }}"
Member Author

These three values are used to register a scheduler endpoint in etcd so that other components can refer to it.
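
As an illustration of that registration (using the plain jetcd client rather than OpenWhisk's own EtcdClient wrapper; the key layout and TTL are hypothetical):

```scala
import java.nio.charset.StandardCharsets.UTF_8
import io.etcd.jetcd.{ByteSequence, Client}
import io.etcd.jetcd.options.PutOption

object RegisterSchedulerEndpointSketch extends App {
  val client = Client.builder().endpoints("http://127.0.0.1:2379").build()

  // A lease makes the endpoint disappear automatically if the scheduler dies.
  val leaseId = client.getLeaseClient.grant(10L).get().getID

  val key   = ByteSequence.from("whisk/scheduler/0/endpoint", UTF_8)
  val value = ByteSequence.from("10.0.0.5:8080", UTF_8)

  client.getKVClient
    .put(key, value, PutOption.newBuilder().withLeaseId(leaseId).build())
    .get()

  client.close()
}
```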

@@ -435,8 +442,9 @@ metrics:

user_events: "{{ user_events_enabled | default(false) | lower }}"

durationChecker:
  timeWindow: "{{ duration_checker_time_window | default('1 d') }}"
zerodowntimeDeployment:
Member Author

This configuration is used downstream to deploy all components without any downtime.
Since there are no corresponding Ansible steps on the controller and invoker sides yet, it does not work as is, but I kept it.
We can add the required changes in the future.

Contributor

Suggested change:
- zerodowntimeDeployment:
+ zeroDowntimeDeployment:

Contributor

How complicated are the required changes? And how exactly does the zero-downtime deployment work?

Member Author

In this PR I just want to focus on running the new scheduler.
I believe it's not that difficult.

The procedure is as follows (sketched in code after the list):

  1. Disable some invokers.
  2. Check whether any activations are running on them.
  3. If so, wait until they complete; if not, redeploy the invokers.
  4. Disable some schedulers.
  5. Check whether any activations remain in the schedulers.
  6. If so, wait until they are forwarded to other schedulers; if not, redeploy the schedulers.
  7. Remove some controllers from Nginx.
  8. Disable those controllers.
  9. Wait until all activations finish.
  10. Redeploy the controllers.
  11. Repeat steps 1-10 until most of the components are redeployed.
  12. For the last controller, scheduler, and invoker, change the order to controller -> scheduler -> invoker.
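
A compact sketch of that ordering (purely illustrative; drain and redeploy stand in for the disable/wait/Ansible steps above):

```scala
object RollingRedeploySketch extends App {
  val order = Seq("invokers", "schedulers", "controllers")

  def drain(c: String): Unit    = println(s"draining $c (disable, wait for in-flight activations)")
  def redeploy(c: String): Unit = println(s"redeploying $c")

  // Steps 1-11: batch by batch, invokers -> schedulers -> controllers.
  for (batch <- 1 to 2; c <- order) { drain(c); redeploy(c) }

  // Step 12: the last instances go in reverse: controller -> scheduler -> invoker.
  for (c <- order.reverse) { drain(c); redeploy(c) }
}
```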

@@ -40,6 +40,10 @@ dependencies {
compile "com.typesafe.akka:akka-actor_${gradle.scala.depVersion}:${gradle.akka.version}"
compile "com.typesafe.akka:akka-stream_${gradle.scala.depVersion}:${gradle.akka.version}"
compile "com.typesafe.akka:akka-slf4j_${gradle.scala.depVersion}:${gradle.akka.version}"
compile "com.typesafe.akka:akka-cluster_${gradle.scala.depVersion}:${gradle.akka.version}"
Member Author

One thing to clarify here: in our (downstream) production system, we still use akka-2.5.26 and the Netty-based transport for akka-remote, so there can be some differences.

queueManager:
  maxSchedulingTime: "{{ scheduler_maxSchedulingTime | default('20 second') }}"
  maxRetriesToGetQueue: "{{ scheduler_maxRetriesToGetQueue | default(13) }}"
queue:
  # The queue's Running-state timeout: if no activation arrives while the queue is Running,
  # its state changes from Running to Idle and the decision-algorithm actor is deleted.
  idleGrace: "{{ scheduler_queue_idleGrace | default('20 seconds') }}"
Contributor

I thought you said the default was to remove an idle queue after 24 hours?

Member Author

I think 24 hours does not fit all cases; I mentioned it because we are using 24 hours.

I am OK with changing the default value, but I believe each downstream will choose a proper timeout of its own rather than the default anyway.

The default configuration here means a queue becomes idle after 20 seconds and is terminated after another 20 seconds.
In the idle state, the queue runs no decision-making to add containers, and it goes back to running whenever a new activation comes in (see the sketch below).
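
A minimal sketch of that lifecycle with classic Akka timers — the names idleGrace/stopGrace follow the config above, but the actor itself is hypothetical, not the scheduler's actual queue code:

```scala
import akka.actor.{Actor, Props, Timers}
import scala.concurrent.duration._

object QueueLifecycleSketch {
  case object Activation
  private case object IdleTick
  private case object StopTick

  class Queue(idleGrace: FiniteDuration, stopGrace: FiniteDuration) extends Actor with Timers {
    timers.startSingleTimer("idle", IdleTick, idleGrace)

    def running: Receive = {
      case Activation => // decision-making runs here; re-arm the idle timer
        timers.startSingleTimer("idle", IdleTick, idleGrace)
      case IdleTick => // no activation within idleGrace: stop decision-making
        timers.startSingleTimer("stop", StopTick, stopGrace)
        context.become(idle)
    }

    def idle: Receive = {
      case Activation => // a new activation revives the queue
        timers.cancel("stop")
        timers.startSingleTimer("idle", IdleTick, idleGrace)
        context.become(running)
      case StopTick => context.stop(self) // idle for another stopGrace: terminate
    }

    override def receive: Receive = running
  }

  def props(idleGrace: FiniteDuration = 20.seconds, stopGrace: FiniteDuration = 20.seconds): Props =
    Props(new Queue(idleGrace, stopGrace))
}
```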

"LIMITS_ACTIONS_SEQUENCE_MAXLENGTH": "{{ limits.sequenceMaxLength }}"

"CONFIG_whisk_couchdb_protocol": "{{ db.protocol }}"
"CONFIG_whisk_couchdb_host": "{{ db.host }}"
Contributor

I can't remember what the scheduler needs to communicate with CouchDB for, if anything?

Contributor

Going through the code, I see the scheduler has to fetch the concurrentInvocations limit to track throttling. Also, the scheduler sometimes fails an activation while it is still in the queue, which writes the activation record and acks back to the controller before anything is ever sent to an invoker.

"{{ kafka.ssl.keystore.password }}"
"ZOOKEEPER_HOSTS": "{{ zookeeper_connect_string }}"

"LIMITS_ACTIONS_INVOKES_PERMINUTE": "{{ limits.invocationsPerMinute }}"
Contributor

Do these throttling limits need to be configured for the scheduler component?


- name: populate environment variables for scheduler
  set_fact:
    env:
Contributor

I feel like there are a lot of configs in here that aren't used by the scheduler, but I could be wrong. It's just confusing which settings are actually used by the scheduler and which are copy-paste from other components that never get used.

Member Author

I will review all configurations again.

Member Author

I have reviewed all configurations and removed unnecessary things.

One thing I am uncertain about is this one.
c2e57c1#diff-168a0313e0b46601cf796cd58db3a422a426015b91de821de1951bc88c62db7cR186

It is also being used in a controller, but I couldn't find any reference.
https://github.com/apache/openwhisk/blob/master/ansible/roles/controller/tasks/deploy.yml#L234

@bdoyle0182 Do you have any idea?

Contributor

Thanks for taking the time to update this; it will make my life easier digesting what we need to set up in config, since we don't use Ansible.

@@ -197,6 +209,11 @@ whisk {
retention-bytes = 1073741824
retention-ms = 3600000
}
actions {
Contributor

What produces to and consumes from this topic? And what is produced?

Edit: looking at the Scheduler.scala file on line 332, it seems this config is just used as the configuration for the scheduler[idx] topics, not for actual topics named actions. With that said, should this config be renamed?

Member Author
@style95 style95 Jan 23, 2022

Yes, this configuration is for the schedulerN topics, which are used to forward activations from controllers to schedulers.
I feel it would be better to rename it to something like activations or schedulers. (A sketch of that flow follows.)
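
To make the flow concrete, here is a hedged sketch of a controller-side producer publishing to one of those per-scheduler topics; the topic name and payload are illustrative, not OpenWhisk's actual wire format:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ForwardActivationSketch extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)
  // "scheduler0" stands in for the schedulerN topic owning this action's queue.
  producer.send(new ProducerRecord("scheduler0", "activationId-123", """{"action":"noop"}"""))
  producer.close()
}
```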

@@ -46,7 +46,8 @@ class FPCPoolBalancer(config: WhiskConfig,
  extends LoadBalancer {

  private implicit val executionContext: ExecutionContext = actorSystem.dispatcher
- private implicit val requestTimeout: Timeout = Timeout(5.seconds)
+ // This value is given according to the total waiting time at QueueManager for a new queue to be created.
+ private implicit val requestTimeout: Timeout = Timeout(8.seconds)
Contributor

Was this raised because queue creation took longer than 5 seconds for you? If so, should it be raised quite a bit higher?

Member Author

I just increased this to align it with the retry timeout at the queue manager.
https://github.com/apache/openwhisk/blob/master/core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/QueueManager.scala#L421

In the default configuration, the queue manager waits around 8 seconds for a new queue to be created.

Regarding this value, it should not be too high: if a target scheduler becomes unresponsive, the controller should try to create the queue against another healthy scheduler instead.

This change might need to be reverted, because controllers would retry creating queues after 8 seconds, but by then the activations would already have been dropped by the queue manager.
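
The trade-off reads roughly like this sketch (the names are hypothetical; the point is that the ask timeout matches the queue manager's ~8-second retry window, and a timeout triggers failover to another scheduler):

```scala
import akka.actor.ActorRef
import akka.pattern.{ask, AskTimeoutException}
import akka.util.Timeout
import scala.concurrent.{ExecutionContext, Future}
import scala.concurrent.duration._

object CreateQueueSketch {
  final case class CreateQueue(fqn: String)

  // 8s matches the queue manager's retry window; a much larger value would keep
  // the controller stuck on an unresponsive scheduler instead of failing over.
  def createQueue(schedulers: List[ActorRef], fqn: String)(implicit ec: ExecutionContext): Future[Any] = {
    implicit val requestTimeout: Timeout = Timeout(8.seconds)
    schedulers match {
      case head :: tail =>
        (head ? CreateQueue(fqn)).recoverWith {
          case _: AskTimeoutException => createQueue(tail, fqn) // fail over to another scheduler
        }
      case Nil =>
        Future.failed(new IllegalStateException("no healthy scheduler available"))
    }
  }
}
```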

@style95
Member Author

style95 commented Jan 24, 2022

I quickly ran benchmarks with the new scheduler and got the following results.

Test Environments

  • 3 controllers (VM): 8 cores, 16 GB memory
  • 3 schedulers (PM): 40 cores, 128 GB memory
  • 10 invokers (VM): 8 cores, 16 GB memory
    • UserMemory: 10240 MB
    • Total of 400 containers when using 256 MB-memory containers

I invoked different numbers of actions as follows:

100 actions:

[benchmark results image]

1 action:

[benchmark results image]

There are some differences between the two.
For the record, the upstream version now differs from our downstream version: it uses different versions of the Akka family, and there are subtle differences in the code base as well.

In our downstream version, I observed around 14,000 TPS in the same environment in both cases (1 action / 100 actions).

In the upstream version, as you can see, TPS is higher in the 100-actions case while performance is poor in the 1-action case.
In the 100-actions case all containers were utilized, while only a few containers were utilized in the 1-action case.

I feel there is still room for improvement, especially in terms of performance.
But in any case, I could confirm it works as expected in terms of functionality.

I will also run the same benchmark with the old scheduler and compare the performance.

@style95
Member Author

style95 commented Jan 28, 2022

I ran the same benchmark against the upstream master.
The test environment is the same.

Test Environments

  • 3 controllers (VM): 8 cores, 16 GB memory
  • 3 DBs (PM): 40 cores, 128 GB memory
  • 10 invokers (VM): 8 cores, 16 GB memory
    • UserMemory: 10240 MB
    • Total of 400 containers when using 256 MB-memory containers

1 action:

[benchmark results image]

100 actions:

[benchmark results image]

There are a couple of errors because blocking calls fall back to non-blocking after 60 seconds, so the response code was 202 rather than 200.

2022-01-28 15:24:45,640 INFO  https://10.105.188.179/api/v1/namespaces/guest/actions/noop44?blocking=true&result=true&volatile=true -> 202 Accepted, 51 bytes
2022-01-28 15:24:45,641 ERROR 
Expected: is <200>
     got: <202>

java.lang.AssertionError: 
Expected: is <200>
     got: <202>
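
If these 202s are just the documented fallback from blocking to non-blocking after 60 seconds, a benchmark assertion could tolerate both codes — a minimal sketch, not the project's actual test code:

```scala
object StatusCheckSketch {
  // 202 = the blocking invoke fell back to non-blocking after the 60s limit.
  private val acceptable = Set(200, 202)

  def checkResponse(statusCode: Int): Unit =
    assert(acceptable.contains(statusCode), s"unexpected status code: $statusCode")
}
```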

Member Author
@style95 style95 left a comment

@jiangpengcheng @ningyougang
Could you cross-check the unnecessary configurations?
I could run the scheduler without those configurations.

c2e57c1



@style95 style95 removed the wip label Feb 3, 2022
@style95
Member Author

style95 commented Feb 3, 2022

It's ready for review.

@bdoyle0182
Contributor

I'm not sure about the config you mentioned me on. As far as I know, we don't use it anywhere. My guess is that it was a prefix for finding runtime images in your registry based on what's defined in your manifest. So maybe it's needed in both? Or it's an outdated config that isn't used anymore.

**common/scala/src/main/resources**
```
whisk.spi {
  ArtifactStoreProvider = org.apache.openwhisk.core.database.CouchDbStoreProvider
```
Contributor

MongoDB is technically supported for this now too, right? It's just a documentation example, so not a big deal.

Member Author

Yes, I think so; there is no specific dependency on CouchDB.
But I haven't tried it with MongoDB. (A possible config swap is sketched below.)
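
For reference, pointing the SPI at the MongoDB artifact store should just be a config swap along these lines (the provider class name is taken from the OpenWhisk tree, but worth verifying against the version in use):

```
whisk.spi {
  ArtifactStoreProvider = org.apache.openwhisk.core.database.mongodb.MongoDBArtifactStoreProvider
}
```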


**common/scala/src/main/resources**
```
whisk.spi {
```
Contributor
@bdoyle0182 bdoyle0182 Feb 3, 2022

Shouldn't a provider for the scheduler DB be required here, i.e. an ETCDProvider?

Member Author

Do we have an ETCDProvider?
If we ever need to use a key-value store other than etcd, we could introduce such a layer too (a hypothetical sketch follows).
But for now, the code sticks with etcd.
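
If such a layer were introduced, it could follow the existing Spi pattern; a purely hypothetical sketch (none of these names exist in the tree today):

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical abstraction over the key-value store the scheduler uses.
trait KeyValueStore {
  def put(key: String, value: String): Future[Unit]
  def get(key: String): Future[Option[String]]
}

// Would extend org.apache.openwhisk.spi.Spi so it can be selected via whisk.spi,
// e.g. KeyValueStoreProvider = ...EtcdStoreProvider in application.conf.
trait KeyValueStoreProvider /* extends Spi */ {
  def makeStore()(implicit ec: ExecutionContext): KeyValueStore
}
```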

You can enable the new scheduler of OpenWhisk.
It will run one more component called "scheduler", along with etcd.

#### Configure service providers for the scheduler
Contributor

Will there be any guide on setting up etcd?

Member Author

If you enable the scheduler by setting enable_scheduler: true, etcd will be automatically deployed by this:
https://github.com/apache/openwhisk/pull/5194/files#diff-2356bb62c87e471ef37b7973eb51e82282ef1131ee7ab4b62d909102de96967cR23

@codecov-commenter

codecov-commenter commented Feb 4, 2022

Codecov Report

Merging #5194 (8da0f4d) into master (e172168) will increase coverage by 28.10%.
The diff coverage is 14.28%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #5194       +/-   ##
===========================================
+ Coverage   44.49%   72.59%   +28.10%     
===========================================
  Files         237      238        +1     
  Lines       13922    13937       +15     
  Branches      593      581       -12     
===========================================
+ Hits         6194    10117     +3923     
+ Misses       7728     3820     -3908     
Impacted Files Coverage Δ
.../scala/org/apache/openwhisk/core/WhiskConfig.scala 95.75% <ø> (+0.05%) ⬆️
...rg/apache/openwhisk/core/scheduler/Scheduler.scala 9.45% <0.00%> (+0.24%) ⬆️
.../openwhisk/core/loadBalancer/FPCPoolBalancer.scala 33.73% <33.33%> (-0.14%) ⬇️
...whisk/connector/kafka/KafkaConsumerConnector.scala 59.15% <0.00%> (-22.54%) ⬇️
...pache/openwhisk/core/invoker/InvokerReactive.scala 57.26% <0.00%> (-20.52%) ⬇️
.../scala/org/apache/openwhisk/utils/Exceptions.scala 20.00% <0.00%> (-20.00%) ⬇️
...ache/openwhisk/core/database/ActivationStore.scala 78.26% <0.00%> (-14.05%) ⬇️
...a/org/apache/openwhisk/http/BasicHttpService.scala 79.36% <0.00%> (-11.12%) ⬇️
...pache/openwhisk/core/entity/ConcurrencyLimit.scala 88.23% <0.00%> (-5.89%) ⬇️
...pache/openwhisk/core/containerpool/Container.scala 85.05% <0.00%> (-5.75%) ⬇️
... and 136 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jiangpengcheng
Contributor

I checked that CONFIG_whisk_runtimes_defaultImagePrefix is an outdated configuration.

timeWindow: "{{ duration_checker_time_window | default('1 d') }}"
zeroDowntimeDeployment:
  enabled: "{{ zerodowntime_deployment_switch | default(true) }}"
  solution: "{{ zerodowntime_deployment_solution | default('apicall') }}"
Contributor

It seems apicall is not readable here; how about using scroll or another more readable word? Anyway, it means deploying in a rolling (scroll) fashion.

Another deployment solution is half, which means a small blue/green deployment.

Member Author

Currently, this configuration does not take effect due to some missing parts.
I believe we can address what you mentioned in a subsequent PR, when we introduce the feature for zero-downtime deployment.

Contributor

Got it.

@style95
Member Author

style95 commented Feb 9, 2022

I will merge this PR in 24 hours.
Please share your comments if you have any.

elastic_index_pattern: <your elasticsearch index pattern>
elastic_base_volume: <your elasticsearch volume directory>
elastic_username: <your elasticsearch username>
elastic_password: <your elasticsearch password>
Contributor

Can we use an external Elasticsearch?

Member Author

Yes.
It works exactly the same way CouchDB does with OW.

@ningyougang
Contributor

LGTM, just with two small questions.

@style95 style95 merged commit 5332e6d into apache:master Feb 11, 2022
@style95 style95 mentioned this pull request Jul 31, 2022