[kubernetes][release] K8s release test instructions (ray-project#16662)
DmitriGekhtman committed Jun 29, 2021
1 parent c318293 commit 257d072
Showing 8 changed files with 62 additions and 8 deletions.
14 changes: 9 additions & 5 deletions python/ray/tests/kubernetes_e2e/test_k8s_operator_basic.py
@@ -66,14 +66,17 @@ def f_with_retries(*args, **kwargs):


 @retry_until_true
-def wait_for_pods(n, namespace=NAMESPACE):
+def wait_for_pods(n, namespace=NAMESPACE, name_filter=""):
     client = kubernetes.client.CoreV1Api()
     pods = client.list_namespaced_pod(namespace=namespace).items
-    # Double-check that the correct image is use.
+    count = 0
     for pod in pods:
-        assert pod.spec.containers[0].image == IMAGE,\
-            pod.spec.containers[0].image
-    return len(pods) == n
+        if name_filter in pod.metadata.name:
+            count += 1
+            # Double-check that the correct image is use.
+            assert pod.spec.containers[0].image == IMAGE,\
+                pod.spec.containers[0].image
+    return count == n


@retry_until_true
@@ -320,6 +323,7 @@ def method(self):
 print(">>>Submitting a job to test Ray client connection.")
 cmd = f"kubectl -n {NAMESPACE} create -f {job_file.name}"
 subprocess.check_call(cmd, shell=True)
+wait_for_pods(1, name_filter="job")
 job_pod = [pod for pod in pods() if "job" in pod].pop()
 time.sleep(10)
 wait_for_job(job_pod)
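
The effect of the new `name_filter` parameter can be illustrated without a cluster. The sketch below is not part of the commit: it reproduces only the counting logic of `wait_for_pods`, using `SimpleNamespace` stand-ins in place of real pod objects. The pod names, the image tag, and the helper names `make_pod` and `count_matching_pods` are all hypothetical.

```python
from types import SimpleNamespace

IMAGE = "rayproject/ray:nightly"  # hypothetical image tag for this example


def make_pod(name, image=IMAGE):
    """Build a stand-in exposing only the attributes the test reads."""
    return SimpleNamespace(
        metadata=SimpleNamespace(name=name),
        spec=SimpleNamespace(containers=[SimpleNamespace(image=image)]),
    )


def count_matching_pods(pods, name_filter=""):
    """Count pods whose name contains name_filter, checking their image."""
    count = 0
    for pod in pods:
        # The empty string is a substring of every name, so the default
        # filter matches all pods, as in the pre-change behavior.
        if name_filter in pod.metadata.name:
            count += 1
            assert pod.spec.containers[0].image == IMAGE, \
                pod.spec.containers[0].image
    return count


# Hypothetical pod names: an operator, a head node, and a job pod.
pods = [
    make_pod("ray-operator-abc"),
    make_pod("example-cluster-head-xyz"),
    make_pod("job-example-123"),
]
total = count_matching_pods(pods)
job_only = count_matching_pods(pods, name_filter="job")
print(total, job_only)
```

One consequence worth noting: after the change, the image assertion runs only for pods that match the filter, so with a non-empty `name_filter` the other pods in the namespace are no longer checked.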
3 changes: 1 addition & 2 deletions release/RELEASE_CHECKLIST.md
@@ -58,8 +58,7 @@ This checklist is meant to be used in conjunction with the RELEASE_PROCESS.rst d
 - [ ] Scalability Envelope Tests
 - [ ] ASAN Test
 - [ ] K8s Test
-- [ ] K8s cluster launcher test
-- [ ] K8s operator test
+- [ ] K8s operator and helm tests
 - [ ] Data processing tests
 - [ ] streaming_shuffle
 - [ ] dask on ray test
2 changes: 1 addition & 1 deletion release/RELEASE_PROCESS.rst
@@ -146,7 +146,7 @@ is generally the easiest way to run release tests.

7. **K8s operator tests**

-Run the ``python/ray/tests/test_k8s_*`` to make sure K8s cluster launcher and operator works. Make sure the docker image is the released version.
+Refer to ``kubernetes_tests/README.md``. These tests verify basic functionality of the Ray Operator and Helm chart.

8. **Data processing tests**

18 changes: 18 additions & 0 deletions release/kubernetes_tests/README.md
@@ -0,0 +1,18 @@
# ray-k8s-tests

How to run:
1. Configure kubectl and Helm 3 to access a K8s cluster.
2. `git checkout releases/<release version>`
3. You might have to `pip install` the Ray wheel for the relevant commit (or `pip install -e`) in a local conda env; see the Ray client note below.
4. `cd` to this directory.
5. `IMAGE=rayproject/ray:<release version> bash k8s_ci.sh`

This runs three tests and handles the necessary resource creation and teardown. The tests typically take about 15 minutes to finish.

Notes:
1. Your K8s cluster should be able to accommodate 30 1-CPU pods to run all of the tests.
2. These tests use basic Ray client functionality, so your locally installed Ray version may need to be updated to match the one in the release image.
3. The tests do a poor job of cleaning up the Ray client port-forwarding process. If a test fails, a port-forwarding process might be left running in the background. To identify the rogue process, run `ps aux | grep "port-forward"`, then `kill` it.
4. Some errors will appear on the screen during the run. That's normal: error recovery is being tested.

To run one of the three tests individually, replace `k8s_ci.sh` in the last step above with `k8s-test.sh`, `helm-test.sh`, or `k8s-test-scale.sh`.
Only the last of these needs 30 1-CPU pods; 10 are enough for either of the other two.
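
The port-forward cleanup described in the notes can also be scripted. The following is a minimal sketch, not something shipped with these tests: it parses `ps aux`-style output and returns the PIDs of processes whose command line contains `port-forward`. The function name `find_port_forward_pids` and the sample process listing (user, PIDs, service name, port) are made up for illustration.

```python
def find_port_forward_pids(ps_output):
    """Return PIDs of processes whose command line mentions port-forward."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # `ps aux` prints 11 columns; the last one (COMMAND) may contain
        # spaces, so split at most 10 times.
        fields = line.split(None, 10)
        if len(fields) == 11 and "port-forward" in fields[10]:
            pids.append(int(fields[1]))
    return pids


# Made-up sample output; real output comes from running `ps aux`.
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
dev       4242  0.0  0.1 754321 40123 pts/0    Sl   10:01   0:00 kubectl -n basic-test port-forward service/example-cluster-ray-head 10001:10001
dev       4300  0.0  0.0  12345   980 pts/0    S+   10:05   0:00 vim notes.txt
"""
stuck = find_port_forward_pids(SAMPLE)
print(stuck)
```

On a real machine, the input could come from `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`, and each returned PID could be passed to `os.kill(pid, signal.SIGTERM)`. Review the list before killing anything, since unrelated port-forwards would match as well.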
8 changes: 8 additions & 0 deletions release/kubernetes_tests/helm-test.sh
@@ -0,0 +1,8 @@
#!/bin/bash
set -x
kubectl create namespace helm-test
kubectl create namespace helm-test2
KUBERNETES_OPERATOR_TEST_NAMESPACE=helm-test KUBERNETES_OPERATOR_TEST_IMAGE="$IMAGE" python ../../python/ray/tests/kubernetes_e2e/test_helm.py
kubectl delete namespace helm-test
kubectl delete namespace helm-test2
kubectl delete -f ../../deploy/charts/ray/crds/cluster_crd.yaml
11 changes: 11 additions & 0 deletions release/kubernetes_tests/k8s-test-scale.sh
@@ -0,0 +1,11 @@
#!/bin/bash
set -x
kubectl create namespace scale-test
kubectl create namespace scale-test2
KUBERNETES_OPERATOR_TEST_NAMESPACE=scale-test KUBERNETES_OPERATOR_TEST_IMAGE="$IMAGE" python ../../python/ray/tests/kubernetes_e2e/test_k8s_operator_scaling.py
kubectl -n scale-test delete --all rayclusters
kubectl -n scale-test2 delete --all rayclusters
kubectl delete -f ../../deploy/components/operator_cluster_scoped.yaml
kubectl delete namespace scale-test
kubectl delete namespace scale-test2
kubectl delete -f ../../deploy/charts/ray/crds/cluster_crd.yaml
9 changes: 9 additions & 0 deletions release/kubernetes_tests/k8s-test.sh
@@ -0,0 +1,9 @@
#!/bin/bash
set -x
kubectl create namespace basic-test
kubectl apply -f ../../deploy/charts/ray/crds/cluster_crd.yaml
KUBERNETES_OPERATOR_TEST_NAMESPACE=basic-test KUBERNETES_OPERATOR_TEST_IMAGE="$IMAGE" python ../../python/ray/tests/kubernetes_e2e/test_k8s_operator_basic.py
kubectl -n basic-test delete --all rayclusters
kubectl -n basic-test delete deployment ray-operator
kubectl delete namespace basic-test
kubectl delete -f ../../deploy/charts/ray/crds/cluster_crd.yaml
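
The scripts pass the namespace and image to the Python tests through the `KUBERNETES_OPERATOR_TEST_NAMESPACE` and `KUBERNETES_OPERATOR_TEST_IMAGE` environment variables. How the test modules consume them is not shown in this commit; the sketch below is one plausible way to read them. The function name `read_test_config` and both default values are assumptions.

```python
import os


def read_test_config(environ=os.environ):
    """Read the two variables the scripts export.

    The fallback values are hypothetical, not taken from the test files.
    """
    namespace = environ.get(
        "KUBERNETES_OPERATOR_TEST_NAMESPACE", "test-k8s-operator")
    image = environ.get(
        "KUBERNETES_OPERATOR_TEST_IMAGE", "rayproject/ray:nightly")
    return namespace, image


# Mirror what k8s-test.sh sets, using a plain dict in place of os.environ.
config = read_test_config({
    "KUBERNETES_OPERATOR_TEST_NAMESPACE": "basic-test",
    "KUBERNETES_OPERATOR_TEST_IMAGE": "rayproject/ray:<release version>",
})
print(config)
```

Taking the environment as a parameter keeps the lookup testable without mutating the real process environment.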
5 changes: 5 additions & 0 deletions release/kubernetes_tests/k8s_ci.sh
@@ -0,0 +1,5 @@
#!/bin/bash
set -x
IMAGE="$IMAGE" bash k8s-test.sh
IMAGE="$IMAGE" bash helm-test.sh
IMAGE="$IMAGE" bash k8s-test-scale.sh
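
`k8s_ci.sh` runs the three scripts in sequence and does not stop if one of them fails. A rough Python equivalent that also records each script's exit code might look like the sketch below. This is an illustration, not part of the commit: `run_release_tests` and `fake_runner` are hypothetical names, and the fake runner stands in for `subprocess.call` so that the example runs without a cluster.

```python
import os
import subprocess

SCRIPTS = ["k8s-test.sh", "helm-test.sh", "k8s-test-scale.sh"]


def run_release_tests(image, scripts=SCRIPTS, runner=subprocess.call):
    """Run each script with IMAGE set and return {script: exit code}."""
    env = dict(os.environ, IMAGE=image)
    results = {}
    for script in scripts:
        # Keep going after a failure, as k8s_ci.sh does.
        results[script] = runner(["bash", script], env=env)
    return results


def fake_runner(cmd, env):
    """Stand-in for subprocess.call; pretends helm-test.sh fails."""
    return 1 if cmd[1] == "helm-test.sh" else 0


results = run_release_tests(
    "rayproject/ray:<release version>", runner=fake_runner)
failed = [name for name, code in results.items() if code != 0]
print(failed)
```

A caveat for real use: a bash script's exit status is that of the last command it ran, and the scripts in this commit end with `kubectl delete`. The recorded codes would therefore reflect the teardown step rather than the Python test result, unless the scripts are changed to propagate the test's status.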
