Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] Add TLS/SSL support to gRPC channels #18631

Merged
merged 80 commits into from
Oct 21, 2021
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
80 commits
Select commit Hold shift + click to select a range
efe18dd
Add use_tls_ member to GrpcServer
oscarknagg Sep 8, 2021
d38af35
Hacky TLS
oscarknagg Sep 8, 2021
3b5f210
Create secure gRPC channels in Python code
oscarknagg Sep 8, 2021
01c5cd9
Remove unecessary std::cout
oscarknagg Sep 8, 2021
2769675
More TLS
oscarknagg Sep 8, 2021
c6ad485
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 8, 2021
2962be3
Linting
oscarknagg Sep 8, 2021
64be21a
Add secure grpc in tests
oscarknagg Sep 8, 2021
d38e2b0
Fix secure grpc server initialisation
oscarknagg Sep 8, 2021
7aaa8ac
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 8, 2021
1668ecc
Use single environment variable as feature flag
oscarknagg Sep 9, 2021
621cfc7
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 9, 2021
a2c49d6
Pass environment in test_client_builder.py
oscarknagg Sep 9, 2021
0b73c38
Read RAY_USE_TLS in client worker
oscarknagg Sep 9, 2021
ddc8749
Unify init_grpc_channel and init_aiogrpc_channel functions
oscarknagg Sep 10, 2021
966fc49
Make function to add port to grpc server
oscarknagg Sep 10, 2021
b173b78
Upgrade to mTLS
oscarknagg Sep 10, 2021
bc39b8f
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 10, 2021
65361a2
Function to load certs from env variables
oscarknagg Sep 10, 2021
2dcff3a
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 13, 2021
f19e7a7
Add example cluster yaml which generates self-signed keys
oscarknagg Sep 13, 2021
b57c2e2
Add TLS auth test
oscarknagg Sep 14, 2021
a4cc458
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 14, 2021
65f0080
Add some fixtures to run test_basic.py with TLS auth
oscarknagg Sep 15, 2021
da45c78
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 15, 2021
b4dc0ca
Fix test_tls_auth.py
oscarknagg Sep 15, 2021
16c0cb3
Remove duplicated ReadFile function
oscarknagg Sep 15, 2021
30bebae
Formatting
oscarknagg Sep 15, 2021
c551c30
Remove EKS cluster YAML
oscarknagg Sep 15, 2021
1fa0fbf
Don't assume TLS env vars are set
oscarknagg Sep 15, 2021
2b0bc68
Add cryptography requirement to generate testing certs
oscarknagg Sep 15, 2021
ef5025a
Linting
oscarknagg Sep 15, 2021
d79fdd7
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 16, 2021
de36d6a
Fix new_dashboard->dashboard merge
oscarknagg Sep 16, 2021
92627a8
Remove possibility of nullptr from RAY_USE_TLS
oscarknagg Sep 16, 2021
d3b47dc
clang-format 7.0.0 linting
oscarknagg Sep 16, 2021
08fc4b0
Linting
oscarknagg Sep 16, 2021
a70a355
Fix failing test_grpc_credentials test
oscarknagg Sep 16, 2021
cd613df
Make dashboard head classes use async grpc again
oscarknagg Sep 16, 2021
b296a8a
Add test_tls_auth to BUILD
oscarknagg Sep 16, 2021
1cc7744
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 17, 2021
5528b51
Relax cryptography requirement
oscarknagg Sep 20, 2021
69f0618
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 20, 2021
c77d97a
Lint
oscarknagg Sep 20, 2021
3cf6271
Worker._secure looks at env var
oscarknagg Sep 20, 2021
b84dbe6
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 23, 2021
b82c932
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Sep 27, 2021
ddfa148
Apply changes from ci/travis/lint.sh
oscarknagg Sep 27, 2021
32acd64
Skip TLS tests on MacOS
oscarknagg Sep 27, 2021
d04fe6d
format.sh changes
oscarknagg Sep 27, 2021
53896b3
Address comments
oscarknagg Sep 27, 2021
09884ad
Merge branch 'tls' of github.com:oscarknagg/ray into tls
oscarknagg Oct 12, 2021
aea3e4e
Revert "Address comments"
oscarknagg Oct 12, 2021
8f02386
Merge master
oscarknagg Oct 12, 2021
6b0bced
Merge branch 'master' into tls
oscarknagg Oct 12, 2021
f57a61e
Merge remote-tracking branch 'upstream/master' into tls
oscarknagg Oct 12, 2021
78bbb34
Squashed commit of the following:
oscarknagg Oct 12, 2021
7639a65
Replace getenv with RayConfig
oscarknagg Oct 12, 2021
d95419a
Remove lingering errors from earlier merge
oscarknagg Oct 12, 2021
1c92af2
Address comments pt2
oscarknagg Oct 14, 2021
f2e1e55
Merge remote-tracking branch 'upstream/master' into tls
oscarknagg Oct 14, 2021
7c3f7b2
Tidy up
oscarknagg Oct 14, 2021
8d204c5
Hopefully fix lint
oscarknagg Oct 15, 2021
6954178
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Oct 15, 2021
94a52ae
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Oct 18, 2021
74d1652
Lint
oscarknagg Oct 18, 2021
c96043d
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Oct 18, 2021
50c2da2
Remove unecessary logic in ray_config_def.h
oscarknagg Oct 19, 2021
8599854
Actually check for ConnectionError in test_client_connect_to_tls_server
oscarknagg Oct 19, 2021
60355a2
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Oct 19, 2021
9dfd106
Remove unused ReadFile declaration
oscarknagg Oct 19, 2021
4feae45
Lint
oscarknagg Oct 19, 2021
5b57d7d
Replace grpc.insercure_channel with ray._private.utils.init_grpc_chan…
oscarknagg Oct 19, 2021
a600eaf
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Oct 19, 2021
67d32b7
Trigger retest
ericl Oct 19, 2021
6fa08dc
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Oct 20, 2021
f4032f1
Attempt to fix windows build
oscarknagg Oct 20, 2021
7fb64f0
Merge branch 'master' of https://github.com/ray-project/ray into tls
oscarknagg Oct 20, 2021
f4c8ae7
Merge branch 'master' into tls
ericl Oct 21, 2021
e74d707
Update worker.py
ericl Oct 21, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Address comments pt2
  • Loading branch information
oscarknagg committed Oct 14, 2021
commit 1c92af2068c48c9c4c954fb14e5fe26d9e715dcd
2 changes: 1 addition & 1 deletion python/ray/_private/test_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,12 +9,12 @@
import sys
import time
import timeit
import socket
import math
import traceback
from typing import Optional, Any, List, Dict
from contextlib import redirect_stdout, redirect_stderr
import yaml
import socket
import pytest
import tempfile

Expand Down
11 changes: 9 additions & 2 deletions python/ray/_private/tls_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ def generate_self_signed_tls_certs():


def add_port_to_grpc_server(server, address):
if os.environ.get("RAY_USE_TLS", "0") == "1":
if os.environ.get("RAY_USE_TLS", "0").lower() in ("1", "true"):
server_cert_chain, private_key, ca_cert = load_certs_from_env()
credentials = grpc.ssl_server_credentials(
[(private_key, server_cert_chain)],
Expand All @@ -71,7 +71,14 @@ def add_port_to_grpc_server(server, address):


def load_certs_from_env():
if os.environ.get("RAY_USE_TLS", "0") == "1":
if os.environ.get("RAY_USE_TLS", "0").lower() in ("1", "true"):
if ("RAY_TLS_SERVER_CERT" not in os.environ) or \
("RAY_TLS_SERVER_KEY" not in os.environ):
raise RuntimeError(
"If the environment variable RAY_USE_TLS is set to true"
"then both RAY_TLS_SERVER_CERT and RAY_TLS_SERVER_KEY must "
"also be set.")

with open(os.environ["RAY_TLS_SERVER_CERT"], "rb") as f:
server_cert_chain = f.read()
with open(os.environ["RAY_TLS_SERVER_KEY"], "rb") as f:
Expand Down
2 changes: 1 addition & 1 deletion python/ray/_private/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -1116,7 +1116,7 @@ def init_grpc_channel(address: str,
options: Optional[Sequence[Tuple[str, Any]]] = None,
asynchronous: bool = False):
grpc_module = aiogrpc if asynchronous else grpc
if os.environ.get("RAY_USE_TLS", "0") == "1":
if os.environ.get("RAY_USE_TLS", "0").lower() in ("1", "true"):
server_cert_chain, private_key, ca_cert = load_certs_from_env()
credentials = grpc.ssl_channel_credentials(
certificate_chain=server_cert_chain,
Expand Down
6 changes: 2 additions & 4 deletions python/ray/tests/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -89,17 +89,15 @@ def ray_start_regular_shared(request):


@pytest.fixture(
scope="module",
params=[{
scope="module", params=[{
"local_mode": True
}, {
"local_mode": False
}])
def ray_start_shared_local_modes(request):
param = getattr(request, "param", {})
use_tls = param.pop("use_tls", False)
with _ray_start(**param) as res:
yield res
yield res


@pytest.fixture
Expand Down
2 changes: 1 addition & 1 deletion python/ray/tests/test_client_builder.py
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ def test_namespace():
put in the same namespace.

This test checks that:
RayConfig::instance().RAY_USE_TLS() * When two drivers don't specify a namespace, they are placed in different
* When two drivers don't specify a namespace, they are placed in different
anonymous namespaces.
* When two drivers specify a namespace, they collide.
* The namespace name (as provided by the runtime context) is correct.
Expand Down
118 changes: 87 additions & 31 deletions python/ray/tests/test_tls_auth.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,18 +2,37 @@
import logging
import os
import sys
import subprocess

import pytest

import ray
from ray._private.test_utils import run_string_as_driver

logger = logging.getLogger(__name__)


@pytest.mark.parametrize("use_tls", [True], indirect=True)
def test_put_get_with_tls(shutdown_only, use_tls):
ray.init(num_cpus=0)
def test_init_with_tls(use_tls):
# Run as a new process to pick up environment variables set
# in the use_tls fixture
run_string_as_driver(
"""
import ray
try:
ray.init()
finally:
ray.shutdown()
""",
env=os.environ)


@pytest.mark.parametrize("use_tls", [True], indirect=True)
def test_put_get_with_tls(use_tls):
run_string_as_driver(
"""
import ray
ray.init()
try:
for i in range(100):
value_before = i * 10**6
object_ref = ray.put(value_before)
Expand All @@ -37,45 +56,82 @@ def test_put_get_with_tls(shutdown_only, use_tls):
object_ref = ray.put(value_before)
value_after = ray.get(object_ref)
assert value_before == value_after
finally:
ray.shutdown()
""",
env=os.environ)


@pytest.mark.parametrize("use_tls", [True], indirect=True)
def test_submit_with_tls(shutdown_only, use_tls):
ray.init(num_cpus=2, num_gpus=1, resources={"Custom": 1})
@pytest.mark.parametrize("use_tls", [True], indirect=True, scope="module")
def test_submit_with_tls(use_tls):
run_string_as_driver(
"""
import ray
ray.init(num_cpus=2, num_gpus=1, resources={"Custom": 1})

@ray.remote
def f(n):
return list(range(n))
@ray.remote
def f(n):
return list(range(n))

id1, id2, id3 = f._remote(args=[3], num_returns=3)
assert ray.get([id1, id2, id3]) == [0, 1, 2]
id1, id2, id3 = f._remote(args=[3], num_returns=3)
assert ray.get([id1, id2, id3]) == [0, 1, 2]

@ray.remote
class Actor:
def __init__(self, x, y=0):
self.x = x
self.y = y
@ray.remote
class Actor:
def __init__(self, x, y=0):
self.x = x
self.y = y

def method(self, a, b=0):
return self.x, self.y, a, b
def method(self, a, b=0):
return self.x, self.y, a, b

a = Actor._remote(
args=[0], kwargs={"y": 1}, num_gpus=1, resources={"Custom": 1})
a = Actor._remote(
args=[0], kwargs={"y": 1}, num_gpus=1, resources={"Custom": 1})

id1, id2, id3, id4 = a.method._remote(
args=["test"], kwargs={"b": 2}, num_returns=4)
assert ray.get([id1, id2, id3, id4]) == [0, 1, "test", 2]
id1, id2, id3, id4 = a.method._remote(
args=["test"], kwargs={"b": 2}, num_returns=4)
assert ray.get([id1, id2, id3, id4]) == [0, 1, "test", 2]
""",
env=os.environ)


@pytest.mark.skipif(
sys.platform == "darwin",
reason=("Cryptography doesn't install in Mac build pipeline"))
@pytest.mark.parametrize("use_tls", [True], indirect=True)
def test_client_connect_to_tls_server(use_tls, init_and_serve):
from ray.util.client import ray as ray_client
os.environ["RAY_USE_TLS"] = "0"
with pytest.raises(ConnectionError):
ray_client.connect("localhost:50051")

os.environ["RAY_USE_TLS"] = "1"
ray_client.connect("localhost:50051")
def test_client_connect_to_tls_server(use_tls, call_ray_start):
for k, v in os.environ.items():
if k.startswith("RAY_"):
print("export {}={}".format(k, v))

tls_env = os.environ.copy(
) # use_tls fixture sets TLS environment variables
without_tls_env = {}

# Attempt to connect without TLS
try:
out = run_string_as_driver(
"""
from ray.util.client import ray as ray_client
ray_client.connect("localhost:10001")
""",
env=without_tls_env)
except subprocess.CalledProcessError as e:
assert "ConnectionError" in e.output.decode("utf-8")

# Attempt to connect with TLS
out = run_string_as_driver(
"""
import ray
from ray.util.client import ray as ray_client
ray_client.connect("localhost:10001")
print(ray.is_initialized())
""",
env=tls_env)
assert out == "True\n"


if __name__ == "__main__":
import pytest
import sys
sys.exit(pytest.main(["-v", __file__]))
9 changes: 5 additions & 4 deletions python/ray/util/client/worker.py
Original file line number Diff line number Diff line change
Expand Up @@ -101,7 +101,8 @@ def __init__(
self.server = None
self._conn_state = grpc.ChannelConnectivity.IDLE
self._converted: Dict[str, ClientStub] = {}
self._secure = secure or os.environ.get("RAY_USE_TLS", "0") == "1"
self._secure = secure or os.environ.get("RAY_USE_TLS",
"0").lower() in ("1", "true")
self._conn_str = conn_str
self._connection_retries = connection_retries

Expand Down Expand Up @@ -160,7 +161,7 @@ def _connect_channel(self, reconnecting=False) -> None:
if self._secure:
if self._credentials is not None:
credentials = self._credentials
elif os.environ.get("RAY_USE_TLS", "0") == "1":
elif os.environ.get("RAY_USE_TLS", "0").lower() in ("1", "true"):
server_cert_chain, private_key, ca_cert = ray._private.utils \
.load_certs_from_env()
credentials = grpc.ssl_channel_credentials(
Expand Down Expand Up @@ -362,8 +363,8 @@ def get(self, vals, *, timeout: Optional[float] = None) -> Any:
logger.debug("Internal retry for get {}".format(to_get))
if len(to_get) != len(res):
raise Exception(
"Mismatched number of items in request ({}) and response ({})".
format(len(to_get), len(res)))
"Mismatched number of items in request ({}) and response ({})"
.format(len(to_get), len(res)))
if isinstance(vals, ClientObjectRef):
res = res[0]
return res
Expand Down
19 changes: 15 additions & 4 deletions src/ray/common/ray_config_def.h
Original file line number Diff line number Diff line change
Expand Up @@ -490,9 +490,20 @@ RAY_CONFIG(bool, scheduler_avoid_gpu_nodes, true)
RAY_CONFIG(bool, runtime_env_skip_local_gc, false)

/// Whether or not use TLS.
RAY_CONFIG(int64_t, USE_TLS, 0)
RAY_CONFIG(bool, USE_TLS,
std::getenv("RAY_USE_TLS") != nullptr &&
(std::getenv("RAY_USE_TLS") == std::string("true") ||
std::getenv("RAY_USE_TLS") == std::string("1")))
oscarknagg marked this conversation as resolved.
Show resolved Hide resolved

/// Location of TLS credentials
RAY_CONFIG(std::string, TLS_SERVER_CERT, "")
RAY_CONFIG(std::string, TLS_SERVER_KEY, "")
RAY_CONFIG(std::string, TLS_CA_CERT, "")
RAY_CONFIG(std::string, TLS_SERVER_CERT,
std::getenv("RAY_TLS_SERVER_CERT") != nullptr
? std::getenv("RAY_TLS_SERVER_CERT")
: "")
oscarknagg marked this conversation as resolved.
Show resolved Hide resolved
RAY_CONFIG(std::string, TLS_SERVER_KEY,
std::getenv("RAY_TLS_SERVER_KEY") != nullptr
? std::getenv("RAY_TLS_SERVER_KEY")
: "")
RAY_CONFIG(std::string, TLS_CA_CERT,
std::getenv("RAY_TLS_CA_CERT") != nullptr ? std::getenv("RAY_TLS_CA_CERT")
: "")
6 changes: 3 additions & 3 deletions src/ray/rpc/common.cc
Original file line number Diff line number Diff line change
Expand Up @@ -12,11 +12,11 @@
// See the License for the specific language governing permissions and
// limitations under the License.

#include "ray/rpc/common.h"

#include <fstream>
#include <sstream>

#include "ray/rpc/common.h"

namespace ray::rpc {

std::string ReadCert(const std::string &cert_filepath) {
Expand All @@ -26,4 +26,4 @@ std::string ReadCert(const std::string &cert_filepath) {
return buffer.str();
};

} // namespace rpc::ray
} // namespace ray::rpc
2 changes: 2 additions & 0 deletions src/ray/rpc/common.h
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,8 @@
// See the License for the specific language governing permissions and
// limitations under the License.

#include <string>

namespace ray::rpc {

// Utility to read cert file from a particular location
Expand Down
6 changes: 3 additions & 3 deletions src/ray/rpc/grpc_client.h
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ template <class GrpcService>
class GrpcClient {
public:
GrpcClient(const std::string &address, const int port, ClientCallManager &call_manager,
bool use_tls = true)
bool use_tls = false)
: client_call_manager_(call_manager), use_tls_(use_tls) {
grpc::ChannelArguments argument;
// Disable http proxy since it disrupts local connections. TODO(ekl) we should make
Expand Down Expand Up @@ -107,7 +107,8 @@ class GrpcClient {
const std::string &address, int port) {
std::shared_ptr<grpc::Channel> channel;
if (::RayConfig::instance().USE_TLS()) {
std::string server_cert_file = std::string(::RayConfig::instance().TLS_SERVER_CERT());
std::string server_cert_file =
std::string(::RayConfig::instance().TLS_SERVER_CERT());
std::string server_key_file = std::string(::RayConfig::instance().TLS_SERVER_KEY());
std::string root_cert_file = std::string(::RayConfig::instance().TLS_CA_CERT());
std::string server_cert_chain = ReadCert(server_cert_file);
Expand All @@ -127,7 +128,6 @@ class GrpcClient {
}
return channel;
};

};

} // namespace rpc
Expand Down