[Cluster launcher] [Azure] Make cluster termination and networking more configurable and robust (ray-project#44100)

This PR addresses a few issues when launching clusters with Azure:

Any changes made to subnets of the deployed virtual network(s) are clobbered upon redeployment.
Any service endpoints, route tables, or delegations are removed when redeploying (which happens on any of the ray CLI calls) due to an open Azure issue (Azure/azure-quickstart-templates#2786). This PR provides a workaround: if a virtual network whose name contains the cluster's unique ID already exists in the same resource group, its existing subnet configuration is copied into the deployment template (see the config.py diff below).
VM termination is extremely lengthy and does not clean up all dependencies.
When VMs are provisioned, dependencies such as disks, NICs, and public IP addresses are provisioned with them. However, because the termination process does not wait for the VM to be deleted, and the dependent resources cannot be deleted while the VM still exists, these dependencies are often left behind in the resource group after termination. This can cause quota issues (e.g., reaching the limit on public IP addresses or disks) and wastes resources. This PR moves node termination into a pool of threads so that node deletion can be parallelized (waiting for each node serially takes a long time) and all dependencies can be correctly deleted once their VMs no longer exist.
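A minimal sketch of that termination scheme, assuming the azure-mgmt-compute and azure-mgmt-network SDKs; the helper names, resource-naming suffixes (`-nic`, `-ip`, `-os-disk`), and client setup are illustrative, not the PR's actual code:

```python
from concurrent.futures import ThreadPoolExecutor

from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.network import NetworkManagementClient

subscription_id = "<subscription-id>"  # placeholder
credential = DefaultAzureCredential()
compute_client = ComputeManagementClient(credential, subscription_id)
network_client = NetworkManagementClient(credential, subscription_id)


def terminate_node(resource_group: str, vm_name: str) -> None:
    # Wait for the VM itself to be gone first; Azure refuses to delete a NIC,
    # public IP, or disk while the VM that owns it still exists.
    compute_client.virtual_machines.begin_delete(resource_group, vm_name).wait()
    network_client.network_interfaces.begin_delete(resource_group, f"{vm_name}-nic").wait()
    network_client.public_ip_addresses.begin_delete(resource_group, f"{vm_name}-ip").wait()
    compute_client.disks.begin_delete(resource_group, f"{vm_name}-os-disk").wait()


def terminate_nodes(resource_group: str, vm_names: list, max_workers: int = 8) -> None:
    # Each deletion can take minutes, so run them in a thread pool instead of serially.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(terminate_node, resource_group, n) for n in vm_names]
        for f in futures:
            f.result()  # surface any deletion errors
```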
VMs can have status code ProvisioningState/failed/RetryableError, causing an unpacking error.
The status-parsing code in the Azure node provider throws an exception when the provisioning state is the string above, resulting in incorrect provisioning/termination of the node. This PR addresses the issue by slicing the split status code and using only its first two components.
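A small sketch of the unpacking bug and the slicing fix described above (variable names are illustrative):

```python
# Azure instance-view status codes have the form "<kind>/<state>" or
# "<kind>/<state>/<detail>".
status_code = "ProvisioningState/failed/RetryableError"

# Before: `kind, state = status_code.split("/")` raises
# "ValueError: too many values to unpack" on three-part codes.
# After: slice to the first two components.
kind, state = status_code.split("/")[:2]
assert (kind, state) == ("ProvisioningState", "failed")
```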
The default quota for public IP addresses in Azure is only 100, which can result in quota limits being hit for larger clusters.
This PR adds an option (use_external_head_ip) for provisioning a public IP address for the head node only (instead of for all nodes or none). This lets a user still communicate with the head node via a public IP address without running into the public IP quota. The option works in tandem with use_internal_ips: if both are set to True, a public IP address is provisioned only for the head node. If use_external_head_ip is omitted, the existing behavior is unchanged (public IPs are provisioned for all nodes if use_internal_ips is False; otherwise no public IPs are provisioned).
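For illustration, a provider configuration combining the two flags, shown as the Python dict equivalent of the cluster YAML's `provider` section (resource names are placeholders):

```python
# Hedged illustration of an Azure provider section: workers use private IPs
# only, while the head node still gets a public IP for external access.
provider = {
    "type": "azure",
    "location": "westus2",
    "resource_group": "my-ray-cluster-rg",  # placeholder
    "use_internal_ips": True,       # no public IPs for worker nodes
    "use_external_head_ip": True,   # but the head node still gets one
}
```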
I've tested all of these fixes using ray up/ray dashboard/ray down on Azure clusters of 4-32 nodes to verify that startup/teardown works correctly and the correct number of resources is provisioned.

Related issue number
Node termination times are discussed in ray-project#25971


---------

Signed-off-by: Mike Danielczuk <[email protected]>
Co-authored-by: Scott Graham <[email protected]>
mjd3 and gramhagen authored Mar 25, 2024
1 parent ac927f8 commit 408b1fb
Showing 7 changed files with 272 additions and 73 deletions.
33 changes: 33 additions & 0 deletions doc/source/cluster/vms/references/ray-cluster-configuration.rst
@@ -133,6 +133,7 @@ Provider
:ref:`msi_resource_group <cluster-configuration-msi-resource-group>`: str
:ref:`cache_stopped_nodes <cluster-configuration-cache-stopped-nodes>`: bool
:ref:`use_internal_ips <cluster-configuration-use-internal-ips>`: bool
:ref:`use_external_head_ip <cluster-configuration-use-external-head-ip>`: bool
.. tab-item:: GCP

@@ -1169,6 +1170,38 @@ controlled by your cloud provider's configuration.
* **Type:** Boolean
* **Default:** ``False``

.. _cluster-configuration-use-external-head-ip:

``provider.use_external_head_ip``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. tab-set::

.. tab-item:: AWS

Not available.

.. tab-item:: Azure

If enabled, Ray will provision and use a public IP address for communication with the head node,
regardless of the value of ``use_internal_ips``. This option can be used in combination with
``use_internal_ips`` to avoid provisioning excess public IPs for worker nodes (i.e., communicate
among nodes using private IPs, but provision a public IP for head node communication only). If
``use_internal_ips`` is ``False``, then this option has no effect.

* **Required:** No
* **Importance:** Low
* **Type:** Boolean
* **Default:** ``False``

.. tab-item:: GCP

Not available.

.. tab-item:: vSphere

Not available.

.. _cluster-configuration-security-group:

``provider.security_group``
37 changes: 37 additions & 0 deletions python/ray/autoscaler/_private/_azure/config.py
@@ -97,6 +97,43 @@ def _configure_resource_group(config):
subnet_mask = "10.{}.0.0/16".format(random.randint(1, 254))
logger.info("Using subnet mask: %s", subnet_mask)

# Copy over properties from existing subnet.
# Addresses issue (https://github.com/Azure/azure-quickstart-templates/issues/2786)
# where existing subnet properties will get overwritten unless explicitly specified
# during multiple deployments even if vnet/subnet do not change.
# May eventually be fixed by passing empty subnet list if they already exist:
# https://techcommunity.microsoft.com/t5/azure-networking-blog/azure-virtual-network-now-supports-updates-without-subnet/ba-p/4067952
list_by_rg = get_azure_sdk_function(
client=resource_client.resources, function_name="list_by_resource_group"
)
existing_vnets = list(
list_by_rg(
resource_group,
f"substringof('{unique_id}', name) and "
"resourceType eq 'Microsoft.Network/virtualNetworks'",
)
)
if len(existing_vnets) > 0:
vnid = existing_vnets[0].id
get_by_id = get_azure_sdk_function(
client=resource_client.resources, function_name="get_by_id"
)
subnet = get_by_id(vnid, resource_client.DEFAULT_API_VERSION).properties[
"subnets"
][0]
template_vnet = next(
(
rs
for rs in template["resources"]
if rs["type"] == "Microsoft.Network/virtualNetworks"
),
None,
)
if template_vnet is not None:
template_subnets = template_vnet["properties"].get("subnets")
if template_subnets is not None:
template_subnets[0]["properties"].update(subnet["properties"])

# Get or create an MSI name and resource group.
# Defaults to current resource group if not provided.
use_existing_msi = (
