
PIP-121: Pulsar cluster level auto failover on client side #13315

Closed
hangc0276 opened this issue Dec 15, 2021 · 14 comments

@hangc0276
Contributor

hangc0276 commented Dec 15, 2021

Motivation

We have geo-replication to support Pulsar cluster level failover. We can set up Pulsar cluster A as the primary cluster in data center A, and Pulsar cluster B as the backup cluster in data center B, then configure geo-replication between cluster A and cluster B. All clients connect to the Pulsar cluster through DNS. If cluster A goes down, we have to switch the DNS to point from cluster A to cluster B. After the clients resolve to cluster B, they can produce and consume messages normally. After cluster A recovers, the administrator has to switch the DNS back to cluster A.

However, the current method has two shortcomings.

  1. The administrator has to monitor the status of all Pulsar clusters and switch the DNS as soon as possible when cluster A goes down. The switch and recovery are not automatic, and the recovery time depends on the administrator, which puts the administrator under heavy load.
  2. Both the Pulsar client and the DNS system have caches. When the administrator switches the DNS from cluster A to cluster B, it takes some time for the caches to expire, which delays client recovery and causes produce/consume failures in the meantime.

Goal

It's better to provide an automatic cluster-level failure recovery mechanism to make Pulsar cluster failover more effective. We should support Pulsar clients automatically switching from cluster A to cluster B when they detect, according to the configured detection policy, that cluster A is down, and switching back to cluster A once it has recovered. The reason for switching back to cluster A is that most applications are likely deployed in data center A and have a low network cost when communicating with Pulsar cluster A. If they keep using Pulsar cluster B, the network cost is high, which causes high produce/consume latency.

In order to mitigate the DNS cache problem, we should provide an administrator-controlled switch provider that lets administrators update service URLs directly.

In the end, we should provide both an automatic service URL switch provider and an administrator-controlled switch provider.

Design

We have already provided the ServiceUrlProvider interface to support custom service URLs. In order to support automatic cluster-level failure recovery, we can supply different ServiceUrlProvider implementations. For the current requirements, we can provide AutoClusterFailover and ControlledClusterFailover.

AutoClusterFailover

In order to support automatic switching from the primary cluster to a secondary one, we can provide a probe task that checks the availability of the primary and secondary clusters. When it finds that the primary cluster has been failing for more than failoverDelayMs, it switches to the secondary cluster by calling updateServiceUrl. After switching to the secondary cluster, AutoClusterFailover continues to probe the primary cluster. If the primary cluster comes back and stays active for switchBackDelayMs, it switches back to the primary cluster.
The APIs are listed as follows.

In order to support multiple secondary clusters, we use a List to store the secondary cluster URLs. When the primary cluster probe has been failing for failoverDelayMs, the provider starts probing the secondary clusters one by one; once it finds an active cluster, it switches to it. Notice: if you configure multiple clusters, you should turn on cluster-level geo-replication to keep the topic data in sync across the primary and all secondary clusters. Otherwise, the topic data may be spread across different clusters and consumers won't get the complete data of the topic.

In order to support different authentication configurations between clusters, we provide authentication-related configurations that are updated along with the target cluster.

public class AutoClusterFailover implements ServiceUrlProvider {
    private PulsarClient pulsarClient;
    private volatile String currentPulsarServiceUrl;
    private ScheduledExecutorService executor;
    private long intervalMs;

    private AutoClusterFailover(AutoClusterFailoverBuilderImpl builder) {
        // copy the primary/secondary URLs, delays, and probe interval from the builder
    }

    @Override
    public void initialize(PulsarClient client) {
        this.pulsarClient = client;

        // start the periodic task that probes whether the primary cluster is active
        executor.scheduleAtFixedRate(catchingAndLoggingThrowables(() -> {
            // probe and switch
        }), intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    }

    @Override
    public String getServiceUrl() {
        return this.currentPulsarServiceUrl;
    }

    @Override
    public void close() {
        this.executor.shutdown();
    }

    // probe whether the Pulsar cluster behind the given URL is available
    private boolean probeAvailable(String url, int timeout) {
        // check that the service port is reachable within the timeout (see the sketch below)
    }
}

In order to create an AutoClusterFailover instance, we use the AutoClusterFailoverBuilder interface to build the target instance. The AutoClusterFailoverBuilder interface is located in the pulsar-client-api package.
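As an illustration, here is a minimal sketch of how an application could wire the provider into a client. The builder method names (primary, secondary, failoverDelay, switchBackDelay) and the example URLs are assumptions based on the parameters described above, not a finalized API:

import java.util.Collections;
import java.util.concurrent.TimeUnit;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.ServiceUrlProvider;

public class AutoClusterFailoverExample {
    public static void main(String[] args) throws Exception {
        // One primary cluster plus a list of secondary clusters (hypothetical URLs).
        ServiceUrlProvider failover = AutoClusterFailover.builder()
                .primary("pulsar://cluster-a.example.com:6650")
                .secondary(Collections.singletonList("pulsar://cluster-b.example.com:6650"))
                .failoverDelay(30, TimeUnit.SECONDS)    // primary must fail this long before switching
                .switchBackDelay(60, TimeUnit.SECONDS)  // primary must stay healthy this long to switch back
                .build();

        // The client calls initialize(...) on the provider and follows
        // later updateServiceUrl(...) switches automatically.
        PulsarClient client = PulsarClient.builder()
                .serviceUrlProvider(failover)
                .build();
    }
}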

In the probeAvailable method, we probe the Pulsar service port and check whether the port is open. This probe method has several disadvantages, such as:

  1. We may be connecting to a Pulsar proxy while there are no available brokers behind it.
  2. With Istio on the server side, the connection is always accepted even if the broker is in a bad state.
  3. There might be deadlocks in (all) brokers: the connections get accepted, but the brokers are not able to serve them.

In order to solve this problem, we'd better provide a health check command on the broker side, just like ZooKeeper's ruok command.
We can use the port probe method first, and provide the health check command on the broker side in a next step.
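For reference, a rough sketch of how probeAvailable inside AutoClusterFailover could be implemented, assuming the service URL parses as a standard URI with an explicit host and port:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;

// Consider the cluster available if a TCP connection to the service port
// can be established within the given timeout.
private boolean probeAvailable(String serviceUrl, int timeoutMs) {
    try {
        URI uri = URI.create(serviceUrl);
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(uri.getHost(), uri.getPort()), timeoutMs);
            return true;
        }
    } catch (IOException | RuntimeException e) {
        // Connection refused, timed out, or malformed URL: treat the cluster as unavailable.
        return false;
    }
}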

ControlledClusterFailover

If users want to control the cluster switch operation themselves, they can provide the current service URL through an HTTP service. The ControlledClusterFailover periodically fetches the newest service URL from that HTTP service.
The APIs are listed as follows.

public class ControlledClusterFailover implements ServiceUrlProvider {
    private PulsarClient pulsarClient;
    private volatile String currentPulsarServiceUrl;
    private ScheduledExecutorService executor;

    private ControlledClusterFailover(String defaultServiceUrl, String urlProvider) throws IOException {
        // record the default service URL and the HTTP endpoint to poll
    }

    @Override
    public void initialize(PulsarClient client) {
        this.pulsarClient = client;

        // periodically fetch the controlled configuration and switch the service URL when it changes
        executor.scheduleAtFixedRate(catchingAndLoggingThrowables(() -> {
            // probe and switch
        }), interval, interval, TimeUnit.MILLISECONDS);
    }

    protected ControlledConfiguration fetchControlledConfiguration() throws IOException {
        // call the HTTP service to get the controlled configuration
    }

    @Override
    public String getServiceUrl() {
        return this.currentPulsarServiceUrl;
    }

    @Override
    public void close() {
        this.executor.shutdown();
    }

    protected static class ControlledConfiguration {
        private String serviceUrl;
        private String tlsTrustCertsFilePath;

        private String authPluginClassName;
        private String authParamsString;
    }
}

We define the configuration fetched from the third-party URL provider as a Java bean serialized in JSON format. In the configuration, we provide authentication-related parameters to support clusters that have different authentication configurations. These authentication-related parameters can support all current authentication plugin types.
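For example, the HTTP endpoint might return a JSON document that maps onto the ControlledConfiguration bean above (all values here are hypothetical):

{
  "serviceUrl": "pulsar+ssl://cluster-b.example.com:6651",
  "tlsTrustCertsFilePath": "/path/to/cluster-b-ca.cert.pem",
  "authPluginClassName": "org.apache.pulsar.client.impl.auth.AuthenticationToken",
  "authParamsString": "token:<cluster-b-token>"
}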

In order to create a ControlledClusterFailover instance, we use the ControlledClusterFailoverBuilder interface to build the target instance. The ControlledClusterFailoverBuilder interface is located in the pulsar-client-api package.
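A minimal usage sketch, analogous to the AutoClusterFailover example above; the builder method names (defaultServiceUrl, urlProvider, checkInterval) and the controller endpoint URL are assumptions:

import java.util.concurrent.TimeUnit;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.ServiceUrlProvider;

public class ControlledClusterFailoverExample {
    public static void main(String[] args) throws Exception {
        // The controller endpoint is hypothetical; it serves the JSON configuration shown above.
        ServiceUrlProvider provider = ControlledClusterFailover.builder()
                .defaultServiceUrl("pulsar://cluster-a.example.com:6650")
                .urlProvider("http://failover-controller.example.com/pulsar/serviceUrl")
                .checkInterval(30, TimeUnit.SECONDS)
                .build();

        PulsarClient client = PulsarClient.builder()
                .serviceUrlProvider(provider)
                .build();
    }
}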

API Changes

For the current ServiceUrlProvider interface, we should add a close method to release the resources that the provider allocated, such as a timer thread.

public interface ServiceUrlProvider {
    /**
     * Close the resources that the provider allocated.
     *
     */
    default void close() {
        // do nothing
    }

    /**
     * Update the authentication this client is using.
     *
     * @param authentication the new authentication configuration
     *
     * @throws IOException if the authentication cannot be applied
     */
    void updateAuthentication(Authentication authentication)
            throws IOException;

    /**
     * Update the tlsTrustCertsFilePath this client is using.
     *
     * @param tlsTrustCertsFilePath path to the trusted TLS certificate file
     */
    void updateTlsTrustCertsFilePath(String tlsTrustCertsFilePath);

    /**
     * Update the tlsTrustStorePath and tlsTrustStorePassword this client is using.
     *
     * @param tlsTrustStorePath path to the TLS trust store
     * @param tlsTrustStorePassword password for the TLS trust store
     */
    void updateTlsTrustStorePathAndPassword(String tlsTrustStorePath, String tlsTrustStorePassword);

}

Tests

Add tests for the two service provider implementations.

For AutoClusterFailover, when the primary cluster shuts down, the client should switch to the secondary cluster; and when the primary cluster comes back, it should switch back to the primary.

For ControlledClusterFailover, when the service URL is switched on the HTTP service side, the client should switch to the newest service URL.

@wangjialing218
Contributor

If cluster B has different authentication data from cluster A, such as a different token string, how do we change the authentication settings in the client when auto failover happens?

@hangc0276
Contributor Author

If cluster B has different authentication data from cluster A, such as a different token string, how do we change the authentication settings in the client when auto failover happens?

Cluster A and cluster B should be configured with the same authentication; otherwise, the client has to update its authentication settings when switching clusters.

@wangjialing218
Contributor

Is it possible to automatically update the authentication settings when switching clusters by implementing AuthenticationDataProvider for AutoClusterFailover? We already support configuring geo-replication clusters with different authentication in #10242.

@hangc0276
Contributor Author

Is it possible to automatically update the authentication settings when switching clusters by implementing AuthenticationDataProvider for AutoClusterFailover? We already support configuring geo-replication clusters with different authentication in #10242.

@wangjialing218 Thanks for your suggestion, I have updated the design to support different authentication configurations for different clusters.

@eolivelli
Contributor

The proposal makes sense to me.
This is only a client-side feature, and only for the Java client (initially).

So I would like to see it stated in the title of the PIP

@eolivelli
Contributor

The PIP mentions primary and secondary.

Can we make it more general for an arbitrary number of clusters?

@hpvd

hpvd commented Jan 5, 2022

The PIP mentions primary and secondary.

Can we make it more general for an arbitrary number of clusters?

@eolivelli +1 for this, was thinking in the same direction:
#13316 (comment)

@hangc0276
Contributor Author

In order to support multiple secondary clusters, we use a List to store the secondary cluster URLs. When the primary cluster probe has been failing for failoverDelayMs, the provider starts probing the secondary clusters one by one; once it finds an active cluster, it switches to it. Notice: if you configure multiple clusters, you should turn on cluster-level geo-replication to keep the topic data in sync across the primary and all secondary clusters. Otherwise, the topic data may be spread across different clusters and consumers won't get the complete data of the topic.

@eolivelli The secondary just represents the backup clusters, and I use a List to store the secondary clusters to support an arbitrary number of clusters.

@hangc0276
Contributor Author

@eolivelli +1 for this, was thinking in the same direction:

@hpvd Thanks for your suggestion. Choosing a cluster from the secondary cluster list according to probe latency is a good idea; I will add a parameter to configure the selection policy to support it.

@hangc0276 hangc0276 changed the title PIP-121: Pulsar cluster level auto failover PIP-121: Pulsar cluster level auto failover on client side Jan 6, 2022
@hangc0276
Contributor Author

The proposal makes sense to me. This is only a client-side feature, and only for the Java client (initially).

So I would like to see it stated in the title of the PIP

@eolivelli I updated the PIP title.

@hpvd

hpvd commented Jan 6, 2022

@eolivelli +1 for this, was thinking in the same direction:

@hpvd Thanks for your suggestion. Choosing a cluster from the secondary cluster list according to probe latency is a good idea; I will add a parameter to configure the selection policy to support it.

@hangc0276 awesome! many thanks!

codelipenghui pushed a commit that referenced this issue Jan 29, 2022
Related to #13315 

### Modification
1. add Pulsar cluster level auto failover
@github-actions

The issue had no activity for 30 days, mark with Stale label.

@hangc0276
Contributor Author

The implementation is #13316, closing this issue.

Nicklee007 pushed a commit to Nicklee007/pulsar that referenced this issue Apr 20, 2022

Related to apache#13315 

### Modification
1. add Pulsar cluster level auto failover
@cocktail828

@hangc0276 Hello! According to the documentation, controlled failover requires a service provider to manage the metadata. Will Pulsar provide this provider service, or do users need to write this service themselves? Thanks!
