Connection sharing is corrupted when replicating using a proxy #2271

Closed
kocolosk opened this issue Oct 24, 2019 · 0 comments · Fixed by #2280

Description

In 2505436 we introduced an optimization to share connections to a given host:port across different replications. I believe this functionality breaks replications that use a forward proxy: every connection that goes through the proxy is tossed into the same pool, so requests end up being directed to the wrong hosts.
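
To make the suspected failure mode concrete, here is a minimal sketch of the key collision. It is not the replicator's code: the module name, the `key/2` helper, the `{Host, Port}` tuples, the proxy address and the remote host name are all hypothetical placeholders. It only illustrates that choosing the proxy address as the pool key makes two different endpoints indistinguishable.

```erlang
-module(pool_key_collision).
-export([key/2, demo/0]).

%% Hypothetical stand-in for the key choice described above: when a proxy
%% is configured, the proxy address is used and the real endpoint is ignored.
key(Endpoint, undefined) -> Endpoint;
key(_Endpoint, Proxy) -> Proxy.

demo() ->
    Proxy  = {"127.0.0.1", 3128},            % assumed local forward proxy
    Source = {"example.cloudant.com", 443},  % placeholder remote source
    Target = {"127.0.0.1", 15984},           % dev box target from this report
    %% With a proxy, both endpoints collapse onto the same pool key ...
    true = key(Source, Proxy) =:= key(Target, Proxy),
    %% ... whereas without a proxy they stay distinct.
    false = key(Source, undefined) =:= key(Target, undefined),
    ok.
```

Under these assumptions, a connection opened for the source can be handed out for a request meant for the target, which would match the behaviour described in the steps below.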

Steps to Reproduce

  1. Set up a simple forward proxy. Squid via Homebrew worked for me.
  2. Configure a replication via the proxy where the source and target use different hosts. I used a database hosted in Cloudant as the source and a database on my dev box at 127.0.0.1:15984 as the target.
  3. Watch as the replicator sends requests to the wrong locations. In my case I found that the replication crashed complaining about a 401 Unauthorized against my (admin party) dev setup, but when I turned on ibrowse tracing I saw that the response had actually come from Cloudant.

Expected Behaviour

Replication should replicate data, with each request sent through the proxy to the host it was intended for.

Your Environment

CouchDB 3.0.0-201d5935c on macOS Catalina

Additional context

#1080 also talked about replication failing behind a proxy years ago, but I think this is a different issue. I do agree with the suggestion in that issue that configuring proxies separately for the source and the target makes a lot of sense.

As a test I tried the following patch, which makes the shared connection pool look at the final host:port instead of the proxy host:port:

diff --git a/src/couch_replicator/src/couch_replicator_httpc.erl b/src/couch_replicator/src/couch_replicator_httpc.erl
index e4cf11606..576285983 100644
--- a/src/couch_replicator/src/couch_replicator_httpc.erl
+++ b/src/couch_replicator/src/couch_replicator_httpc.erl
@@ -47,10 +47,11 @@ setup(Db) ->
         http_connections = MaxConns,
         proxy_url = ProxyURL
     } = Db,
-    HttpcURL = case ProxyURL of
-        undefined -> Url;
-        _ when is_list(ProxyURL) -> ProxyURL
-    end,
+    % HttpcURL = case ProxyURL of
+    %     undefined -> Url;
+    %     _ when is_list(ProxyURL) -> ProxyURL
+    % end,
+    HttpcURL = Url,
     {ok, Pid} = couch_replicator_httpc_pool:start_link(HttpcURL,
         [{max_connections, MaxConns}]),
     case couch_replicator_auth:initialize(Db#httpdb{httpc_pool = Pid}) of

This did allow me to get the replication working. It's not a complete fix; if a server had different replications with a given host:port as the final endpoint, but some of them used a proxy and some did not, this code would still mix them together.

A more complete fix would be to expand the key used in the couch_replicator_connection module to include more attributes besides just a single URL.
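
One possible shape for such an expanded key is sketched below. This is an illustration only, not the change that was eventually merged and not the actual structure used by couch_replicator_connection: the module name, the `key/2` helper and the `direct` / `via_proxy` tags are hypothetical, and endpoints are again modelled as plain `{Host, Port}` tuples.

```erlang
-module(pool_key_expanded).
-export([key/2, demo/0]).

%% Hypothetical composite key: the final endpoint is always part of the key,
%% and proxied connections additionally carry the proxy address.
key(Endpoint, undefined) -> {direct, Endpoint};
key(Endpoint, Proxy) -> {via_proxy, Proxy, Endpoint}.

demo() ->
    Proxy  = {"127.0.0.1", 3128},            % assumed local forward proxy
    Source = {"example.cloudant.com", 443},  % placeholder remote source
    Target = {"127.0.0.1", 15984},           % dev box target from this report
    %% Different endpoints behind the same proxy no longer share a key.
    true = key(Source, Proxy) =/= key(Target, Proxy),
    %% The same endpoint reached with and without the proxy is kept apart,
    %% which covers the mixed case the test patch above does not handle.
    true = key(Target, Proxy) =/= key(Target, undefined),
    %% Replications with the same endpoint and the same proxy still share.
    true = key(Target, Proxy) =:= key(Target, Proxy),
    ok.
```

A real key would probably need further attributes as well (for example anything that changes how the connection is established), so treat the two-element version here as the smallest case that separates the scenarios discussed in this issue.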
