Remote change-stream causes zombie 'after save' handlers #4317

Closed
Wizarth opened this issue Mar 10, 2020 · 5 comments

Wizarth commented Mar 10, 2020

Steps to reproduce

When using the remote connector, connecting to the change-stream endpoint creates an observer on the model that is never removed.

With this sandbox, run the following in separate shells:

NODE_ENV=backend node server/server.js
NODE_ENV=frontend node server/server.js
node test.js

Current Behavior

backend is on localhost:3000
frontend is on localhost:3010

backend is using in-memory db for Test model.
frontend is using loopback-connector-remote, connecting to backend, for Test model.

server.js checks once per second how many _observers['after save'] are registered on the Test model.
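
The check amounts to something like the following boot script (a sketch for illustration; the file name is hypothetical and the actual server.js may differ):

```js
// server/boot/log-observers.js (hypothetical file name) - logs the number of
// 'after save' observers on the Test model once per second. _observers is an
// internal property of loopback-datasource-juggler models, used here only for
// debugging.
module.exports = function (app) {
  setInterval(function () {
    var observers = app.models.Test._observers['after save'] || [];
    console.log('Test "after save" observers: %d', observers.length);
  }, 1000);
};
```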

test.js runs two tests. The first connects 100 times to http://localhost:3000/api/Tests/change-stream (using request). After ten seconds, these requests are aborted.

During this time, the backend loopback instance reports 100 after save observers. When the requests are disconnected, the number of observers drops to zero.

In the second test, 100 connections to http://localhost:3010/api/Tests/change-stream are created (the frontend). After ten seconds, these requests are aborted.

During this time, the backend loopback instance reports 100 after save observers. When the requests are disconnected, the number of observers does not drop.

Terminating the frontend instance does not cause the number of observers to drop.
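
Both tests follow roughly this pattern (a sketch; the actual test.js in the sandbox may differ in details):

```js
// Rough outline of test.js: open 100 change-stream connections, hold them for
// ten seconds, then abort them all.
var request = require('request');

function openConnections(url, count) {
  var reqs = [];
  for (var i = 0; i < count; i++) {
    // Consume the stream so the connection stays open.
    reqs.push(request.get({ url: url }).on('data', function () {}));
  }
  return reqs;
}

function runTest(url, done) {
  var reqs = openConnections(url, 100);
  setTimeout(function () {
    reqs.forEach(function (r) { r.abort(); });
    done();
  }, 10000);
}

// Test 1: direct connections to the backend.
runTest('http://localhost:3000/api/Tests/change-stream', function () {
  // Test 2: connections via the frontend, which proxies to the backend
  // through loopback-connector-remote.
  runTest('http://localhost:3010/api/Tests/change-stream', function () {
    console.log('done');
  });
});
```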

Expected Behavior

Disconnecting the HTTP request should remove the observer.
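
Conceptually, the expected cleanup looks like this (a sketch of the expected wiring, not the actual loopback source; Test, res and next are assumed to be the model and the REST handler's response/next):

```js
// The change stream registers an 'after save' observer when it is created;
// destroying the stream when the HTTP connection closes is what should
// unregister it again.
Test.createChangeStream(function (err, changes) {
  if (err) return next(err);
  changes.pipe(res);
  res.on('close', function () {
    changes.destroy(); // expected to trigger removeObserver('after save', ...)
  });
});
```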

Link to reproduction sandbox

https://github.com/Wizarth/loopback-sandbox/tree/change-stream_Observer

Additional information

Node version:

  • win32 x64 10.16.0
  • linux x64 8.14.0

Sandbox:
[email protected] C:...\loopback-sandbox
+-- [email protected]
+-- [email protected]
+-- [email protected]
+-- [email protected]
+-- [email protected]

We are using a setup where front-end Loopback servers communicate with a single back-end Loopback server. The back-end server exists primarily to centralize event handling, while the front end does as much per-session logic as possible. The front end is connected to the back end using loopback-connector-remote (which uses strong-remoting).

The structure is NGINX -> front-end Loopback <> back-end Loopback <> various internal services.

We've set up a web app that uses the JS EventStream class to connect to the persisted model's change-stream endpoint. What we're finding is that, over time, the CPU usage of the back-end Loopback instance increases every time the persisted model is updated, eventually reaching multiple seconds of 100% processor usage by this process, which has all sorts of knock-on effects.

By adding logging to loopback-datasource-juggler/lib/observer.js, we're seeing that once every two minutes a new 'after save' observer is attached on the back-end server, without any corresponding call to removeObserver. (change-stream sends an end event after two minutes of inactivity, which our client responds to by reconnecting.)
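
For illustration, an equivalent way to see this without editing the library is to wrap the model's observe/removeObserver methods, roughly like this sketch (the logging we actually used was added inside lib/observer.js itself):

```js
// Debug-only sketch: wrap observe()/removeObserver() on the Test model to log
// every registration and removal of hooks.
var Test = app.models.Test;
var origObserve = Test.observe.bind(Test);
var origRemoveObserver = Test.removeObserver.bind(Test);

Test.observe = function (operation, listener) {
  var result = origObserve(operation, listener);
  console.log('observe(%j): now %d handler(s)', operation,
    (Test._observers[operation] || []).length);
  return result;
};

Test.removeObserver = function (operation, listener) {
  console.log('removeObserver(%j)', operation);
  return origRemoveObserver(operation, listener);
};
```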

We've also found that connecting NGINX to the back end Loopback for the specific change-stream API doesn't show this behavior.

The provided sandbox fork is an artificial test case of connecting 100 clients at once, to simulate the connections created over time.

Wizarth added the bug label Mar 10, 2020

bajtos commented Mar 10, 2020

Thank you @Wizarth for a detailed bug report.

It makes me wonder: is it possible to recreate the behavior you observed in an automated LoopBack test? We already have tests in place to verify that "after save" hooks are removed when the change stream is closed, see test/change-stream.test.js. These tests call createChangeStream directly from JavaScript; perhaps we need to add a test invoking the method via the REST API?
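
Such a test might look roughly like this (a hypothetical sketch, not existing loopback test code; it assumes the test boots the sandbox app in-process, calls app.listen(3000), and exposes the Test model):

```js
var http = require('http');

it('removes the "after save" observer when the REST stream is aborted', function (done) {
  var Test = app.models.Test;
  var before = (Test._observers['after save'] || []).length;

  var req = http.get('http://localhost:3000/api/Tests/change-stream', function (res) {
    res.on('data', function () {});
    // Give the server a moment to register the observer, then drop the connection.
    setTimeout(function () { req.abort(); }, 200);
  });

  req.on('abort', function () {
    // Allow the server a moment to run its cleanup, then compare counts.
    setTimeout(function () {
      var after = (Test._observers['after save'] || []).length;
      if (after !== before) return done(new Error('"after save" observer leaked'));
      done();
    }, 200);
  });
});
```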

Maybe the problem is specific to the way Node.js, strong-remoting and/or loopback-connector-remote handle keep-alive connections?

I quickly checked our source code & git history; it looks like part of this problem was fixed by #3474 in July 2017, which I guess explains why the problem does not manifest when you connect to your LoopBack backend via NGINX.

bajtos added the major label Mar 10, 2020

Wizarth commented Mar 11, 2020

I was reading the source of strong-remoting, and found that json:true is set for the request.
I updated my test to set json:true as well, and found that both cases (direct connection and forwarding via another loopback instance) cause the zombie handlers. That narrows things down.
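
Roughly, the change to the test amounted to this (a sketch, not the exact code from the sandbox):

```js
var request = require('request');

// Same request as before, with json:true added - the option strong-remoting
// sets on its own outgoing requests.
var req = request.get({
  url: 'http://localhost:3000/api/Tests/change-stream',
  json: true,
});
req.on('data', function () {});
setTimeout(function () { req.abort(); }, 10000);
```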

Edit: I haven't been able to narrow it down further, but I suspect the fault lies in strong-remoting/lib/http-context.js's use of mux-demux.


Wizarth commented Mar 23, 2020

I've narrowed it down to a unit test on strong-remoting, and have submitted an issue there.


Wizarth commented Apr 8, 2020

Fixing strongloop/strong-remoting#482 solved the problem where directly connecting to the backend with json:true would leak handlers.

There remains an issue when forwarding - the handlers survive until the 2 minute change-stream timeout occurs (or some other timeout?), and are then cleaned up.

I think this is an issue in strong-remoting http-invocation - again needing to close the mux-demux instance. Since this does clean up after time, it's a lower priority for me to work on and I don't know if/when I'll be able to address it.
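
For reference, the kind of cleanup being suggested looks roughly like this (a generic sketch, not strong-remoting's actual code; the wrapResponse name and the exact event wiring are illustrative assumptions):

```js
var MuxDemux = require('mux-demux');

// Illustrative pattern: end the mux-demux instance when the underlying HTTP
// response goes away, so streams built on top of it also close and their
// observers get removed.
function wrapResponse(res) {
  var mdm = MuxDemux();
  res.pipe(mdm);

  var cleaned = false;
  function cleanup() {
    if (cleaned) return;
    cleaned = true;
    mdm.end(); // without this, consumers of mdm never see the stream close
  }
  res.on('end', cleanup);
  res.on('close', cleanup);
  res.on('error', cleanup);

  return mdm;
}
```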


bajtos commented May 11, 2020

Thank you @Wizarth for confirming which part of the problem is fixed now 👍

> There remains an issue when forwarding - the handlers survive until the 2 minute change-stream timeout occurs (or some other timeout?), and are then cleaned up.
>
> I think this is an issue in strong-remoting http-invocation - again needing to close the mux-demux instance. Since this does clean up after time, it's a lower priority for me to work on and I don't know if/when I'll be able to address it.

Let's close this GitHub issue as resolved then. If you ever get time to look into the remaining problem yourself, feel free to go ahead and open a pull request to fix it; no need to open a new issue first.

bajtos closed this as completed May 11, 2020