
Hibernate couch_stream after each write #510

Merged 1 commit into master on May 9, 2017

Conversation

wohali (Member) commented May 9, 2017

In COUCHDB-1946 Adam Kocoloski investigated a memory explosion resulting
from replication of databases with large attachments (npm fullfat). He
was able to stabilize memory usage to a much lower level by hibernating
couch_stream after each write. While this increases CPU utilization when
writing attachments, it should help reduce memory utilization.

This patch is the single change that effected a ~70% reduction in
memory.

No alteration to the spawn of couch_stream to change the fullsweep_after
setting has been made, in part because this can be adjusted on the erl
command line if desired (erl -env ERL_FULLSWEEP_AFTER 0).
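The pattern the patch relies on can be sketched as a minimal gen_server (a hypothetical module for illustration, not the actual couch_stream code): adding the atom hibernate as the last element of the reply tuple asks the gen_server machinery to hibernate the process after sending the reply.

```erlang
%% Minimal sketch, assuming a write-buffering gen_server; this is
%% illustrative only and not the real couch_stream module.
-module(stream_sketch).
-behaviour(gen_server).
-export([init/1, handle_call/3, handle_cast/2]).

init([]) ->
    {ok, []}.

handle_call({write, Bin}, _From, Buffer) ->
    %% The trailing 'hibernate' makes gen_server call
    %% erlang:hibernate/3 after replying: the call stack is discarded
    %% and a full-sweep garbage collection runs, releasing large
    %% binaries the process no longer references.
    {reply, ok, [Bin | Buffer], hibernate}.

handle_cast(_Msg, Buffer) ->
    {noreply, Buffer}.
```

The trade-off is the CPU cost mentioned above: after each hibernation the next message must wake the process and rebuild its stack before the write is handled.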

Testing recommendations

Replicate a database with a lot of attachments and observe memory usage with and without this patch.

JIRA issue number

COUCHDB-1946

wohali (Member, Author) commented May 9, 2017

Reminder to myself that if we agree to do this in 2.x, we might want to do it in the 1.6.x branch as well.

@@ -259,7 +259,7 @@ handle_call({write, Bin}, _From, Stream) ->
             buffer_len=0,
             md5=Md5_2,
             identity_md5=IdenMd5_2,
-            identity_len=IdenLen + BinSize}};
+            identity_len=IdenLen + BinSize}, hibernate};


I don't understand this change. Why use hibernate?

Member

@savanmorya There's a known issue with some uses of large (> 64 bytes) binaries when the process using them doesn't do much work. Here are two good writeups:

https://blog.bugsense.com/post/74179424069/erlang-binary-garbage-collection-a-lovehate
https://blog.heroku.com/logplex-down-the-rabbit-hole

The reason for hibernate is that it forces a full garbage collection, whereas erlang:garbage_collect(self()) won't always clean out all binary usage. As Fred says, it's the least ugly solution when it's available and the processes are known.
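The distinction drawn in that comment can be sketched in a few lines (an illustrative module with hypothetical names, not part of CouchDB):

```erlang
%% Illustrative contrast between the two approaches discussed above.
-module(gc_sketch).
-export([collect_in_place/0, sleep_compact/0]).

%% erlang:garbage_collect/0 runs a collection in the calling process,
%% but any refc binaries still referenced from the stack or heap
%% survive it, so memory may not actually drop.
collect_in_place() ->
    true = erlang:garbage_collect().

%% erlang:hibernate/3 discards the call stack and compacts the heap
%% with a full sweep before the process waits for its next message;
%% on wake-up it continues at the given M:F(Args).
sleep_compact() ->
    erlang:hibernate(?MODULE, collect_in_place, []).
```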

davisp (Member) left a comment

+1


@wohali wohali merged commit 7c3aef6 into master May 9, 2017
wohali added a commit that referenced this pull request May 9, 2017

+1 for 2.0.0 and 1.6.x from @davisp, see #510 for details.
@janl janl deleted the 1943-attachment-perf branch May 10, 2017 19:05
janl (Member) commented May 10, 2017

> Reminder to myself that if we agree to do this in 2.x, we might want to do it in the 1.6.x branch as well.

+1

wohali (Member, Author) commented May 10, 2017

@janl Already done! f073391
