Hibernate couch_stream after each write #510
In COUCHDB-1946, Adam Kocoloski investigated a memory explosion that occurred when replicating databases with large attachments (npm fullfat). He stabilized memory usage at a much lower level by hibernating couch_stream after each write. This increases CPU utilization when writing attachments, but it should reduce memory utilization. This patch is the single change that effected a ~70% reduction in memory. The spawn of couch_stream has not been altered to change the fullsweep_after setting, in part because that can be adjusted on the erl command line if desired (-env ERL_FULLSWEEP_AFTER 0).
Reminder to myself that if we agree to do this in 2.x, we might want to do it in the 1.6.x branch as well.
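For comparison, the per-process alternative that this patch deliberately leaves alone could look roughly like the sketch below. It is not couch_stream code: the module name fullsweep_sketch and its trivial callbacks are hypothetical, and only gen_server:start_link/3 with the spawn_opt option and the fullsweep_after spawn option are standard OTP.

```erlang
-module(fullsweep_sketch).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Hypothetical server used only for illustration; not couch_stream.
start_link() ->
    %% fullsweep_after = 0 disables generational collection for this
    %% one process, so each of its garbage collections is a full sweep.
    %% Other processes keep the VM-wide default.
    gen_server:start_link(?MODULE, [], [{spawn_opt, [{fullsweep_after, 0}]}]).

init([]) ->
    {ok, nostate}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Setting the option at spawn time scopes the extra collection work to one process, whereas the ERL_FULLSWEEP_AFTER environment setting changes the default for every process in the VM.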
@@ -259,7 +259,7 @@ handle_call({write, Bin}, _From, Stream) ->
         buffer_len=0,
         md5=Md5_2,
         identity_md5=IdenMd5_2,
-        identity_len=IdenLen + BinSize}};
+        identity_len=IdenLen + BinSize}, hibernate};
I don't understand the changes. Why use hibernate?
@savanmorya There's a known issue with some uses of large (> 64 bytes) binaries when the process using them doesn't do much work. Here are two good writeups:
https://blog.bugsense.com/post/74179424069/erlang-binary-garbage-collection-a-lovehate
https://blog.heroku.com/logplex-down-the-rabbit-hole
The reason for hibernate is that it forces a full garbage collection, whereas erlang:garbage_collect(self()) won't always clean out all binary usage. As Fred says, it's the least ugly solution when it's available and the processes are known.
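To make the mechanism concrete, here is a minimal sketch of a gen_server that asks to be hibernated after every write, in the same way the patched handle_call clause does. The module hibernate_sketch, its write/2 and total/1 functions, and its integer state are hypothetical stand-ins; the real couch_stream state record is not reproduced here.

```erlang
-module(hibernate_sketch).
-behaviour(gen_server).

-export([start_link/0, write/2, total/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Hypothetical client API: hand a binary to the server.
write(Pid, Bin) when is_binary(Bin) ->
    gen_server:call(Pid, {write, Bin}).

%% Hypothetical client API: number of bytes written so far.
total(Pid) ->
    gen_server:call(Pid, total).

init([]) ->
    {ok, 0}.

%% Returning hibernate as the fourth element of the reply tuple makes
%% the process go into hibernation while it waits for the next message.
%% Hibernation garbage-collects the process and shrinks its heap, so
%% references to binaries it no longer needs are dropped.
handle_call({write, Bin}, _From, Written) ->
    {reply, ok, Written + byte_size(Bin), hibernate};
handle_call(total, _From, Written) ->
    {reply, Written, Written}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

The cost is that the process has to be woken up and given a fresh heap on every subsequent message, which is where the extra CPU use during attachment writes comes from.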
+1
+1 for 2.0.0 and 1.6.x from @davisp, see #510 for details.
+1
Testing recommendations
Replicate a database with a lot of attachments and observe memory usage with and without this patch.
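One possible way to observe memory from a remote shell while the replication runs is sketched below. The module mem_probe and its snapshot/1 function are hypothetical helpers, not part of CouchDB. The sketch assumes the pid refers to a live process, and it relies on the binary item of erlang:process_info/2, whose tuple format the OTP documentation describes as subject to change.

```erlang
-module(mem_probe).
-export([snapshot/1]).

%% Returns the VM-wide binary memory plus, for one live process, its
%% total memory and the summed size of the binaries it references.
%% Crashes with badmatch if Pid is no longer alive.
snapshot(Pid) when is_pid(Pid) ->
    {binary, Refs} = erlang:process_info(Pid, binary),
    {memory, ProcBytes} = erlang:process_info(Pid, memory),
    #{vm_binary_bytes => erlang:memory(binary),
      process_bytes => ProcBytes,
      referenced_binary_bytes =>
          lists:sum([Size || {_Id, Size, _RefCount} <- Refs])}.
```

Calling mem_probe:snapshot(Pid) periodically for a stream process, once with and once without the patch, gives two series of numbers to compare.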
JIRA issue number
COUCHDB-1946