
900MB OOM Error #888

Closed
anfractuosity opened this issue Jul 17, 2018 · 17 comments

Comments

@anfractuosity

Hi,

Today I got an OOM error; mosquitto was killed at about 900 MB, if I'm reading
the logs correctly:

Using version:
mosquitto version 1.4.10 (build date Fri, 22 Dec 2017 08:19:25 +0000)

Jul 17 15:33:24 raspberrypi kernel: mosquitto invoked oom-killer: gfp_mask=0x14040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null),  order=0, oom_score_adj=0
Jul 17 15:33:24 raspberrypi kernel: mosquitto cpuset=/ mems_allowed=0
Jul 17 15:33:24 raspberrypi kernel: CPU: 2 PID: 5117 Comm: mosquitto Tainted: G         C      4.14.52-v7+ #1123
Jul 17 15:33:24 raspberrypi kernel: Hardware name: BCM2835
Jul 17 15:33:24 raspberrypi kernel: [<8010ffd8>] (unwind_backtrace) from [<8010c240>] (show_stack+0x20/0x24)
Jul 17 15:33:24 raspberrypi kernel: [<8010c240>] (show_stack) from [<80785e84>] (dump_stack+0xd4/0x118)
Jul 17 15:33:24 raspberrypi kernel: [<80785e84>] (dump_stack) from [<80223f94>] (dump_header+0xac/0x208)
Jul 17 15:33:24 raspberrypi kernel: [<80223f94>] (dump_header) from [<802232fc>] (oom_kill_process+0x478/0x584)
Jul 17 15:33:24 raspberrypi kernel: [<802232fc>] (oom_kill_process) from [<80223c5c>] (out_of_memory+0x124/0x334)
Jul 17 15:33:24 raspberrypi kernel: [<80223c5c>] (out_of_memory) from [<80229718>] (__alloc_pages_nodemask+0x1060/0x11d8)
Jul 17 15:33:24 raspberrypi kernel: [<80229718>] (__alloc_pages_nodemask) from [<80274d28>] (new_slab+0x454/0x558)
Jul 17 15:33:24 raspberrypi kernel: [<80274d28>] (new_slab) from [<80276b6c>] (___slab_alloc.constprop.11+0x228/0x2c0)
Jul 17 15:33:24 raspberrypi kernel: [<80276b6c>] (___slab_alloc.constprop.11) from [<80276c48>] (__slab_alloc.constprop.10+0x44/0x90)
Jul 17 15:33:24 raspberrypi kernel: [<80276c48>] (__slab_alloc.constprop.10) from [<802773e0>] (kmem_cache_alloc+0x1f4/0x230)
Jul 17 15:33:24 raspberrypi kernel: [<802773e0>] (kmem_cache_alloc) from [<80674cfc>] (__alloc_skb+0x4c/0x144)
Jul 17 15:33:24 raspberrypi kernel: [<80674cfc>] (__alloc_skb) from [<806e6d70>] (sk_stream_alloc_skb+0x5c/0x1d0)
Jul 17 15:33:24 raspberrypi kernel: [<806e6d70>] (sk_stream_alloc_skb) from [<806e7c88>] (tcp_sendmsg_locked+0x6bc/0xe14)
Jul 17 15:33:24 raspberrypi kernel: [<806e7c88>] (tcp_sendmsg_locked) from [<806e8418>] (tcp_sendmsg+0x38/0x4c)
Jul 17 15:33:24 raspberrypi kernel: [<806e8418>] (tcp_sendmsg) from [<80714c78>] (inet_sendmsg+0x48/0xe4)
Jul 17 15:33:24 raspberrypi kernel: [<80714c78>] (inet_sendmsg) from [<8066d6c0>] (sock_sendmsg+0x24/0x34)
Jul 17 15:33:24 raspberrypi kernel: [<8066d6c0>] (sock_sendmsg) from [<8066d768>] (sock_write_iter+0x98/0xcc)
Jul 17 15:33:24 raspberrypi kernel: [<8066d768>] (sock_write_iter) from [<8028adc0>] (__vfs_write+0xe0/0x138)
Jul 17 15:33:24 raspberrypi kernel: [<8028adc0>] (__vfs_write) from [<8028afe0>] (vfs_write+0xb4/0x1bc)
Jul 17 15:33:24 raspberrypi kernel: [<8028afe0>] (vfs_write) from [<8028b230>] (SyS_write+0x54/0xb0)
Jul 17 15:33:24 raspberrypi kernel: [<8028b230>] (SyS_write) from [<80108060>] (ret_fast_syscall+0x0/0x28)
Jul 17 15:33:24 raspberrypi kernel: Mem-Info:
Jul 17 15:33:24 raspberrypi kernel: active_anon:112310 inactive_anon:112350 isolated_anon:0
                                     active_file:157 inactive_file:559 isolated_file:0
                                     unevictable:0 dirty:0 writeback:0 unstable:0
                                     slab_reclaimable:1654 slab_unreclaimable:2468
                                     mapped:1013 shmem:463 pagetables:724 bounce:0
                                     free:4490 free_pcp:0 free_cma:1226
Jul 17 15:33:24 raspberrypi kernel: Node 0 active_anon:449240kB inactive_anon:449400kB active_file:628kB inactive_file:2236kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:4052kB dirty:0kB writeback:0kB shmem:1852kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Jul 17 15:33:24 raspberrypi kernel: Normal free:17960kB min:16384kB low:20480kB high:24576kB active_anon:448940kB inactive_anon:449400kB active_file:556kB inactive_file:2192kB unevictable:0kB writepending:0kB present:970752kB managed:949452kB mlocked:0kB kernel_stack:880kB pagetables:2896kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:4904kB
Jul 17 15:33:24 raspberrypi kernel: lowmem_reserve[]: 0 0
Jul 17 15:33:24 raspberrypi kernel: Normal: 135*4kB (UMEC) 136*8kB (UMEC) 107*16kB (UMEC) 59*32kB (UE) 21*64kB (UE) 5*128kB (UE) 4*256kB (UEC) 3*512kB (UEC) 2*1024kB (U) 1*2048kB (U) 1*4096kB (C) = 17964kB
Jul 17 15:33:24 raspberrypi kernel: 1274 total pagecache pages
Jul 17 15:33:24 raspberrypi kernel: 21 pages in swap cache
Jul 17 15:33:24 raspberrypi kernel: Swap cache stats: add 27498, delete 27477, find 981/1293
Jul 17 15:33:24 raspberrypi kernel: Free swap  = 0kB
Jul 17 15:33:24 raspberrypi kernel: Total swap = 102396kB
Jul 17 15:33:24 raspberrypi kernel: 242688 pages RAM
Jul 17 15:33:24 raspberrypi kernel: 0 pages HighMem/MovableOnly
Jul 17 15:33:24 raspberrypi kernel: 5325 pages reserved
Jul 17 15:33:24 raspberrypi kernel: 2048 pages cma reserved
Jul 17 15:33:24 raspberrypi kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Jul 17 15:33:24 raspberrypi kernel: [  100]     0   100     3198      493      10       0       70             0 systemd-journal
Jul 17 15:33:24 raspberrypi kernel: [  132]     0   132     3584       15       8       0      135         -1000 systemd-udevd
Jul 17 15:33:24 raspberrypi kernel: [  263]     0   263     1324       15       6       0       39             0 cron
Jul 17 15:33:24 raspberrypi kernel: [  268] 65534   268     1324        5       6       0       55             0 thd
Jul 17 15:33:24 raspberrypi kernel: [  272]   105   272     1626        2       7       0      104          -900 dbus-daemon
Jul 17 15:33:24 raspberrypi kernel: [  288]     0   288     1845       21       7       0       89             0 systemd-logind
Jul 17 15:33:24 raspberrypi kernel: [  292]   110   292     1667       70       8       0       62             0 avahi-daemon
Jul 17 15:33:24 raspberrypi kernel: [  305]     0   305     5937        1      11       0      140             0 rsyslogd
Jul 17 15:33:24 raspberrypi kernel: [  308]   110   308     1601        5       6       0       72             0 avahi-daemon
Jul 17 15:33:24 raspberrypi kernel: [  370]     0   370      539       13       5       0       16             0 babeld
Jul 17 15:33:24 raspberrypi kernel: [  438]     0   438     2538       15       9       0      114             0 wpa_supplicant
Jul 17 15:33:24 raspberrypi kernel: [  482]     0   482      524        0       5       0       34             0 hciattach
Jul 17 15:33:24 raspberrypi kernel: [  487]     0   487     1769        0       8       0       76             0 bluetoothd
Jul 17 15:33:24 raspberrypi kernel: [  557]     0   557      737       14       6       0       88             0 dhcpcd
Jul 17 15:33:24 raspberrypi kernel: [  567]     0   567      458        0       5       0       16             0 autossh
Jul 17 15:33:24 raspberrypi kernel: [  571]     0   571     2342       26       9       0      184             0 ssh
Jul 17 15:33:24 raspberrypi kernel: [  578]     0   578     1470        2       6       0      117             0 login
Jul 17 15:33:24 raspberrypi kernel: [  587]     0   587     2552       11       8       0      129         -1000 sshd
Jul 17 15:33:24 raspberrypi kernel: [  590]   107   590     1971       35       7       0       73             0 ntpd
Jul 17 15:33:24 raspberrypi kernel: [  610]  1000   610     2416        0       8       0      180             0 systemd
Jul 17 15:33:24 raspberrypi kernel: [  614]  1000   614     2810       12       8       0      287             0 (sd-pam)
Jul 17 15:33:24 raspberrypi kernel: [  619]  1000   619     1507        2       8       0      302             0 bash
Jul 17 15:33:24 raspberrypi kernel: [ 3807]     0  3807     2882        2      10       0      183             0 sshd
Jul 17 15:33:24 raspberrypi kernel: [ 3822]  1000  3822     2916        0      10       0      192             0 sshd
Jul 17 15:33:24 raspberrypi kernel: [ 3825]  1000  3825     1502        2       8       0      297             0 bash
Jul 17 15:33:24 raspberrypi kernel: [ 4786]     0  4786     2882        2      10       0      183             0 sshd
Jul 17 15:33:24 raspberrypi kernel: [ 4795]  1000  4795     2882        0      10       0      189             0 sshd
Jul 17 15:33:24 raspberrypi kernel: [ 4798]  1000  4798     1490       33       8       0      253             0 bash
Jul 17 15:33:24 raspberrypi kernel: [ 5117]   108  5117   247113   223924     487       0    21061             0 mosquitto
Jul 17 15:33:24 raspberrypi kernel: [ 5233]  1000  5233     1338      118       6       0        0             0 gwlogger
Jul 17 15:33:24 raspberrypi kernel: Out of memory: Kill process 5117 (mosquitto) score 933 or sacrifice child
Jul 17 15:33:24 raspberrypi kernel: Killed process 5117 (mosquitto) total-vm:988452kB, anon-rss:894516kB, file-rss:1180kB, shmem-rss:0kB
Jul 17 15:33:24 raspberrypi kernel: oom_reaper: reaped process 5117 (mosquitto), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Cheers
Chris

@karlp
Contributor

karlp commented Jul 17, 2018

I presume you weren't expecting it? It's quite easy to reach very high memory consumption with a lot of queued messages and offline subscribers, for instance. I highly recommend at least 1.4.15 (the last 1.4 release) or 1.5, both of which include fixes for known memory and performance issues. If you still find this unexpected, you'll need to elaborate on what your scenario is and what you are expecting.

@anfractuosity
Author

anfractuosity commented Jul 17, 2018

Hmm, I wasn't expecting it to consume that much memory. If the internet connection goes down, will it continue to buffer data in memory indefinitely?

My setup is:

multiple pis (publishing MQTT messages) ---- pi (broker1) connected to Internet -------- bridge -------- broker2 

I'm assuming the issue may have occurred because broker1 lost connectivity to the Internet, so it started storing the messages in memory?

Is it possible to flush messages straight to disk instead of consuming memory, or to flush to disk when memory usage goes above X MB?
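
In case it's relevant, the bridge section on broker1 looks roughly like this (hostname and topic pattern here are placeholders, not the real values):

    # bridge from broker1 out to broker2 (illustrative names)
    connection bridge-to-broker2
    address broker2.example.net:1883
    # forward all topics outwards at QoS 1
    topic # out 1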

@karlp
Contributor

karlp commented Jul 17, 2018

Yes and no. Persistence at present simply means the in-memory database is written to disk periodically; it won't reduce your RAM usage at all. (And you wouldn't really want it to; you'd just be churning the disk, reading little pieces all the time, so that you weren't "using RAM".)

No, there are no magic knobs for "I prefer disk I/O over RAM usage", and no thresholds for that. I still highly recommend running a more recent version for starters.
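
For reference, persistence is controlled by roughly these options, which just snapshot the in-memory database on a timer (values here are illustrative):

    persistence true
    persistence_location /var/lib/mosquitto/
    # write the in-memory database to disk every 30 minutes
    autosave_interval 1800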

@toast-uz
Contributor

I don't think this is a bug in mosquitto; it's a matter of OS memory management. The OOM error comes from memory being overcommitted. You can get some hints by Googling "memory overcommit".

@toast-uz
Contributor

Or could you resize your swap? Either way, I think this should be solved by tuning your OS; you can get more useful advice on an OS forum.
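
On Raspbian the swap size is usually set in /etc/dphys-swapfile, for example (the value here is only illustrative; the Raspbian default is 100 MB):

    CONF_SWAPSIZE=512

and then the dphys-swapfile service needs to be restarted for the change to take effect.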

@anfractuosity
Author

It looks like max_queued_messages and max_queued_bytes solve my issue. It would be nice to have the ability to set a global threshold too, in addition to the per-client ones.
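
For reference, the per-client limits go in mosquitto.conf roughly like this (the values are just what I'm experimenting with, not a recommendation):

    # cap the number of QoS 1/2 messages queued per client
    max_queued_messages 1000
    # cap the total number of bytes queued per client
    max_queued_bytes 10000000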

@karlp
Contributor

karlp commented Jul 17, 2018

@anfractuosity that would be #100

@toast-uz
Contributor

I guess @anfractuosity's situation can be resolved with max_queued_... and without #100, because his broker's uplink is only a single bridge.

@anfractuosity
Author

anfractuosity commented Jul 19, 2018

I have an arbitrary number of clients behind the bridge though. Doesn't that mean there are multiple clients, so a global value could be useful? I'm assuming each of those clients has a separate queue at the bridge, before forwarding?

@toast-uz
Contributor

Is your issue at the pi (broker1)? If so, there is only one queue for the uplink bridge context, even though there are a lot of clients behind the bridge target.

@anfractuosity
Author

Sorry, I'm a little confused.

Yeah the issue was at broker 1.

I was under the impression the broker has queues for each client?

"Currently (1.1.x) mosquitto broker allows configuration of the maximum number of outstanding messages that are queued per client" seems to indicate there are multiple queues?

@toast-uz
Contributor

Your issue is not with receiving messages from a lot of clients into broker 1 (multiple queues), but with sending messages from broker 1 (one queue) to the single bridge target, right? I think controlling that one sending queue will be a solution for your issue.

@anfractuosity
Author

Thanks for the info. I'm not sure, though, which queue 'max_queued_messages' controls.

I'll have a look at the code when I have some time.

I'll do some more analysis soon too, as I'm going to set up a small testbed and can then play with different queue values.

Cheers

@karlp
Contributor

karlp commented Jul 22, 2018

Queues are per client, and there are only limits on those per-client queues. #100 is a feature request to add broker-wide limits as well.

@anfractuosity
Author

ah, gotcha, cheers

@toast-uz
Contributor

@anfractuosity could you close this issue? Your issue is definitely not a bug in mosquitto; this thread has turned into a consultation about your environment. If you want to continue the discussion, please use the mosquitto-dev mailing list.

@anfractuosity
Author

Sure yeah

lock bot locked as resolved and limited conversation to collaborators Aug 7, 2019