Memory leaks #2057
Comments
Could you give some details of what you were doing when this happened? I keep a good eye on valgrind, so I'm keen to find how this could be happening.
I'm not doing anything special. I just started a server with one connection on port 1883 and a dozen connections on TLS port 8883. Connections to 8883 mostly come from Home Assistant and a few other MQTT brokers. As I wrote before, if I find time, I will try to determine precisely what causes the problems that valgrind reports.
I've just tried a single mosquitto_sub connected to 1883, plus a dozen to 8883, with this config:
My valgrind log was this:
It looks like you are using a plugin, is that correct? It really is important to know as many details as possible about what you are doing so I can properly reproduce it. I could be making an assumption about configuration that is different to what you have.
Yes, I have a dedicated plugin that I wrote. However, it is tested separately with valgrind and I did not detect any leaks there. I removed everything from my plugin except the following code:
Compilation:
make all
Building target: libsupla-mosquitto-auth-plugin.so
/etc/mosquitto/mosquitto.conf
/etc/mosquitto/conf.d/supla.conf
Mosquitto built with the following parameters:
make all WITH_MEMORY_TRACKING=no WITH_BRIDGE=no WITH_DOCS=no WITH_CJSON=no
The result of valgrind:
The test was carried out with considerably fewer connections (only 2). In this case "definitely lost" no longer appears, but this may simply be due to less traffic.
Linux **** 4.19.0-5-cloud-amd64 #1 SMP Debian 4.19.37-5+deb10u1 (2019-07-19) x86_64 GNU/Linux
The dlopen related blocks should be fixed, but they aren't critical due to just being a lack of freeing on exit. The |
I am aware of this, but it is harder to spot the relevant reports among a large volume of log output. Leaks appear in my environment quite quickly once I let users in. I will try to find the cause of the problem. Alternatively, I will try to dump the data that is sent to and from clients to a file.
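To make the distinction between the two report categories discussed here concrete, the following is a minimal sketch in C. It is not mosquitto code, and the names `g_cached_name`, `keep_until_exit` and `lose_block` are hypothetical. Built without optimisation and run under `valgrind --leak-check=full --show-leak-kinds=all`, the first allocation should be reported as "still reachable" and the second as "definitely lost".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A global pointer: whatever it refers to at exit is "still reachable". */
static char *g_cached_name = NULL;

/* Never freed before exit, but a pointer to the block survives until the
 * process ends, so Valgrind classifies it as "still reachable". */
static void keep_until_exit(const char *name)
{
    assert(name != NULL);
    free(g_cached_name);                      /* drop any previous copy */
    g_cached_name = malloc(strlen(name) + 1);
    if (g_cached_name != NULL) {
        strcpy(g_cached_name, name);
    }
}

/* The only pointer to the block is discarded, so nothing can ever free it.
 * Valgrind classifies this as "definitely lost". Returns the bytes leaked. */
static size_t lose_block(void)
{
    char *tmp = malloc(64);
    if (tmp == NULL) {
        return 0;
    }
    memset(tmp, 0, 64);
    tmp = NULL;                               /* last reference is gone */
    return 64;
}
```

Blocks of the first kind only add noise to the log, while blocks of the second kind grow with traffic, which is why the "definitely lost" records are the ones worth chasing first.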
You could always generate a suppression file for the dlopen functions:
Off topic: have you run mosquitto under a fuzzer, for example https://lcamtuf.coredump.cx/afl/ ?
It's not something I have done, but others have done and found the odd potential memory leak. I'd be keen to have a fuzzing setup if someone wanted to help with that... :)
I use AFL in my projects. I use this fuzzer to detect possible security vulnerabilities.
I hope that this lot will make their tool available: https://www.mdpi.com/1424-8220/20/18/5194
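As a hedged illustration only (this is not the file the maintainer posted), a suppression file for the loader-related "still reachable" records might look like the sketch below. The entry names and the file name `dlopen.supp` are made up. The `...` line is Valgrind's frame wildcard, and `fun:dlopen*` is intended to match both `dlopen@@GLIBC_2.2.5` and `dlopen_doit` as they appear in the stack traces of this report.

```
# Still-reachable blocks allocated by the dynamic loader during dlopen()
{
   dlopen_still_reachable_malloc
   Memcheck:Leak
   match-leak-kinds: reachable
   fun:malloc
   ...
   fun:dlopen*
}
{
   dlopen_still_reachable_calloc
   Memcheck:Leak
   match-leak-kinds: reachable
   fun:calloc
   ...
   fun:dlopen*
}
# Still-reachable blocks allocated by the dynamic loader during dlclose()
{
   dlclose_still_reachable
   Memcheck:Leak
   match-leak-kinds: reachable
   fun:malloc
   ...
   fun:dlclose
}
```

Such a file can be passed with `--suppressions=dlopen.supp`. Running valgrind with `--gen-suppressions=all` prints ready-made entries for every reported record, which is usually a safer starting point than writing them by hand.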
In my project, I extracted the logic responsible for handling the protocol into a separate project which, instead of reading data from the TCP socket, reads it from a file. The compiled binary takes the path to the file into which AFL writes mutated data. Of course, this doesn't test everything, but it covers the essential part. For fuzzing to be effective, the tested process must start very quickly. I then run several simultaneous tasks on a powerful machine. In my case, the command line looks like this:
https://github.com/SUPLA/supla-core/blob/master/supla-afl/src/supla-afl.cpp
I think in your case you would have to build a separate project that uses libmosquitto, except that libmosquitto would have to read from a file instead of a socket.
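The file-based harness idea described above can be sketched in C roughly as follows. This is neither the supla-afl code nor anything from libmosquitto: `parse_fixed_header` is a hypothetical stand-in for the protocol logic under test (it decodes only the MQTT fixed header, that is the packet type and the variable-length "remaining length" field), and `fuzz_one_file` is an assumed helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the protocol code under test: decodes the MQTT
 * fixed header. Returns the number of header bytes consumed, or -1 if the
 * input is truncated or the remaining-length field exceeds 4 bytes. */
static int parse_fixed_header(const uint8_t *buf, size_t len,
                              uint8_t *type, uint32_t *remaining)
{
    assert(buf != NULL && type != NULL && remaining != NULL);
    if (len < 2) {
        return -1;
    }
    *type = buf[0] >> 4;                      /* packet type: high nibble */

    uint32_t value = 0;
    uint32_t shift = 0;
    for (size_t i = 1; i < len && i <= 4; i++) {
        value |= (uint32_t)(buf[i] & 0x7F) << shift;
        if ((buf[i] & 0x80) == 0) {           /* no continuation bit: done */
            *remaining = value;
            return (int)(i + 1);
        }
        shift += 7;
    }
    return -1;
}

/* Reads the input from a file (the one the fuzzer mutates) instead of a
 * socket and feeds it to the parser.
 * Returns 0 if accepted, 1 if rejected, -1 if the file cannot be read. */
static int fuzz_one_file(const char *path)
{
    uint8_t buf[4096];
    uint8_t type = 0;
    uint32_t remaining = 0;

    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);

    return parse_fixed_header(buf, n, &type, &remaining) < 0 ? 1 : 0;
}
```

In an actual harness, `main` would simply call `fuzz_one_file(argv[1])`. With AFL the input path is conventionally supplied by placing `@@` on the target's command line. Keeping the harness this small is what lets the process start quickly, as the comment above recommends.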
I was able to trace a TCP connection which is causing the leak below (all plugins off). I will try to dump the data that is being sent.
OK, I was able to reproduce the leak reported by valgrind.
All plugins off.
Local broker configuration:
The local broker keeps reconnecting because authorization fails. Stopping the master broker causes valgrind to report a leak.
In addition, I found minor leaks that are not dangerous, but it would be worth organizing broker initialization a little differently. If another running process already occupies the ports the broker is meant to listen on, the program exits with a leak, as below.
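One common way to organize initialization so that an early exit does not leave allocations behind is a single cleanup path. The sketch below shows that pattern under stated assumptions: `struct broker`, `broker_start`, `broker_stop` and the `tracked_*` helpers are hypothetical and are not mosquitto's actual structures or functions, and the `port_in_use` flag merely simulates a failing `bind()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical broker state, for illustration only. */
struct broker {
    char *config;
    int  *listeners;
};

/* Crude allocation counter so the example can check for leaks itself. */
static int live_allocs = 0;

static void *tracked_calloc(size_t n, size_t size)
{
    void *p = calloc(n, size);
    if (p != NULL) {
        live_allocs++;
    }
    return p;
}

static void tracked_free(void *p)
{
    if (p != NULL) {
        live_allocs--;
        free(p);
    }
}

/* Releases everything the broker owns; safe on a partially set up broker. */
static void broker_stop(struct broker *b)
{
    assert(b != NULL);
    tracked_free(b->listeners);
    tracked_free(b->config);
    b->listeners = NULL;
    b->config = NULL;
}

/* Every failure jumps to the same label, so memory allocated before the
 * failure is released no matter which step failed. */
static int broker_start(struct broker *b, int port_in_use)
{
    assert(b != NULL);
    b->config = NULL;
    b->listeners = NULL;

    b->config = tracked_calloc(1, 128);
    if (b->config == NULL) goto fail;

    b->listeners = tracked_calloc(4, sizeof(int));
    if (b->listeners == NULL) goto fail;

    if (port_in_use) goto fail;               /* simulated bind() failure */

    return 0;

fail:
    broker_stop(b);
    return -1;
}
```

With this layout, a failed start leaves nothing allocated, so a valgrind run against an already-occupied port would have no records from the startup path to report.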
Signed-off-by: Przemek Zygmunt <[email protected]>
Running the server under valgrind reveals some memory leaks. I haven't looked at them closely yet. Perhaps it is just memory that is not freed when the process exits, but even so, it is worth cleaning up to make it easier to track leaks that actually occur. If I find a moment, I'll try to fix it.
valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all --track-origins=yes ./mosquitto -c /etc/mosquitto/mosquitto.conf
==9570== Memcheck, a memory error detector
==9570== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==9570== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==9570== Command: src/mosquitto -c /etc/mosquitto/mosquitto.conf
==9570==
^C==9570==
==9570== HEAP SUMMARY:
==9570== in use at exit: 140,570 bytes in 354 blocks
==9570== total heap usage: 117,182 allocs, 116,828 frees, 87,270,794 bytes allocated
==9570==
==9570== 15 bytes in 1 blocks are still reachable in loss record 1 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x4DE4DB9: strdup (strdup.c:42)
==9570== by 0x114017: context__init (context.c:76)
==9570== by 0x11D1F4: net__socket_accept (net.c:185)
==9570== by 0x11CE61: mux_epoll__handle (mux_epoll.c:215)
==9570== by 0x11C371: mosquitto_main_loop (loop.c:177)
==9570== by 0x10E088: main (mosquitto.c:565)
==9570==
==9570== 16 bytes in 1 blocks are still reachable in loss record 2 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x40118B7: allocate_dtv_entry (dl-tls.c:582)
==9570== by 0x40118B7: allocate_and_init (dl-tls.c:607)
==9570== by 0x40118B7: tls_get_addr_tail.isra.0 (dl-tls.c:787)
==9570== by 0x40171F7: __tls_get_addr (tls_get_addr.S:55)
==9570== by 0x55ACC23: ???
==9570== by 0x559E2C6: ???
==9570== by 0x4014174: _dl_close_worker (dl-close.c:288)
==9570== by 0x4014174: _dl_close_worker (dl-close.c:111)
==9570== by 0x401486D: _dl_close (dl-close.c:842)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4E91BBE: _dl_catch_error (dl-error-skeleton.c:215)
==9570== by 0x484F974: _dlerror_run (dlerror.c:163)
==9570== by 0x484F363: dlclose (dlclose.c:46)
==9570== by 0x12994D: security__module_cleanup_single.isra.0 (security.c:439)
==9570==
==9570== 19 bytes in 1 blocks are still reachable in loss record 3 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x1201A8: packet__read_binary (packet_datatypes.c:110)
==9570== by 0x12026E: packet__read_string (packet_datatypes.c:128)
==9570== by 0x1195B2: handle__connect (handle_connect.c:617)
==9570== by 0x120B19: packet__read (packet_mosq.c:515)
==9570== by 0x11CF04: loop_handle_reads_writes (mux_epoll.c:298)
==9570== by 0x11CF04: mux_epoll__handle (mux_epoll.c:210)
==9570== by 0x11C371: mosquitto_main_loop (loop.c:177)
==9570== by 0x10E088: main (mosquitto.c:565)
==9570==
==9570== 33 bytes in 1 blocks are still reachable in loss record 4 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x1201A8: packet__read_binary (packet_datatypes.c:110)
==9570== by 0x11961D: handle__connect (handle_connect.c:641)
==9570== by 0x120B19: packet__read (packet_mosq.c:515)
==9570== by 0x11CF04: loop_handle_reads_writes (mux_epoll.c:298)
==9570== by 0x11CF04: mux_epoll__handle (mux_epoll.c:210)
==9570== by 0x11C371: mosquitto_main_loop (loop.c:177)
==9570== by 0x10E088: main (mosquitto.c:565)
==9570== 64 bytes in 2 blocks are still reachable in loss record 5 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x40141BD: _dl_close_worker (dl-close.c:396)
==9570== by 0x40141BD: _dl_close_worker (dl-close.c:111)
==9570== by 0x401486D: _dl_close (dl-close.c:842)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4E91BBE: _dl_catch_error (dl-error-skeleton.c:215)
==9570== by 0x484F974: _dlerror_run (dlerror.c:163)
==9570== by 0x484F363: dlclose (dlclose.c:46)
==9570== by 0x12994D: security__module_cleanup_single.isra.0 (security.c:439)
==9570== by 0x12A4B6: mosquitto_security_module_cleanup (security.c:452)
==9570== by 0x10E171: main (mosquitto.c:607)
==9570==
==9570== 73 bytes in 2 blocks are still reachable in loss record 6 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x401AAD9: strdup (strdup.c:42)
==9570== by 0x4016066: _dl_load_cache_lookup (dl-cache.c:317)
==9570== by 0x4008D0A: _dl_map_object (dl-load.c:2332)
==9570== by 0x400D291: openaux (dl-deps.c:64)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x400D605: _dl_map_object_deps (dl-deps.c:248)
==9570== by 0x401304F: dl_open_worker (dl-open.c:271)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4012BB9: _dl_open (dl-open.c:599)
==9570== by 0x484F255: dlopen_doit (dlopen.c:66)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570==
==9570== 73 bytes in 2 blocks are still reachable in loss record 7 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x400B58F: _dl_new_object (dl-object.c:163)
==9570== by 0x4005E47: _dl_map_object_from_fd (dl-load.c:1001)
==9570== by 0x4008A8C: _dl_map_object (dl-load.c:2466)
==9570== by 0x400D291: openaux (dl-deps.c:64)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x400D605: _dl_map_object_deps (dl-deps.c:248)
==9570== by 0x401304F: dl_open_worker (dl-open.c:271)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4012BB9: _dl_open (dl-open.c:599)
==9570== by 0x484F255: dlopen_doit (dlopen.c:66)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570==
==9570== 264 bytes in 1 blocks are still reachable in loss record 8 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x4012877: add_to_global (dl-open.c:104)
==9570== by 0x40134AF: dl_open_worker (dl-open.c:522)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4012BB9: _dl_open (dl-open.c:599)
==9570== by 0x484F255: dlopen_doit (dlopen.c:66)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4E91BBE: _dl_catch_error (dl-error-skeleton.c:215)
==9570== by 0x484F974: _dlerror_run (dlerror.c:163)
==9570== by 0x484F2E5: dlopen@@GLIBC_2.2.5 (dlopen.c:87)
==9570== by 0x12A120: security__module_init_single.isra.1 (security.c:311)
==9570== by 0x10DF8D: main (mosquitto.c:530)
==9570==
==9570== 624 bytes in 1 blocks are still reachable in loss record 9 of 18
==9570== at 0x4837B65: calloc (vg_replace_malloc.c:752)
==9570== by 0x113C0F: context__init (context.c:40)
==9570== by 0x11D1F4: net__socket_accept (net.c:185)
==9570== by 0x11CE61: mux_epoll__handle (mux_epoll.c:215)
==9570== by 0x11C371: mosquitto_main_loop (loop.c:177)
==9570== by 0x10E088: main (mosquitto.c:565)
==9570== 1,174 bytes in 84 blocks are indirectly lost in loss record 10 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x4DE4DB9: strdup (strdup.c:42)
==9570== by 0x114017: context__init (context.c:76)
==9570== by 0x11D1F4: net__socket_accept (net.c:185)
==9570== by 0x11CE61: mux_epoll__handle (mux_epoll.c:215)
==9570== by 0x11C371: mosquitto_main_loop (loop.c:177)
==9570== by 0x10E088: main (mosquitto.c:565)
==9570==
==9570== 1,680 bytes in 2 blocks are still reachable in loss record 11 of 18
==9570== at 0x4837B65: calloc (vg_replace_malloc.c:752)
==9570== by 0x4010A1F: _dl_check_map_versions (dl-version.c:274)
==9570== by 0x4013095: dl_open_worker (dl-open.c:277)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4012BB9: _dl_open (dl-open.c:599)
==9570== by 0x484F255: dlopen_doit (dlopen.c:66)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4E91BBE: _dl_catch_error (dl-error-skeleton.c:215)
==9570== by 0x484F974: _dlerror_run (dlerror.c:163)
==9570== by 0x484F2E5: dlopen@@GLIBC_2.2.5 (dlopen.c:87)
==9570== by 0x12A120: security__module_init_single.isra.1 (security.c:311)
==9570== by 0x10DF8D: main (mosquitto.c:530)
==9570==
==9570== 2,198 bytes in 84 blocks are indirectly lost in loss record 12 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x1201A8: packet__read_binary (packet_datatypes.c:110)
==9570== by 0x12026E: packet__read_string (packet_datatypes.c:128)
==9570== by 0x1195B2: handle__connect (handle_connect.c:617)
==9570== by 0x120B19: packet__read (packet_mosq.c:515)
==9570== by 0x11CF04: loop_handle_reads_writes (mux_epoll.c:298)
==9570== by 0x11CF04: mux_epoll__handle (mux_epoll.c:210)
==9570== by 0x11C371: mosquitto_main_loop (loop.c:177)
==9570== by 0x10E088: main (mosquitto.c:565)
==9570==
==9570== 2,381 bytes in 2 blocks are still reachable in loss record 13 of 18
==9570== at 0x4837B65: calloc (vg_replace_malloc.c:752)
==9570== by 0x400B2AD: _dl_new_object (dl-object.c:73)
==9570== by 0x4005E47: _dl_map_object_from_fd (dl-load.c:1001)
==9570== by 0x4008A8C: _dl_map_object (dl-load.c:2466)
==9570== by 0x400D291: openaux (dl-deps.c:64)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x400D605: _dl_map_object_deps (dl-deps.c:248)
==9570== by 0x401304F: dl_open_worker (dl-open.c:271)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4012BB9: _dl_open (dl-open.c:599)
==9570== by 0x484F255: dlopen_doit (dlopen.c:66)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570==
==9570== 2,772 bytes in 84 blocks are indirectly lost in loss record 14 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x1201A8: packet__read_binary (packet_datatypes.c:110)
==9570== by 0x11961D: handle__connect (handle_connect.c:641)
==9570== by 0x120B19: packet__read (packet_mosq.c:515)
==9570== by 0x11CF04: loop_handle_reads_writes (mux_epoll.c:298)
==9570== by 0x11CF04: mux_epoll__handle (mux_epoll.c:210)
==9570== by 0x11C371: mosquitto_main_loop (loop.c:177)
==9570== by 0x10E088: main (mosquitto.c:565)
==9570== 4,064 bytes in 1 blocks are still reachable in loss record 15 of 18
==9570== at 0x4837B65: calloc (vg_replace_malloc.c:752)
==9570== by 0x400A096: do_lookup_unique (dl-lookup.c:253)
==9570== by 0x400A096: do_lookup_x (dl-lookup.c:528)
==9570== by 0x400A38E: _dl_lookup_symbol_x (dl-lookup.c:814)
==9570== by 0x400BD5D: elf_machine_rela (dl-machine.h:308)
==9570== by 0x400BD5D: elf_dynamic_do_Rela (do-rel.h:137)
==9570== by 0x400BD5D: _dl_relocate_object (dl-reloc.c:258)
==9570== by 0x401319D: dl_open_worker (dl-open.c:377)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4012BB9: _dl_open (dl-open.c:599)
==9570== by 0x484F255: dlopen_doit (dlopen.c:66)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4E91BBE: _dl_catch_error (dl-error-skeleton.c:215)
==9570== by 0x484F974: _dlerror_run (dlerror.c:163)
==9570== by 0x484F2E5: dlopen@@GLIBC_2.2.5 (dlopen.c:87)
==9570==
==9570== 21,216 bytes in 34 blocks are indirectly lost in loss record 16 of 18
==9570== at 0x4837B65: calloc (vg_replace_malloc.c:752)
==9570== by 0x113C0F: context__init (context.c:40)
==9570== by 0x11D1F4: net__socket_accept (net.c:185)
==9570== by 0x11CE61: mux_epoll__handle (mux_epoll.c:215)
==9570== by 0x11C371: mosquitto_main_loop (loop.c:177)
==9570== by 0x10E088: main (mosquitto.c:565)
==9570==
==9570== 58,560 (31,200 direct, 27,360 indirect) bytes in 50 blocks are definitely lost in loss record 17 of 18
==9570== at 0x4837B65: calloc (vg_replace_malloc.c:752)
==9570== by 0x113C0F: context__init (context.c:40)
==9570== by 0x11D1F4: net__socket_accept (net.c:185)
==9570== by 0x11CE61: mux_epoll__handle (mux_epoll.c:215)
==9570== by 0x11C371: mosquitto_main_loop (loop.c:177)
==9570== by 0x10E088: main (mosquitto.c:565)
==9570==
==9570== 72,704 bytes in 1 blocks are still reachable in loss record 18 of 18
==9570== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9570== by 0x5432455: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==9570== by 0x400F379: call_init.part.0 (dl-init.c:72)
==9570== by 0x400F475: call_init (dl-init.c:118)
==9570== by 0x400F475: _dl_init (dl-init.c:119)
==9570== by 0x40132D2: dl_open_worker (dl-open.c:517)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4012BB9: _dl_open (dl-open.c:599)
==9570== by 0x484F255: dlopen_doit (dlopen.c:66)
==9570== by 0x4E91B2E: _dl_catch_exception (dl-error-skeleton.c:196)
==9570== by 0x4E91BBE: _dl_catch_error (dl-error-skeleton.c:215)
==9570== by 0x484F974: _dlerror_run (dlerror.c:163)
==9570== by 0x484F2E5: dlopen@@GLIBC_2.2.5 (dlopen.c:87)
==9570==
==9570== LEAK SUMMARY:
==9570== definitely lost: 31,200 bytes in 50 blocks
==9570== indirectly lost: 27,360 bytes in 286 blocks
==9570== possibly lost: 0 bytes in 0 blocks
==9570== still reachable: 82,010 bytes in 18 blocks
==9570== suppressed: 0 bytes in 0 blocks
==9570==
==9570== For counts of detected and suppressed errors, rerun with: -v
==9570== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==9570== could not unlink /tmp/vgdb-pipe-from-vgdb-to-9570-by-pzygmunt-on-???
==9570== could not unlink /tmp/vgdb-pipe-to-vgdb-from-9570-by-pzygmunt-on-???
==9570== could not unlink /tmp/vgdb-pipe-shared-mem-vgdb-9570-by-pzygmunt-on-???