Release v1.41.0 · netdata/netdata

Check out the v1.41 release meetup recording or read on to learn more about the new UI and other features in this release.

Netdata Growth
Release Highlights
Acknowledgements
Contributions
Collectors
Documentation
Packaging/Installation
Health
Exporting
Other Notable Changes
Deprecation notice
- Deprecated in this release
Netdata Release Meetup
Support options

True to our schedule, this is another great Netdata release!

Netdata Growth

64 k GitHub Stars ⭐
1.7 M monitored nodes
570+ M docker hub pulls

Give Netdata a ⭐ too, on GitHub!

❤️ Thank you for your love! 🚀 You rock!

Release Highlights

New Agent Dashboard

Netdata Agents and Parents now have a new UI!

New CHARTS 🟢 New SUMMARIES 🟢 MACHINE-LEARNING FIRST 🟢 INFRASTRUCTURE LEVEL DASHBOARDS 🟢 FILTER, SLICE, and DICE any dataset 🟢 ANOMALY ADVISOR 🟢 METRICS CORRELATIONS 🟢 NETDATA FUNCTIONS 🟢 EVENTS FEED 🟢 HEATMAPS 🟢

In the last few months, we have ported and open-sourced all Netdata Cloud APIs to the Netdata Agent, allowing Netdata Parents to drive the same multi-node / infrastructure level dashboards Netdata Cloud provides!

So, as of today, Netdata Agents and Parents present the same UI as Netdata Cloud: exactly the same dashboard, charts, and features!

Single Node Dashboard Changes

Apart from the entirely new look, single-node dashboards now group similar charts together. All disk drives, all network interfaces, and all cgroups (containers and VMs) are each presented as a single set of charts.

This allows Netdata to aggregate a vast number of datasets in a single chart, like the following, where almost 20k containers are now manageable:

To make it easier for you to navigate, filter, slice, and dice the data, the menus above each chart give you easy access to all the data of the chart:

Multi Node Dashboards

When Netdata Agents are configured as Parents (multiple other agents stream metrics to them), they now present multi-node and multi-instance charts. At the top right corner of the dashboard, there is the global nodes filter, from which you can slice the entire dashboard for one or a few of your nodes.

Want to know more?

Get a firsthand walkthrough with Costa Tsaousis, Netdata's Founder, on the rationale for this change and the path Netdata is taking by checking the video from Netdata Office Hours on YouTube.

The old dashboards are still accessible

You can still access all versions of the dashboards, as follows:

https://your.server:19999/
The default dashboard is now a live version of the new UI. The dashboard static files are served by Cloudflare and are automatically updated when we release a new version of the UI, so that your Netdata agent is always up to date.
https://your.server:19999/v2/
A local copy of the latest dashboard, as it was at the time the agent was released. This is distributed with Netdata under the Netdata Cloud UI License v1.0. The local copy is automatically used if for any reason the web browser cannot download the live version of it.
https://your.server:19999/v1/
The previous single-node version of the Netdata Agent dashboard.
https://your.server:19999/v0/
The now ancient, original version of the Netdata Agent dashboard.
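
As a quick illustration of the layout above, the following Python sketch builds the URL for each dashboard version. The `DASHBOARD_PATHS` table and the `dashboard_url` helper are hypothetical names introduced here for illustration; only the paths themselves come from the list above, and `https://your.server:19999` is the same placeholder host.

```python
from urllib.parse import urljoin

# Paths taken from the list above; the keys are illustrative labels only.
DASHBOARD_PATHS = {
    "live": "/",    # live version of the new UI (default)
    "v2": "/v2/",   # local copy of the new UI shipped with the agent
    "v1": "/v1/",   # previous single-node dashboard
    "v0": "/v0/",   # original dashboard
}


def dashboard_url(base, version="live"):
    """Return the dashboard URL for the given agent base URL and version."""
    try:
        path = DASHBOARD_PATHS[version]
    except KeyError:
        raise ValueError(f"unknown dashboard version: {version!r}") from None
    return urljoin(base, path)


if __name__ == "__main__":
    for name in DASHBOARD_PATHS:
        print(name, dashboard_url("https://your.server:19999", name))
```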

Netdata Assistant

Netdata Assistant: Your AI-Powered Troubleshooting Sidekick

The Netdata Assistant is an AI-powered tool that uses large language models and our community's knowledge to guide you during troubleshooting and help you get to the root cause sooner.

The goal of the Netdata Assistant is straightforward: to make your troubleshooting process easier. It's here to save you from the hassle of sifting through tons of information so you can focus on solving the problem at hand.

It will give you the lowdown on the alert, why it's happening, and why you should care. It'll also guide you on how to troubleshoot it and even offer some handy web links for more info if you're interested.

Read more about it on the Netdata blog.

New FreeIPMI collector for monitoring enterprise hardware

Netdata has a new FreeIPMI collector. It collects IPMI sensor data at a much higher rate, and it is more reliable and robust than the previous one.

We have also categorized all sensors based on the component they monitor:

The exact sensor name each metric refers to is now provided as a label:

Netdata Detects FDs Leaking

"FD" stands for "file descriptor". A file descriptor is an integer that the operating system assigns to an open file to track it. This includes regular data files, directories, network sockets, pipes, and other types of I/O streams.

In Linux, everything is treated as a file, which includes hardware devices, directories, and sockets. Each open file is assigned a file descriptor. When a file is closed, its file descriptor is freed up for reuse. However, if an application doesn't close a file when it's done with it, that's called a "file descriptor leak".

File descriptor leaks can cause several problems:

Resource exhaustion: Each process has a limit to the number of file descriptors it can open. If a process continually leaks file descriptors without closing them, it will eventually hit this limit and won't be able to open any more files, which often causes the process to crash.
Unexpected behavior: Open file descriptors hold resources, like network sockets, that might be expected to be available for other uses. If these resources are tied up due to a leak, it can cause unexpected behavior.
Security issues: File descriptors can sometimes be used to gain unauthorized access to data if they're not properly managed.
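
To make the resource-exhaustion case concrete, here is a minimal Python sketch that measures how close a process is to its file descriptor limit. It assumes a Linux system with `/proc` mounted and is not how apps.plugin is implemented; the function names (`fd_usage_percent`, `count_open_fds`, `own_fd_usage_percent`) are hypothetical and exist only for this example.

```python
import os
import resource


def fd_usage_percent(open_fds, soft_limit):
    """Return open descriptors as a percentage of the soft limit."""
    if soft_limit <= 0:
        # Also covers an "unlimited" limit, which Python may report as -1.
        raise ValueError("soft_limit must be a positive number")
    return 100.0 * open_fds / soft_limit


def count_open_fds(pid="self"):
    """Count the entries in /proc/<pid>/fd (Linux only).

    Listing the directory briefly opens one descriptor of its own when
    pid is "self", so the count includes that extra entry.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def own_fd_usage_percent():
    """Percentage of this process's RLIMIT_NOFILE soft limit in use."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return fd_usage_percent(count_open_fds(), soft_limit)


if __name__ == "__main__" and os.path.isdir("/proc/self/fd"):
    before = count_open_fds()
    leaked = [open(os.devnull) for _ in range(10)]  # opened, not yet closed
    after = count_open_fds()
    print(f"descriptors before: {before}, after leaking 10: {after}")
    print(f"usage: {own_fd_usage_percent():.2f}% of the soft limit")
    for handle in leaked:
        handle.close()  # closing frees the descriptors for reuse
```

A leaking process shows the same pattern over time: the count keeps rising while the limit stays fixed, so the percentage climbs toward 100.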

apps.plugin can now track the usage of FDs against the limits set for each application. We have added an fds category in the Applications section of the dashboard. The first chart shows the percentage of FDs used by each application against its limits:

Acknowledgements

We would like to thank the dedicated, talented contributors who make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product.

@k0ste for improving the Prometheus exporting doc.
@carlocab for replacing info macro with a less generic name.
@MYanello for updating the pfSense package installation instructions.

Contributions

Collectors

Improvements

Improve fds monitoring (apps.plugin) (#15437, @ktsaou)
Add application groups file descriptor limit monitoring (apps.plugin) (#15417, @ktsaou)
Re-create sdr cache on start (freeipmi.plugin) (#15361, @ktsaou)
Add sensor state chart, create a per-sensor chart instead of a per-sensor dimension (freeipmi.plugin) (#15327, @ktsaou)
Expose CmdLine in apps function (apps.plugin) (#15275, @ilyam8)
Remove pod_uid and container_id labels in k8s (cgroups.plugin) (#15216, @ilyam8)
Add cluster mode (go.d/elasticsearch) (#1227, @ilyam8)
Add 'fallback_type' config option to match Untyped (go.d/prometheus) (#1225, @ilyam8)

Bug fixes

Fix sensor state updates (freeipmi.plugin) (#15360, @ilyam8)
Fix tc.plugin charts labels (tc.plugin) (#15262, @ilyam8)
Fix collecting hostgroup from stats_mysql_connection_pool (go.d/proxysql) (#1226, @ilyam8)

Other

Add eBPF Functions to enable/disable threads (ebpf.plugin) (#15214, @thiagoftsm)
Hide eBPF functions (ebpf.plugin) (#15404, @thiagoftsm)
Add profile.plugin (#13962, @vkalintiris)

Documentation

Add link for netdata cloud and sign-in cta (#15431, @andrewm4894)
Update Netdata logo in README.md (#15424, @christophidesp)
Fix a typo in health.d/consul.conf (#15419, @Ancairon)
Add reference to CNCF (#15408, @hugovalente-pm)
Fix instructions on how to determine which installation method to use (#15351, @hugovalente-pm)
Update the default Docker installation to provide the full feature set (#15339, @ilyam8)
Fix swapped use of volume/bind mount in Docker readme (#15298, @Ancairon)
Add Streaming and replication doc (#15297, @Ancairon)
Update "health enabled by default" description in stream.conf (#15291, @ilyam8)
Remove extra parenthesis from doc (#15290, @Ancairon)
Merge spaces, war rooms and invite your team to one place (#15289, @hugovalente-pm)
Fix mistype for 'send automatic labels' Prometheus option (#15282, @k0ste)
Small readme improvements (#15270, @andrewm4894)
Update pfsense.md package install instructions (#15250, @MYanello)
Add RocketChat cloud integration docs (#15205, @car12o)

Packaging / Installation

Update v2 dashboard to v6.21.3 (#15448, @ilyam8)
Fix arch detection in static install update (#15396, @ilyam8)
Add missing files to web/gui/Makefile.am. (#15383, @Ferroin)
Build optimizations (#15381, @tkatsoulas)
Update libbpf to v1.2.2 (#15373, @thiagoftsm)
Update go.d.plugin to v0.54.0 (#15312, @ilyam8)
Only try to enable _FORTIFY_SOURCE if the user has not disabled optimizations (#15284, @Ferroin)
Assorted kickstart script improvements (#15243, @Ferroin)
Fix file permissions under directory (#15208, @stelfrag)
Add configuration file for netdata-updater.sh (#15149, @Ferroin)
Add hardening options to CFLAGS by default if they are available (#15087, @Ferroin)
Consistently start the agent as root and rely on it to drop privileges properly (#14890, @Ferroin)
Add support for openSUSE tumbleweed (#14692, @tkatsoulas)

Health

Removing some critical thresholds (#15124, @M4itee)
Fix evaluating expression with nan (#15348, @ilyam8)
Respect overriding nc binary for IRC notifications (#15310, @ilyam8)
Keep health log history in seconds (#15314, @MrZammler)
Fix windows alarms for virtual nodes (#15376, @ilyam8)

Exporting

Hide charts not available to viewers when exporting in the shell format (#15309, @ilyam8)
Fix slow exporting in Prometheus format (#15276, @ilyam8)

Other Notable Changes

Improvements

Enrichment of /api/v2, buildinfo improvements and code cleanup (#15294, @ktsaou)

Bug fixes

Fix unlocked registry access and add hostname to search response (#15426, @ktsaou)
Fix interpreting encoded URLs (#15422, @MrZammler)
Fix compilation on BSD (#15331, @thiagoftsm)
Fix virtual hosts showing up as stale nodes (#15313, @ktsaou)
Fix clean up of charts generated by external plugins (#15307, @stelfrag)
Fix crash when opening Alarms Log tab on the parent instance (#15306, @MrZammler)
Fix infinite loop in webserver (#15287, @ktsaou)

Code organization

Add chart id and name to alert instances and transitions (#15430, @ktsaou)
Use real-time clock for http response headers (#15421, @ktsaou)
Pre release fixes (#15405, @ktsaou)
Add expiration to bearer token response (#15392, @ktsaou)
Fix CodeQL alert (#15384, @stelfrag)
Update http response code descriptions (#15379, @ktsaou)
Suppress H2O compilation warnings (#15378, @stelfrag)
Fix coverity issues (#15375, @stelfrag)
Don't log error on opening .environment (#15371, @ilyam8)
Rename log_access and log_health (#15368, @MrZammler)
Agent alert notifications redirect (#15350, @ktsaou)
Bearer protection - additions (#15349, @ktsaou)
Bearer improvements (#15342, @ktsaou)
Add hostnames and items statistics to alerts_transitions outputs (#15329, @ktsaou)
Use spinlock in host and chart (#15328, @stelfrag)
Fix coverity issue 394862 - Argument cannot be negative (#15324, @stelfrag)
Rename log Macros (debug) (#15322, @thiagoftsm)
Bearer authorization API (#15321, @ktsaou)
Fix not using host prefix in read_cmdline() (#15320, @ilyam8)
Update local-listener to use libnetdata (#15319, @ktsaou)
Avoid memory allocations for alert transitions facets processing (#15318, @ktsaou)
Add summary linking to alert instances (ati) when options=summary,values is requested (#15317, @ktsaou)
Fix alerts transitions sorting (#15315, @ktsaou)
Change info to netdata_log_info in sqlite_db_migration.c (#15303, @MrZammler)
Change query to store host system info values (#15300, @MrZammler)
Change info to netdata_log_info in profile.plugin (#15299, @vkalintiris)
Rename generic error function (#15296, @thiagoftsm)
Optimizations part 3 (#15293, @ktsaou)
Send alert chart labels config key to cloud (#15283, @MrZammler)
Optimizations part 2 (#15280, @ktsaou)
Misc alert fixes (#15274, @MrZammler)
Replace info macro with a less generic name (#15266, @carlocab)
Rewrite /api/v2/alerts (#15257, @ktsaou)
Use gperf for the pluginsd/streaming parser hashtable (#15251, @ktsaou)
URL rewrite at the agent web server to support multiple dashboard versions (#15247, @ktsaou)
Fix coverity 393183 & 393182 (#15234, @MrZammler)
Create index for health log migration (#15233, @stelfrag)
New alerts endpoint (#15232, @stelfrag)
Various /api/v2 improvements (#15227, @ktsaou)
Relax jnfv2 caching (#15224, @ktsaou)
Fix /api/v2/contexts,nodes,nodes_instances,q before match (#15223, @ktsaou)
Add recursive readers support to RW_SPINLOCK (#15217, @ktsaou)
Allow overriding pipename from env (#15215, @vkalintiris)
Memory reductions and optimizations (#15204, @ktsaou)
Agent dashboard reorganization (#15200, @Ferroin)
Add two functions that allow someone to start/stop ML (#15185, @vkalintiris)
Add streaming function and various improvements to /api/v2/nodes (#15168, @ktsaou)
Use a single health log table (#15157, @MrZammler)
Redirect to index.html when a file is not found by web server (#15143, @MrZammler)
Additional CO-RE code (ebpf.plugin) (#15078, @thiagoftsm)

Deprecation notice

There is no definite list yet of items that will be deprecated in the upcoming release (v1.42.0). Feel free to check and elaborate on the upcoming backlog.

Deprecated in this release

In accordance with our previous deprecation notice, the following items have been deprecated in this release:

Component	Type	Will be replaced by
python.d/nvidia_smi	collector	go.d/nvidia_smi
`family` attribute	alert configuration and Health API	chart labels attribute (more details on netdata#15030)

Netdata Release Meetup

Join the Netdata team on the 21st of July at 17:00 UTC for the Netdata Release Meetup.

Together we’ll cover:

Release Highlights.
Acknowledgements.
Q&A with the community.

RSVP now - we look forward to meeting you.

Support options

As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels:

Netdata Learn: Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata.
GitHub Issues: Make use of the Netdata repository to report bugs or open a new feature request.
GitHub Discussions: Join the conversation around the Netdata development process and be a part of it.
Community Forums: Visit the Community Forums and contribute to the collaborative knowledge base.
Discord Server: Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1400 engineers are already using it!
