Partial node sensor readings in a cluster #65

jeroenzwart · 2024-07-04T19:56:30Z

Describe the bug
I have Proxmox on 2 nodes installed and made a cluster. After installing the great pve-mod-gui-sensors.sh script the sensors information is visible of a node. This is only visible when you look at the node where you currently log into. When selecting another node, the sensors information is not visible.

After reinstalling on both nodes almost the same problem occurs. Log in to node 1 and looking to the sensors information of node 2 will keep try loading. Then the other way around,log in to node 2 and looking to the sensors information of node 1 it will say 'N/A' at all sensors.

To Reproduce
Steps to reproduce the behavior:

Install the script on multiple nodes (for example node 1 and node 2)
Log in Proxmox GUI of a node 1
Select node 1 and the sensors information is available in de node 1 summary
Select node 2 and the sensors information is NOT available in de node summary and keeps loading.
Log in Proxmox GUI of a node 2
Select node 2 and the sensors information is available in de node 2 summary
Select node 1 and the sensors information is NOT available in de node summary and keeps loading.

Expected behavior
Temperature information is available in the GUI of multiple nodes when logged into 1 node.

Additional context
Nodes are updated to Proxmox 8.2.4.

The text was updated successfully, but these errors were encountered:

Meliox · 2024-07-05T19:28:06Z

Thanks for the nice report.

Yes, I don't believe it is working in a cluster setup right now. I am unsure why....

@eremem: Looking at the native function 'CPU usage' it seems like the whole renderer is entirely in the perl back-end and not in the js front-end. This is a slighly different approach that what we have (more mixed). Do you think that it could be the reason?
Could that in any way also be related to the increase in I/O delay seen in #63...

eremem · 2024-07-05T20:03:34Z

@Meliox Unfortunately I have no setup to check this in action. Is there any difference in how the standard information for each node is obtained when in a cluster? If so, is this only the standard pieces of information that are available or also the sensor readings injected by us? Is the format the data is provided in different than for a single node so that our JS code can't access it anymore?
If the rendering code is provided by the back-end, it would be a reason why our code doesn't work - at least not for the other nodes. But still, the original description says that on the node one is logged in, the additional information are available, so the code from the back-end seems to be provided for the other nodes only (?)

eremem · 2024-07-06T08:19:08Z

@Meliox I tried to find the function you mentioned in /usr/share/perl5/PVE but got nothing relevant. Could you please point me the code you found?

Meliox · 2024-07-06T08:25:12Z

@Meliox I tried to find the function you mentioned in /usr/share/perl5/PVE but got nothing relevant. Could you please point me the code you found?

Yes.

Calls are as follows:
/usr/share/pve-manager/js/pvemanagerlib.js line 45118 renderer: Proxmox.Utils.render_node_cpu_usage,

to

/usr/share/perl5/PVE/API2/Nodes.pm line 463 $res->{cpuinfo} = PVE::ProcFSTools::read_cpuinfo();

to

/usr/share/perl5/PVE/ProcFSTools.pm line 23 - sub read_cpuinfo {

eremem · 2024-07-06T08:35:19Z

Could that in any way also be related to the increase in I/O delay seen in #63...

@Meliox Is #63 also a cluster setup? I'd assume there is rather on the OS level some problem with retrieving the sensor readings by the sensors command. This had to be ruled out before any further actions. On my system the results are shown almost instantaneously but on the affected system if there would a visible delay for sensors -j (when calling from the CLI) before or during showing the readings it would suggest there is some performance issue there. One should also watch the system I/O parameters in the second console at the same time to confirm the corelation.

Meliox · 2024-07-06T08:38:25Z

@eremem: It is not. I am not seeing any I/O delay either. I think that is a sound next step for investigation.

eremem · 2024-07-06T09:13:00Z

Calls are as follows: /usr/share/pve-manager/js/pvemanagerlib.js line 45118 renderer: Proxmox.Utils.render_node_cpu_usage,
to
/usr/share/perl5/PVE/API2/Nodes.pm line 463 $res->{cpuinfo} = PVE::ProcFSTools::read_cpuinfo();
to
/usr/share/perl5/PVE/ProcFSTools.pm line 23 - sub read_cpuinfo {

@Meliox How about moving our injected code from /usr/share/perl5/PVE/API2/Nodes.pm for res->{sensorsOutput} from line 489 on to a new function, e.g. read_sensors in /usr/share/perl5/PVE/ProcFSTools.pm:

sub read_sensors {
    my $output = `sensors -j`;
    # sanitize JSON output
    $output =~ s/ERROR:.+\s(\w+):\s(.+)/\"$1\": 0.000,/g;
    $output =~ s/ERROR:.+\s(\w+)!/\"$1\": 0.000,/g;
    $output =~ s/,(.*[.\n]*.+})/$1/g;

    return $output;
}

and assigning its output to $res->{sensorsOutput} in /usr/share/perl5/PVE/API2/Nodes.pm:
$res->{sensorsOutput} = PVE::ProcFSTools::read_sensors()

This seems to still work in my single node configuration but before we start adjusting the installer script, we have to verify if this change brings anything at all in a cluster setup. Do you have access to some cluster setup? I could upload the modified files for quicker testing in the current PVE version...

It would also be interesting to verify if this new approach could also be applied to $res->{systemInfo}:
/usr/share/perl5/PVE/ProcFSTools.pm:

sub read_systeminfo {
    return "Manufacturer: Foobar Inc. | Product Name: Baz Server | Serial Number: 123456";
}

/usr/share/perl5/PVE/API2/Nodes.pm:
$res->{systemInfo} = PVE::ProcFSTools::read_systeminfo();

I'm curious if our approach would even work, since Proxmox has to provide the data from other nodes to the on the "observer" is logged onto but let's play naive and try the easiest obvious solution at first :).

Meliox · 2024-07-06T09:36:21Z

Yes. Sounds like a good approach to test it. Unfortunately I do not have access to one either, so we need help from @jeroenzwart. As you say it would be great to have the approach confirmed before the large code change.
I was actually thinking whether we just should make our own pm file to avoid all the messy sed editing in existing files. What do you think about that?
If yes, then we should aim for moving everything towards that to make code simple.

eremem · 2024-07-06T13:56:47Z

If I'm not mistaken, most of that sed commands take care of the front-end JS code anyway, which, if moved to a separate file, has to be included at the run-time. I personally know no easy/elegant way how to achieve this. If you see some solution, let's discuss it in a separate PR.
For the back-end part of the code, we could do it (it looks like at least, since I barely know Perl:D) but the affected code is mostly (only?) what I mentioned above, so you tell me if it's worth the effort already at this point;). I would suggest to move this thread to a new PR, because this discussion slowly starts to obscure the actual issue:D

jeroenzwart · 2024-07-07T12:46:08Z

Yes. Sounds like a good approach to test it. Unfortunately I do not have access to one either, so we need help from @jeroenzwart. As you say it would be great to have the approach confirmed before the large code change. I was actually thinking whether we just should make our own pm file to avoid all the messy sed editing in existing files. What do you think about that? If yes, then we should aim for moving everything towards that to make code simple.

I'm open to help, but in theory is possible to create 2 nodes of Proxmox in Proxmox itself? And create a cluster of 2 virtual nodes, right? I'm not sure what is easier.

jeroenzwart added the bug Something isn't working label Jul 4, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Partial node sensor readings in a cluster #65

Partial node sensor readings in a cluster #65

jeroenzwart commented Jul 4, 2024 •

edited

Loading

Meliox commented Jul 5, 2024 •

edited

Loading

eremem commented Jul 5, 2024 •

edited

Loading

eremem commented Jul 6, 2024

Meliox commented Jul 6, 2024

eremem commented Jul 6, 2024

Meliox commented Jul 6, 2024

eremem commented Jul 6, 2024 •

edited

Loading

Meliox commented Jul 6, 2024 •

edited

Loading

eremem commented Jul 6, 2024 •

edited

Loading

jeroenzwart commented Jul 7, 2024

Partial node sensor readings in a cluster #65

Partial node sensor readings in a cluster #65

Comments

jeroenzwart commented Jul 4, 2024 • edited Loading

Meliox commented Jul 5, 2024 • edited Loading

eremem commented Jul 5, 2024 • edited Loading

eremem commented Jul 6, 2024

Meliox commented Jul 6, 2024

eremem commented Jul 6, 2024

Meliox commented Jul 6, 2024

eremem commented Jul 6, 2024 • edited Loading

Meliox commented Jul 6, 2024 • edited Loading

eremem commented Jul 6, 2024 • edited Loading

jeroenzwart commented Jul 7, 2024

jeroenzwart commented Jul 4, 2024 •

edited

Loading

Meliox commented Jul 5, 2024 •

edited

Loading

eremem commented Jul 5, 2024 •

edited

Loading

eremem commented Jul 6, 2024 •

edited

Loading

Meliox commented Jul 6, 2024 •

edited

Loading

eremem commented Jul 6, 2024 •

edited

Loading