Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partial node sensor readings in a cluster #65

Open
jeroenzwart opened this issue Jul 4, 2024 · 10 comments
Open

Partial node sensor readings in a cluster #65

jeroenzwart opened this issue Jul 4, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@jeroenzwart
Copy link

jeroenzwart commented Jul 4, 2024

Describe the bug
I have Proxmox on 2 nodes installed and made a cluster. After installing the great pve-mod-gui-sensors.sh script the sensors information is visible of a node. This is only visible when you look at the node where you currently log into. When selecting another node, the sensors information is not visible.

After reinstalling on both nodes almost the same problem occurs. Log in to node 1 and looking to the sensors information of node 2 will keep try loading. Then the other way around,log in to node 2 and looking to the sensors information of node 1 it will say 'N/A' at all sensors.

To Reproduce
Steps to reproduce the behavior:

  1. Install the script on multiple nodes (for example node 1 and node 2)
  2. Log in Proxmox GUI of a node 1
  3. Select node 1 and the sensors information is available in de node 1 summary
  4. Select node 2 and the sensors information is NOT available in de node summary and keeps loading.
  5. Log in Proxmox GUI of a node 2
  6. Select node 2 and the sensors information is available in de node 2 summary
  7. Select node 1 and the sensors information is NOT available in de node summary and keeps loading.

Expected behavior
Temperature information is available in the GUI of multiple nodes when logged into 1 node.

Additional context
Nodes are updated to Proxmox 8.2.4.

@jeroenzwart jeroenzwart added the bug Something isn't working label Jul 4, 2024
@Meliox
Copy link
Owner

Meliox commented Jul 5, 2024

Thanks for the nice report.

Yes, I don't believe it is working in a cluster setup right now. I am unsure why....

@eremem: Looking at the native function 'CPU usage' it seems like the whole renderer is entirely in the perl back-end and not in the js front-end. This is a slighly different approach that what we have (more mixed). Do you think that it could be the reason?
Could that in any way also be related to the increase in I/O delay seen in #63...

@eremem
Copy link
Collaborator

eremem commented Jul 5, 2024

@Meliox Unfortunately I have no setup to check this in action. Is there any difference in how the standard information for each node is obtained when in a cluster? If so, is this only the standard pieces of information that are available or also the sensor readings injected by us? Is the format the data is provided in different than for a single node so that our JS code can't access it anymore?
If the rendering code is provided by the back-end, it would be a reason why our code doesn't work - at least not for the other nodes. But still, the original description says that on the node one is logged in, the additional information are available, so the code from the back-end seems to be provided for the other nodes only (?)

@eremem
Copy link
Collaborator

eremem commented Jul 6, 2024

@Meliox I tried to find the function you mentioned in /usr/share/perl5/PVE but got nothing relevant. Could you please point me the code you found?

@Meliox
Copy link
Owner

Meliox commented Jul 6, 2024

@Meliox I tried to find the function you mentioned in /usr/share/perl5/PVE but got nothing relevant. Could you please point me the code you found?

Yes.

Calls are as follows:
/usr/share/pve-manager/js/pvemanagerlib.js line 45118 renderer: Proxmox.Utils.render_node_cpu_usage,

to

/usr/share/perl5/PVE/API2/Nodes.pm line 463 $res->{cpuinfo} = PVE::ProcFSTools::read_cpuinfo();

to

/usr/share/perl5/PVE/ProcFSTools.pm line 23 - sub read_cpuinfo {

@eremem
Copy link
Collaborator

eremem commented Jul 6, 2024

Could that in any way also be related to the increase in I/O delay seen in #63...

@Meliox Is #63 also a cluster setup? I'd assume there is rather on the OS level some problem with retrieving the sensor readings by the sensors command. This had to be ruled out before any further actions. On my system the results are shown almost instantaneously but on the affected system if there would a visible delay for sensors -j (when calling from the CLI) before or during showing the readings it would suggest there is some performance issue there. One should also watch the system I/O parameters in the second console at the same time to confirm the corelation.

@Meliox
Copy link
Owner

Meliox commented Jul 6, 2024

@eremem: It is not. I am not seeing any I/O delay either. I think that is a sound next step for investigation.

@eremem
Copy link
Collaborator

eremem commented Jul 6, 2024

Calls are as follows: /usr/share/pve-manager/js/pvemanagerlib.js line 45118 renderer: Proxmox.Utils.render_node_cpu_usage,
to
/usr/share/perl5/PVE/API2/Nodes.pm line 463 $res->{cpuinfo} = PVE::ProcFSTools::read_cpuinfo();
to
/usr/share/perl5/PVE/ProcFSTools.pm line 23 - sub read_cpuinfo {

@Meliox How about moving our injected code from /usr/share/perl5/PVE/API2/Nodes.pm for res->{sensorsOutput} from line 489 on to a new function, e.g. read_sensors in /usr/share/perl5/PVE/ProcFSTools.pm:

sub read_sensors {
    my $output = `sensors -j`;
    # sanitize JSON output
    $output =~ s/ERROR:.+\s(\w+):\s(.+)/\"$1\": 0.000,/g;
    $output =~ s/ERROR:.+\s(\w+)!/\"$1\": 0.000,/g;
    $output =~ s/,(.*[.\n]*.+})/$1/g;

    return $output;
}

and assigning its output to $res->{sensorsOutput} in /usr/share/perl5/PVE/API2/Nodes.pm:
$res->{sensorsOutput} = PVE::ProcFSTools::read_sensors()

This seems to still work in my single node configuration but before we start adjusting the installer script, we have to verify if this change brings anything at all in a cluster setup. Do you have access to some cluster setup? I could upload the modified files for quicker testing in the current PVE version...

It would also be interesting to verify if this new approach could also be applied to $res->{systemInfo}:
/usr/share/perl5/PVE/ProcFSTools.pm:

sub read_systeminfo {
    return "Manufacturer: Foobar Inc. | Product Name: Baz Server | Serial Number: 123456";
}

/usr/share/perl5/PVE/API2/Nodes.pm:
$res->{systemInfo} = PVE::ProcFSTools::read_systeminfo();

I'm curious if our approach would even work, since Proxmox has to provide the data from other nodes to the on the "observer" is logged onto but let's play naive and try the easiest obvious solution at first :).

@Meliox
Copy link
Owner

Meliox commented Jul 6, 2024

Yes. Sounds like a good approach to test it. Unfortunately I do not have access to one either, so we need help from @jeroenzwart. As you say it would be great to have the approach confirmed before the large code change.
I was actually thinking whether we just should make our own pm file to avoid all the messy sed editing in existing files. What do you think about that?
If yes, then we should aim for moving everything towards that to make code simple.

@eremem
Copy link
Collaborator

eremem commented Jul 6, 2024

If I'm not mistaken, most of that sed commands take care of the front-end JS code anyway, which, if moved to a separate file, has to be included at the run-time. I personally know no easy/elegant way how to achieve this. If you see some solution, let's discuss it in a separate PR.
For the back-end part of the code, we could do it (it looks like at least, since I barely know Perl:D) but the affected code is mostly (only?) what I mentioned above, so you tell me if it's worth the effort already at this point;). I would suggest to move this thread to a new PR, because this discussion slowly starts to obscure the actual issue:D

@jeroenzwart
Copy link
Author

Yes. Sounds like a good approach to test it. Unfortunately I do not have access to one either, so we need help from @jeroenzwart. As you say it would be great to have the approach confirmed before the large code change. I was actually thinking whether we just should make our own pm file to avoid all the messy sed editing in existing files. What do you think about that? If yes, then we should aim for moving everything towards that to make code simple.

I'm open to help, but in theory is possible to create 2 nodes of Proxmox in Proxmox itself? And create a cluster of 2 virtual nodes, right? I'm not sure what is easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants