204 No Content transmitted #5

Open
henkela opened this issue Apr 2, 2019 · 7 comments
Labels
bug Something isn't working

Comments


henkela commented Apr 2, 2019

Hi,
I finally managed to set up both the web part and the daemon part, but now I see that no content is transmitted.
Daemon
HTTP Request: '/v2/topologies/ib-test' (j/p/total)30/291/322 ms
gateway_1 | 172.18.0.1 - - [02/Apr/2019:12:20:52 +0000] "PUT /api/v2/topologies/ib-test HTTP/1.1" 204 25 "-" "-" "-"

and after that also
HTTP Request: '/v2/metrics/ib-test' (j/p/total)12/159/172 ms

    • [02/Apr/2019:13:18:48 +0000] "PUT /api/v2/metrics/ib-test HTTP/1.1" 204 25 "-" "-" "-"

ibnetdiscover, iblinkinfo, etc. are working. The daemon is running as root.
Any hints welcome.
Best,
Andreas

@carstenpatzke
Member

Hey Andreas,
thanks for using the InfiniBand-Radar.

The API server will not send any response payload when the request was successful.
If there were an error, the server would respond with an HTTP 500 status code.

... so in your case everything is fine.
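
As a rough illustration of the rule above, a client could interpret the status codes along these lines. This is only a sketch: `ApiResult`, `classify_status`, and `request_succeeded` are hypothetical names and are not part of the InfiniBand-Radar daemon.

```cpp
// Hypothetical helper, not taken from the daemon's ApiClient.cpp.
// It maps an HTTP status code to a coarse result category.
enum class ApiResult { SuccessNoBody, Success, ClientError, ServerError, Other };

inline ApiResult classify_status(int status) {
    if (status == 204) return ApiResult::SuccessNoBody;  // success, empty payload by design
    if (status >= 200 && status < 300) return ApiResult::Success;
    if (status >= 400 && status < 500) return ApiResult::ClientError;
    if (status >= 500 && status < 600) return ApiResult::ServerError;
    return ApiResult::Other;
}

// A PUT that returns 204 counts as successful even though no body comes back.
inline bool request_succeeded(int status) {
    return status >= 200 && status < 300;
}
```

With this reading, the `204` entries in the gateway log indicate accepted requests, not missing data.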

@henkela
Author

henkela commented Apr 5, 2019

Hi Carsten,
I'm not sure because the web-app doesn't show anything - I mean data. The topology is not shown and no metrics are reported.
I enabled verbose output for curl in ApiClient.cpp. It seems that the client either cannot retrieve information from InfiniBand or receives only null.

carstenpatzke added the bug label Apr 5, 2019
@carstenpatzke
Member

Oh ok, that's interesting...
Can you add some debug output in InfiniBandRadar.cpp / update_fabric_topology@Line 113?
Like std::cout << "Processing node: " << node->nodedesc << std::endl;

Maybe you are right, and the tool cannot detect any Topology :/
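
A self-contained sketch of that kind of debug output might look like the following. `NodeInfo` and `dump_nodes` are hypothetical stand-ins; the real loop in InfiniBandRadar.cpp works on the daemon's own node type, of which only the `nodedesc` field is assumed here.

```cpp
#include <cstddef>
#include <iostream>
#include <ostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the node type the daemon iterates over;
// only the description field matters for this debug output.
struct NodeInfo {
    std::string nodedesc;
};

// Prints one line per discovered node plus a final count, and returns the count,
// so an empty topology is visible as "Processed 0 node(s)".
std::size_t dump_nodes(const std::vector<NodeInfo>& nodes, std::ostream& out) {
    for (const NodeInfo& node : nodes) {
        out << "Processing node: " << node.nodedesc << '\n';
    }
    out << "Processed " << nodes.size() << " node(s)" << std::endl;
    return nodes.size();
}
```

Printing the total makes it easy to tell whether discovery found nothing at all or whether the data is lost later on the way to the web server.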

@henkela
Author

henkela commented May 8, 2019

I finally added that line and saw a lot of output.
However, the first line after starting infiniband_radar_daemon was

src/query_smp.c:197; umad (DR path slid 0; dlid 0; 0,1,1,16,24 Attr 0x11:0) bad status 110; Connection timed out

followed by a long list of

processing node: mlx4_0

and

processing node: Infiniscale-IV Mellanox Technologies

After that there are the following three lines before everything starts over.

ibwarn: [11129] _do_madrpc: recv failed: Connection timed out
ibwarn: [11129] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0)
Sending initial topology

In between there are also the topology and metrics requests, like
HTTP Request: '/v2/topologies/ib-test' (j/p/total)21/144/165 ms
Send topology: 4821ms
HTTP Request: '/v2/metrics/ib-test' (j/p/total)18/143/161 ms
Send port stats: 653ms

Unfortunately, the web interface only shows the button for Fabric 1, and if I click on it, it reports Hosts: 0

I had a look at the logs of the container infiniband-radar-web_api
which showed

TypeError: Cannot read property 'topologyRoot' of null
at TopologiesController.<anonymous> (/home/node/server/src/api/v2/controllers/TopologiesController.ts:44:89)
at step (/home/node/server/src/api/v2/controllers/TopologiesController.ts:57:23)
at Object.next (/home/node/server/src/api/v2/controllers/TopologiesController.ts:38:53)
at fulfilled (/home/node/server/src/api/v2/controllers/TopologiesController.ts:29:58)
at process._tickCallback (internal/process/next_tick.js:68:7)

The logs of the influxdb container look like

[httpd] 172.18.0.6 - root [08/May/2019:11:52:15 +0000] "POST /write?db=infiniband_radar&p=%5BREDACTED%5D&precision=n&rp=&u=root HTTP/1.1" 204 0 "-" "-" b941e705-7187-11e9-8b06-000000000000 89457
[httpd] 172.18.0.6 - root [08/May/2019:11:52:15 +0000] "POST /write?db=infiniband_radar&p=%5BREDACTED%5D&precision=n&rp=&u=root HTTP/1.1" 204 0 "-" "-" b950d7fc-7187-11e9-8b07-000000000000 7678

Is any other debug output available?

@carstenpatzke
Member

carstenpatzke commented May 15, 2019

Thanks for all your effort and support.

The current errors (TypeError: Cannot read property 'topologyRoot' of null and the influx error)
are all caused by a nonexistent topology root, which should be created when the daemon starts.

My guess would be that something between the daemon API request and the database fails.
You already showed me that the nodes are detected and that at least something is sent to the right address.

HTTP Request: '/v2/topologies/ib-test' (j/p/total)21/144/165 ms
Send topology: 4821ms

The web-server should write the topology inside the database... so if you want, you can try to open the MongoDB file in the data directory and check if there is a TopologySnapshot (DB: infiniband_radar) that looks like this.

When you restart the web-server there should also be a warning
'ib-test' has never provided a topology! Is the daemon running?
when the server cannot find any stored topology.
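
To make the failure mode concrete, here is a minimal C++17 sketch of a lookup that handles the "no snapshot stored yet" case explicitly and does not dereference a missing value. It is not the web server's code (that is TypeScript); `TopologySnapshot`, `SnapshotStore`, `latest_snapshot`, and `describe_topology` are hypothetical, and treating `topologyRoot` as a plain string is an assumption made only for this example.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical, simplified snapshot; the real stored document has more fields.
struct TopologySnapshot {
    std::string topologyRoot;  // assumed to be a string for illustration only
};

// Hypothetical in-memory store keyed by fabric id (e.g. "ib-test").
using SnapshotStore = std::map<std::string, TopologySnapshot>;

// Returns the stored snapshot for a fabric, or std::nullopt if none exists.
std::optional<TopologySnapshot> latest_snapshot(const SnapshotStore& store,
                                                const std::string& fabric_id) {
    const auto it = store.find(fabric_id);
    if (it == store.end()) return std::nullopt;
    return it->second;
}

// Checks for the missing snapshot before touching topologyRoot.
std::string describe_topology(const SnapshotStore& store, const std::string& fabric_id) {
    const auto snapshot = latest_snapshot(store, fabric_id);
    if (!snapshot) {
        return "'" + fabric_id + "' has never provided a topology! Is the daemon running?";
    }
    return "root: " + snapshot->topologyRoot;
}
```

An unguarded access to the root of a missing snapshot corresponds to the `Cannot read property 'topologyRoot' of null` error in the API log.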

(PS: I've deleted the duplicated comment)

@carstenpatzke
Member

Did you manage to solve this issue?

@kcgthb
Contributor

kcgthb commented Apr 22, 2020

Hi Carsten,
I'm seeing exactly the same problem: nothing is displayed in the web interface.
When it first starts, the web server logs this:

Log level is: [Debug]
(node:16) ExperimentalWarning: The fs.promises API is experimental
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][MetricDatabase] Created database 'infiniband_radar'
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][MetricDatabase] Creating retention policy 'rp_14d' for database 'infiniband_radar'
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][MetricDatabase] Created retention policy 'rp_14d' for database 'infiniband_radar'
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][MetricDatabase] [edr] Updating global metric
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][TopologyDatabase] Setup 'mongodb://mongodb:27017/infiniband_radar'
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][TopologyDatabase] Update default topologies cache
[Wed, 22 Apr 2020 17:42:17 GMT][WARN][TopologyDatabase] No default timestamps are available! Starting the server for the first time?
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][TopologyDatabase] Fetch last snapshot for 'edr'
[Wed, 22 Apr 2020 17:42:17 GMT][WARN][TopologyDatabase] 'edr' has never provided a topology! Is the daemon running?
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][UserDatabase] Setup 'mongodb://mongodb:27017/infiniband_radar'
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][Server] Database setup complete
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][Server] Server startup complete
[Wed, 22 Apr 2020 17:42:17 GMT][INFO][Server] API Server is listening on http://0.0.0.0:4201/
[Wed, 22 Apr 2020 17:42:32 GMT][INFO][TopologiesController] [edr] Got first topology version
[Wed, 22 Apr 2020 17:42:32 GMT][Debug][ApiServer] Took 71 ms to process (PUT) '/api/v2/topologies/edr' StatusCode: 204
[Wed, 22 Apr 2020 17:42:32 GMT][INFO][TopologyOptimizerService] [edr] Start optimization
[Wed, 22 Apr 2020 17:42:38 GMT][Debug][ApiServer] Took 74 ms to process (PUT) '/api/v2/metrics/edr' StatusCode: 204
[Wed, 22 Apr 2020 17:42:43 GMT][Debug][ApiServer] Took 59 ms to process (PUT) '/api/v2/metrics/edr' StatusCode: 204
[Wed, 22 Apr 2020 17:42:48 GMT][Debug][ApiServer] Took 30 ms to process (PUT) '/api/v2/metrics/edr' StatusCode: 204
[Wed, 22 Apr 2020 17:42:53 GMT][Debug][ApiServer] Took 37 ms to process (PUT) '/api/v2/metrics/edr' StatusCode: 204
[Wed, 22 Apr 2020 17:42:58 GMT][Debug][ApiServer] Took 34 ms to process (PUT) '/api/v2/metrics/edr' StatusCode: 204
[Wed, 22 Apr 2020 17:43:03 GMT][Debug][ApiServer] Took 32 ms to process (PUT) '/api/v2/metrics/edr' StatusCode: 204
[Wed, 22 Apr 2020 17:43:08 GMT][Debug][ApiServer] Took 37 ms to process (PUT) '/api/v2/metrics/edr' StatusCode: 204
[...]

One clue may be that when accessing Grafana, it shows the following warning:

Templating init failed
InfluxDB Error: error parsing query: found \/, expected identifier, string, number, bool at line 1, char 105

Maybe there's an issue with the parsing of some strings?


3 participants