[httpcheckreceiver] new receiver #14191

Merged
merged 6 commits into from
Oct 18, 2022

@codeboten codeboten commented Sep 19, 2022

The HTTP Check Receiver can be used for synthetic checks against HTTP endpoints. This receiver will make a request to the specified endpoint using the configured method.

Here's an example configuration:

receivers:
  httpcheck:
    endpoint: http://endpoint:80
    method: GET
    collection_interval: 10s
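
To make the behaviour described above concrete, here is a minimal sketch of a single check: issue one request with the configured method, then record the status code and the elapsed time. It is not the receiver's actual scraper code. The names `checkResult`, `runCheck`, and `checkStub` are hypothetical, and a local `httptest` server stands in for the real endpoint.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// checkResult holds what one synthetic check observed.
// The type and its fields are hypothetical, not taken from the receiver.
type checkResult struct {
	StatusCode int           // HTTP status code of the response
	Duration   time.Duration // wall-clock time spent on the request
}

// runCheck issues one request to endpoint using method and reports the
// status code and the elapsed time. A failed request returns an error.
func runCheck(client *http.Client, method, endpoint string) (checkResult, error) {
	req, err := http.NewRequest(method, endpoint, nil)
	if err != nil {
		return checkResult{}, err
	}
	start := time.Now()
	resp, err := client.Do(req)
	elapsed := time.Since(start)
	if err != nil {
		return checkResult{Duration: elapsed}, err
	}
	defer resp.Body.Close()
	return checkResult{StatusCode: resp.StatusCode, Duration: elapsed}, nil
}

// checkStub runs one GET check against a throwaway local server that
// always answers with the given status code.
func checkStub(status int) (checkResult, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
	}))
	defer srv.Close()
	return runCheck(srv.Client(), http.MethodGet, srv.URL)
}

func main() {
	res, err := checkStub(http.StatusOK)
	if err != nil {
		fmt.Println("check failed:", err)
		return
	}
	fmt.Printf("status=%d duration_ms=%d\n", res.StatusCode, res.Duration.Milliseconds())
}
```

In a real collector the check would repeat every `collection_interval`; the sketch runs it once.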

The metrics being emitted:

ResourceMetrics #0
Resource SchemaURL:
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope otelcol/httpcheckreceiver v0.62.0-6-g847c930f95
Metric #0
Descriptor:
     -> Name: httpcheck.duration
     -> Description: Measures the duration of the HTTP check.
     -> Unit: ms
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> http.url: Str(https://lightstep.com)
StartTimestamp: 2022-10-13 18:34:04.43338 +0000 UTC
Timestamp: 2022-10-13 18:34:14.436818 +0000 UTC
Value: 204
Metric #1
Descriptor:
     -> Name: httpcheck.status
     -> Description: 1 if the endpoint resulted in status http.status_code, otherwise 0.
     -> Unit: 1
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> http.method: Str(GET)
     -> http.status_class: Str(3xx)
     -> http.status_code: Int(200)
     -> http.url: Str(https://lightstep.com)
StartTimestamp: 2022-10-13 18:34:04.43338 +0000 UTC
Timestamp: 2022-10-13 18:34:14.436818 +0000 UTC
Value: 0
NumberDataPoints #1
Data point attributes:
     -> http.method: Str(GET)
     -> http.status_class: Str(4xx)
     -> http.status_code: Int(200)
     -> http.url: Str(https://lightstep.com)
StartTimestamp: 2022-10-13 18:34:04.43338 +0000 UTC
Timestamp: 2022-10-13 18:34:14.436818 +0000 UTC
Value: 0
NumberDataPoints #2
Data point attributes:
     -> http.method: Str(GET)
     -> http.status_class: Str(5xx)
     -> http.status_code: Int(200)
     -> http.url: Str(https://lightstep.com)
StartTimestamp: 2022-10-13 18:34:04.43338 +0000 UTC
Timestamp: 2022-10-13 18:34:14.436818 +0000 UTC
Value: 0
NumberDataPoints #3
Data point attributes:
     -> http.method: Str(GET)
     -> http.status_class: Str(1xx)
     -> http.status_code: Int(200)
     -> http.url: Str(https://lightstep.com)
StartTimestamp: 2022-10-13 18:34:04.43338 +0000 UTC
Timestamp: 2022-10-13 18:34:14.436818 +0000 UTC
Value: 0
NumberDataPoints #4
Data point attributes:
     -> http.method: Str(GET)
     -> http.status_class: Str(2xx)
     -> http.status_code: Int(200)
     -> http.url: Str(https://lightstep.com)
StartTimestamp: 2022-10-13 18:34:04.43338 +0000 UTC
Timestamp: 2022-10-13 18:34:14.436818 +0000 UTC
Value: 1
	{"kind": "exporter", "data_type": "metrics", "name": "logging"}

Fixes #10607

@codeboten codeboten marked this pull request as ready for review September 20, 2022 15:40
@codeboten codeboten requested a review from a team as a code owner September 20, 2022 15:40
type: string

metrics:
httpcheck.success:
Member

Is it worth considering any other representations of this metric? One thought is that if this were httpcheck.status where the value is the status code, then you wouldn't have to interpret what success means. Success, failure, etc could be determined on the query side.

Contributor Author

That's correct, the representation could be the status code itself (which is currently an attribute). I don't have a strong preference one way or the other. The success/failure metric is similar to probe_success emitted by https://github.com/prometheus/blackbox_exporter. It does mean that this receiver will need to have a way for users to configure success/failure.
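
As a rough illustration of what "a way for users to configure success/failure" could look like, the sketch below maps a status code to a 0/1 value based on a user-supplied list of acceptable codes. The `successConfig` type and its `SuccessCodes` field are hypothetical; nothing like them appears in this PR.

```go
package main

import "fmt"

// successConfig is a hypothetical user setting: the status codes that
// should count as a successful check.
type successConfig struct {
	SuccessCodes []int
}

// probeSuccess returns 1 when code is listed in cfg and 0 otherwise,
// similar in spirit to a probe_success style metric.
func probeSuccess(cfg successConfig, code int) int64 {
	for _, ok := range cfg.SuccessCodes {
		if code == ok {
			return 1
		}
	}
	return 0
}

func main() {
	cfg := successConfig{SuccessCodes: []int{200, 204}}
	for _, code := range []int{200, 204, 404} {
		fmt.Printf("code=%d success=%d\n", code, probeSuccess(cfg, code))
	}
}
```

The alternative raised in the review avoids this configuration entirely: emit the status code and let the query side decide what success means.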

Contributor

Out of curiosity, do you expect users to check the method attribute? e.g. probe both GET and POST in the same service?

Contributor Author

I wouldn't expect someone to want to hit the same endpoint with different methods. I would expect consumers of the telemetry to want to filter responses by method, though.

Member

@djaglowski djaglowski Oct 3, 2022

I think it would make a lot of sense to keep the status code as an attribute, but also to change the meaning of the value from "success" to "a single observation of this status code".

You would expect subsequent observations to look like this:

{ "time": "t0", "url": "/foo", "method": "GET", "code": 200, "value": 1 }
{ "time": "t1", "url": "/foo", "method": "GET", "code": 200, "value": 1 }
{ "time": "t2", "url": "/foo", "method": "GET", "code": 404, "value": 1 } // outage
{ "time": "t3", "url": "/foo", "method": "GET", "code": 200, "value": 1 }

See my other comment for more on why I think this model is useful for summarization.
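
One way to read the model above: because each observation carries a value of 1, summing values per status code gives a count of how often each code was seen. The sketch below does that with the four sample observations. The `observation` type and the function names are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// observation mirrors one line of the example above.
// The type is hypothetical and exists only for this sketch.
type observation struct {
	Time, URL, Method string
	Code              int
	Value             int64
}

// sampleObservations returns the four observations from the example.
func sampleObservations() []observation {
	return []observation{
		{"t0", "/foo", "GET", 200, 1},
		{"t1", "/foo", "GET", 200, 1},
		{"t2", "/foo", "GET", 404, 1}, // outage
		{"t3", "/foo", "GET", 200, 1},
	}
}

// countByCode sums the observation values per status code.
func countByCode(obs []observation) map[int]int64 {
	counts := make(map[int]int64)
	for _, o := range obs {
		counts[o.Code] += o.Value
	}
	return counts
}

func main() {
	counts := countByCode(sampleObservations())

	// Sort the codes so the output order is stable.
	codes := make([]int, 0, len(counts))
	for code := range counts {
		codes = append(codes, code)
	}
	sort.Ints(codes)

	for _, code := range codes {
		fmt.Printf("code=%d observations=%d\n", code, counts[code])
	}
}
```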

Contributor

See my suggestion to make this a 0-1 metric, similar to (or perhaps specified so as to create) an OpenMetrics StateSet. @dashpole note the connection to #1712 and #2409.

Contributor

I've reviewed your other comment, and I think this would be a great way to represent this sort of information! A unit seems like the best place I can think of for putting that information, and it makes sense in contexts other than Prometheus.

Contributor

Seems like we could take a similar approach for info metrics as well.

Contributor

Sounds good!

@carlosalberto
Contributor

Symbolic approval, this looks good! Also interested to help improve this with a few potential additions ;)

@codeboten
Contributor Author

Thanks to everyone for all the input. I've updated httpcheck.status to produce a data point for each HTTP response class (1xx, 2xx, 3xx, 4xx, 5xx). The resulting data looks something like this:
[Screenshot: Screen Shot 2022-10-13 at 12 23 47 PM]
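
The per-class layout can be sketched roughly as follows: one data point per response class, each carrying the observed status code, with value 1 only for the class the code falls into. This mirrors the sample output in the PR description but is not the receiver's implementation. `statusPoint`, `classOf`, and `statusPoints` are hypothetical names, and the fixed 1xx to 5xx ordering is an assumption of this sketch.

```go
package main

import "fmt"

// statusPoint is a hypothetical stand-in for one httpcheck.status data point.
type statusPoint struct {
	Method      string
	URL         string
	StatusCode  int    // the code that was actually observed
	StatusClass string // the class this data point represents
	Value       int64  // 1 if StatusCode falls in StatusClass, otherwise 0
}

// statusClasses lists the classes that each get a data point.
var statusClasses = []string{"1xx", "2xx", "3xx", "4xx", "5xx"}

// classOf maps a status code to its class, or "" when out of range.
func classOf(code int) string {
	if code < 100 || code > 599 {
		return ""
	}
	return fmt.Sprintf("%dxx", code/100)
}

// statusPoints builds one data point per class for a single observed code.
func statusPoints(method, url string, code int) []statusPoint {
	observed := classOf(code)
	points := make([]statusPoint, 0, len(statusClasses))
	for _, class := range statusClasses {
		var value int64
		if class == observed {
			value = 1
		}
		points = append(points, statusPoint{
			Method:      method,
			URL:         url,
			StatusCode:  code,
			StatusClass: class,
			Value:       value,
		})
	}
	return points
}

func main() {
	for _, p := range statusPoints("GET", "https://lightstep.com", 200) {
		fmt.Printf("http.status_class=%s http.status_code=%d value=%d\n",
			p.StatusClass, p.StatusCode, p.Value)
	}
}
```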

I've also added another metric to allow users to count the number of errors that occurred. Previously, I was using the status code -1, which seemed a bit cryptic.
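
A minimal sketch of the idea, assuming failed requests are counted per error message instead of being folded into a sentinel status code such as -1. The `tally` and `checkError` types are hypothetical, and the PR text does not say how the actual error metric is keyed.

```go
package main

import "fmt"

// checkError is a small error type so the sketch needs no extra imports.
type checkError string

func (e checkError) Error() string { return string(e) }

// tally keeps failed requests apart from observed status codes.
// The type is hypothetical and only illustrates the separation.
type tally struct {
	errors   map[string]int64 // failed requests, keyed by error message
	statuses map[int]int64    // completed requests, keyed by status code
}

func newTally() *tally {
	return &tally{
		errors:   make(map[string]int64),
		statuses: make(map[int]int64),
	}
}

// record counts one check. A non-nil err means no response was received,
// so only the error counter changes and code is ignored.
func (t *tally) record(code int, err error) {
	if err != nil {
		t.errors[err.Error()]++
		return
	}
	t.statuses[code]++
}

func main() {
	tl := newTally()
	tl.record(200, nil)
	tl.record(0, checkError("connection refused"))
	tl.record(0, checkError("connection refused"))
	fmt.Println("errors:", tl.errors)
	fmt.Println("statuses:", tl.statuses)
}
```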

@mwear @djaglowski @jmacd PTAL

@codeboten codeboten force-pushed the codeboten/httpcheck branch 2 times, most recently from 3a9177c to 7ad3976 Compare October 13, 2022 20:58
@codeboten codeboten force-pushed the codeboten/httpcheck branch 2 times, most recently from 7586751 to e46a175 Compare October 14, 2022 15:30
Alex Boten added 5 commits October 14, 2022 11:15
@codeboten codeboten merged commit bf9686b into open-telemetry:main Oct 18, 2022
@codeboten codeboten deleted the codeboten/httpcheck branch October 18, 2022 17:53
shalper2 pushed a commit to shalper2/opentelemetry-collector-contrib that referenced this pull request Dec 6, 2022
@plantfansam plantfansam mentioned this pull request Jul 21, 2023
Development

Successfully merging this pull request may close these issues.

New component: HTTP Check
7 participants