Trace ID aware load-balancing exporter - 1/4 (open-telemetry#1348)
Description:

    Initial skeleton
    README

Link to tracking Issue: Partially solves open-telemetry#1724

Testing: unit tests

Documentation: README

Signed-off-by: Juraci Paixão Kröhling [email protected]
jpkrohling committed Oct 28, 2020
1 parent 503f1e2 commit 8a6b76c
Showing 12 changed files with 2,069 additions and 0 deletions.
1 change: 1 addition & 0 deletions exporter/loadbalancingexporter/Makefile
@@ -0,0 +1 @@
include ../../Makefile.Common
147 changes: 147 additions & 0 deletions exporter/loadbalancingexporter/README.md
@@ -0,0 +1,147 @@
# Trace ID aware load-balancing exporter

This is an exporter that will consistently export spans belonging to the same trace to the same backend.

It requires a source of backend information to be provided: static, with a fixed list of backends, or DNS, with a hostname that will resolve to all IP addresses to use. The DNS resolver will periodically check for updates.

Note that only the Trace ID is used for the decision on which backend to use: the actual backend load isn't taken into consideration. Even though this load-balancer won't do round-robin balancing of the batches, the load distribution should be very similar among backends with a standard deviation under 5% at the current configuration.

IMPORTANT: this exporter assumes that all spans in a batch belong to the same trace. You *should* therefore use a processor that either splits incoming batches into multiple batches, one per trace, or groups spans by trace, like the `groupbytrace` processor.
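
To make the "one batch per trace" requirement concrete, here is a minimal sketch of splitting a mixed batch by trace ID. The `span` type and the `splitByTrace` function are hypothetical and exist only for illustration; they are not the collector's data model, nor how any existing processor is implemented.

```go
package main

import "fmt"

// span is a hypothetical, simplified stand-in for a real span.
type span struct {
	traceID [16]byte
	name    string
}

// splitByTrace groups the spans of a mixed batch into one batch per trace ID,
// keeping traces in the order in which they were first seen.
func splitByTrace(batch []span) [][]span {
	index := make(map[[16]byte]int) // trace ID -> position in out
	var out [][]span
	for _, s := range batch {
		i, ok := index[s.traceID]
		if !ok {
			i = len(out)
			index[s.traceID] = i
			out = append(out, nil)
		}
		out[i] = append(out[i], s)
	}
	return out
}

func main() {
	a, b := [16]byte{1}, [16]byte{2}
	batches := splitByTrace([]span{
		{traceID: a, name: "root"},
		{traceID: b, name: "other"},
		{traceID: a, name: "child"},
	})
	for _, batch := range batches {
		fmt.Printf("trace %x: %d span(s)\n", batch[0].traceID, len(batch))
	}
}
```

Each resulting batch could then be handed to the exporter on its own, so that a single routing decision covers every span it contains.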

This load balancer is especially useful for backends configured with tail-based samplers, which make a decision based on the view of the full trace.

When the list of backends is updated, around 1/n of the trace ID space is reassigned, where n is the number of backends, so the same trace ID might be directed to a different backend. This should be stable enough for most cases, and the higher the number of backends, the less disruption it should cause. Still, if routing stability is important for your use case and your list of backends is constantly changing, consider using the `groupbytrace` processor. This way, traces are dispatched atomically to this exporter, and the same decision about the backend is made for the trace as a whole.
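
The 1/n behavior is what one would expect from a consistent-hash ring. The sketch below is an illustration under assumptions, not this exporter's implementation: the `ring` type, the CRC32 hash, the 100 virtual nodes per backend, and the `backend-5` endpoint are all choices made for the example.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
	"sort"
)

// ring is a hypothetical consistent-hash ring.
type ring struct {
	points []uint32          // sorted positions on the ring
	owner  map[uint32]string // position -> backend endpoint
}

// newRing places `replicas` virtual nodes per backend on the ring.
func newRing(backends []string, replicas int) *ring {
	r := &ring{owner: make(map[uint32]string)}
	for _, b := range backends {
		for i := 0; i < replicas; i++ {
			p := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s#%d", b, i)))
			if _, taken := r.owner[p]; taken {
				continue // rare hash collision: the first owner keeps the point
			}
			r.owner[p] = b
			r.points = append(r.points, p)
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// endpointFor returns the backend owning the first point at or after the
// hash of the trace ID, wrapping around at the end of the ring.
func (r *ring) endpointFor(traceID [16]byte) string {
	if len(r.points) == 0 {
		return ""
	}
	h := crc32.ChecksumIEEE(traceID[:])
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owner[r.points[i]]
}

// traceIDFromInt builds a deterministic trace ID for the demonstration.
func traceIDFromInt(n uint64) [16]byte {
	var id [16]byte
	binary.BigEndian.PutUint64(id[8:], n)
	return id
}

func main() {
	backends := []string{"backend-1:55680", "backend-2:55680", "backend-3:55680", "backend-4:55680"}
	before := newRing(backends, 100)
	after := newRing(append(backends, "backend-5:55680"), 100)

	const total = 10000
	moved := 0
	for n := uint64(0); n < total; n++ {
		id := traceIDFromInt(n)
		if before.endpointFor(id) != after.endpointFor(id) {
			moved++
		}
	}
	fmt.Printf("%d of %d trace IDs changed backend\n", moved, total)
}
```

In this sketch, adding a backend only inserts new points on the ring, so a trace ID either keeps its backend or moves to the new one; it never moves between two backends that were already present.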

## Configuration

Refer to [config.yaml](./testdata/config.yaml) for detailed examples on using the exporter.

* The `otlp` property configures the template used for building the OTLP exporter. Refer to the OTLP Exporter documentation for information on which options are available. Note that the `endpoint` property should not be set and will be overridden by this exporter with the backend endpoint.
* The `resolver` accepts either a `static` node, or a `dns`. If both are specified, `dns` takes precedence.
* The `hostname` property inside a `dns` node specifies the hostname to query in order to obtain the list of IP addresses.


Simple example
```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:55680

processors:

exporters:
  logging:
  loadbalancing:
    protocol:
      otlp:
        # all options from the OTLP exporter are supported
        # except the endpoint
        timeout: 1s
    resolver:
      static:
        hostnames:
        - backend-1:55680
        - backend-2:55680
        - backend-3:55680
        - backend-4:55680

service:
  pipelines:
    traces:
      receivers:
        - otlp
      processors: []
      exporters:
        - loadbalancing
```

For testing purposes, the following configuration can be used, where both the load balancer and all backends are running locally:
```yaml
receivers:
  otlp/loadbalancer:
    protocols:
      grpc:
        endpoint: localhost:55680
  otlp/backend-1:
    protocols:
      grpc:
        endpoint: localhost:55690
  otlp/backend-2:
    protocols:
      grpc:
        endpoint: localhost:55700
  otlp/backend-3:
    protocols:
      grpc:
        endpoint: localhost:55710
  otlp/backend-4:
    protocols:
      grpc:
        endpoint: localhost:55720

processors:

exporters:
  logging:
  loadbalancing:
    protocol:
      otlp:
        timeout: 1s
        insecure: true
    resolver:
      static:
        hostnames:
        - localhost:55690
        - localhost:55700
        - localhost:55710
        - localhost:55720

service:
  pipelines:
    traces/loadbalancer:
      receivers:
        - otlp/loadbalancer
      processors: []
      exporters:
        - loadbalancing

    traces/backend-1:
      receivers:
        - otlp/backend-1
      processors: []
      exporters:
        - logging

    traces/backend-2:
      receivers:
        - otlp/backend-2
      processors: []
      exporters:
        - logging

    traces/backend-3:
      receivers:
        - otlp/backend-3
      processors: []
      exporters:
        - logging

    traces/backend-4:
      receivers:
        - otlp/backend-4
      processors: []
      exporters:
        - logging
```

## Metrics

The following metrics are recorded by this exporter:

* `otelcol_loadbalancer_num_resolutions` represents the total number of resolutions performed by the resolver specified in the tag `resolver`, split by their outcome (`success=true|false`). For the static resolver, this should always be `1` with the tag `success=true`.
* `otelcol_loadbalancer_num_backends` reports how many backends are currently in use. When the `static` resolver is used, it should always match the number of items specified in the configuration file. When the `dns` resolver is used, it should catch up with DNS changes within seconds. Note that DNS caches that might exist between the load balancer and the record authority will influence how long it takes for the load balancer to see the change.
* `otelcol_loadbalancer_num_backend_updates` records how many of the resolutions resulted in a new list of backends. Use this information to understand how frequent your backend updates are and how often the ring is rebalanced. If the DNS hostname is always returning the same list of IP addresses but this metric keeps increasing, it might indicate a bug in the load balancer.
* `otelcol_loadbalancer_backend_latency` measures the latency for each backend.
* `otelcol_loadbalancer_backend_outcome` counts what the outcomes were for each endpoint, `success=true|false`.
48 changes: 48 additions & 0 deletions exporter/loadbalancingexporter/config.go
@@ -0,0 +1,48 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package loadbalancingexporter

import (
	"go.opentelemetry.io/collector/config/configmodels"
	"go.opentelemetry.io/collector/exporter/otlpexporter"
)

// Config defines configuration for the exporter.
type Config struct {
	configmodels.ExporterSettings `mapstructure:",squash"`
	Protocol                      Protocol         `mapstructure:"protocol"`
	Resolver                      ResolverSettings `mapstructure:"resolver"`
}

// Protocol holds the individual protocol-specific settings. Only OTLP is supported at the moment.
type Protocol struct {
	OTLP otlpexporter.Config `mapstructure:"otlp"`
}

// ResolverSettings defines the configurations for the backend resolver
type ResolverSettings struct {
	Static *StaticResolver `mapstructure:"static"`
	DNS    *DNSResolver    `mapstructure:"dns"`
}

// StaticResolver defines the configuration for the resolver providing a fixed list of backends
type StaticResolver struct {
	Hostnames []string `mapstructure:"hostnames"`
}

// DNSResolver defines the configuration for the DNS resolver
type DNSResolver struct {
	Hostname string `mapstructure:"hostname"`
}
36 changes: 36 additions & 0 deletions exporter/loadbalancingexporter/config_test.go
@@ -0,0 +1,36 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package loadbalancingexporter

import (
	"path"
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
	"go.opentelemetry.io/collector/component/componenttest"
	"go.opentelemetry.io/collector/config/configtest"
)

func TestLoadConfig(t *testing.T) {
	factories, err := componenttest.ExampleComponents()
	assert.NoError(t, err)

	factories.Exporters[typeStr] = NewFactory()

	cfg, err := configtest.LoadConfigFile(t, path.Join(".", "testdata", "config.yaml"), factories)
	require.NoError(t, err)
	require.NotNil(t, cfg)
}
57 changes: 57 additions & 0 deletions exporter/loadbalancingexporter/exporter.go
@@ -0,0 +1,57 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package loadbalancingexporter

import (
	"context"

	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/config/configmodels"
	"go.opentelemetry.io/collector/consumer/pdata"
	"go.uber.org/zap"
)

var _ component.TraceExporter = (*exporterImp)(nil)

type exporterImp struct {
	logger *zap.Logger
	config Config
}

// Create new exporter
func newExporter(params component.ExporterCreateParams, cfg configmodels.Exporter) (*exporterImp, error) {
	oCfg := cfg.(*Config)

	return &exporterImp{
		logger: params.Logger,
		config: *oCfg,
	}, nil
}

func (e *exporterImp) Start(ctx context.Context, host component.Host) error {
	return nil
}

func (e *exporterImp) Shutdown(context.Context) error {
	return nil
}

func (e *exporterImp) ConsumeTraces(ctx context.Context, td pdata.Traces) error {
	return nil
}

func (e *exporterImp) GetCapabilities() component.ProcessorCapabilities {
	return component.ProcessorCapabilities{MutatesConsumedData: false}
}
38 changes: 38 additions & 0 deletions exporter/loadbalancingexporter/exporter_test.go
@@ -0,0 +1,38 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package loadbalancingexporter

import (
	"testing"

	"github.com/stretchr/testify/require"
	"go.opentelemetry.io/collector/component"
	"go.uber.org/zap"
)

func TestNewExporter(t *testing.T) {
	// prepare
	config := &Config{}
	params := component.ExporterCreateParams{
		Logger: zap.NewNop(),
	}

	// test
	p, err := newExporter(params, config)

	// verify
	require.NoError(t, err)
	require.NotNil(t, p)
}
50 changes: 50 additions & 0 deletions exporter/loadbalancingexporter/factory.go
@@ -0,0 +1,50 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package loadbalancingexporter

import (
	"context"

	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/config/configmodels"
	"go.opentelemetry.io/collector/exporter/exporterhelper"
)

const (
	// The value of "type" key in configuration.
	typeStr = "loadbalancing"
)

// NewFactory creates a factory for the exporter.
func NewFactory() component.ExporterFactory {
	return exporterhelper.NewFactory(
		typeStr,
		createDefaultConfig,
		exporterhelper.WithTraces(createTraceExporter),
	)
}

func createDefaultConfig() configmodels.Exporter {
	return &Config{
		ExporterSettings: configmodels.ExporterSettings{
			TypeVal: typeStr,
			NameVal: typeStr,
		},
	}
}

func createTraceExporter(_ context.Context, params component.ExporterCreateParams, cfg configmodels.Exporter) (component.TraceExporter, error) {
	return newExporter(params, cfg)
}