Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connectors prototype #6140

Closed

Conversation

djaglowski
Copy link
Member

@djaglowski djaglowski commented Sep 22, 2022

Connectors

This is a working prototype of a new component type called "Connectors", which were proposed and described in #2336.

The general idea is that a connector acts as an exporter/receiver pair, bridging the gap between two or more pipelines. A connector can be used to merge data from multiple pipelines, or to split/replicate data to multiple pipelines. Connectors are configured declaratively, in the same way as receivers, processors, exporters, or extensions, and are integrated into the existing pipeline structure by being used in place of at least one exporter and at least one receiver.

The implementation includes two very simple connectors, nop and count, which allow for basic demonstration of functionality. (count is a bit contrived, but useful to illustrate a capability)

As much as possible, I've leaned on the existing notions of receivers and exporters. My goal is to allow existing functionality to do as much of the work as possible. Because of this we get a lot of useful functionality for free:

  • Connectors inherit the ability (or inability) to be used as receivers in a particulate type of pipeline. Same with regard to acting as exporters. (eg. count cannot be used as a receiver in a logs or traces pipeline)
  • When used as a receiver, a single instance can be shared by multiple pipelines. The built-in fanout capability is used automatically.

Sample use cases and configuration.

Count logs as they are forwarded. Report the count as a metric.

receivers:
  otlp:

exporters:
  otlp/log_backend:
  otlp/count_of_logs:

connectors:
  count:

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlp/log_backend, count]
    metrics:
      receivers: [count]
      exporters: [otlp/count_of_logs]

Duplicate metrics. Forward one stream directly to a backend. Batch the other stream and forward it to a different backend.

receivers:
  otlp:

processors:
  batch:

exporters:
  otlp/raw_backend:
  otlp/batched_backend:

connectors:
  nop:

service:
  pipelines:
    metrics/in:
      receivers: [otlp]
      exporters: [nop]
    metrics/raw:
      receivers: [nop]
      exporters: [otlp/raw_backend]
    metrics/batched:
      receivers: [nop]
      processors: [batch]
      exporters: [otlp/batched_backend]

Merge traces from two pipelines, then batch them.

receivers:
  otlp/1:
  otlp/2:

processors:
  batch:

exporters:
  otlp:

connectors:
  nop:

service:
  pipelines:
    traces/1:
      receivers: [otlp/1]
      exporters: [nop]
    traces/2:
      receivers: [otlp/2]
      exporters: [nop]
    traces/out:
      receivers: [nop]
      processors: [batch]
      exporters: [otlp]

Several important aspects of the design are not yet implemented, but I wanted to share the prototype for early feedback.

TODO:

  • Tests / Documentation
  • Validation that a connector is actually used in two or more pipelines
    • At least once as a receiver
    • At least once as an exporter
  • ConnectorFactoryOptions to describe and enforce valid pipeline connections
    • The nop connector can handle logs -> logs, metrics -> metrics, traces -> traces
    • The count connector can handle logs -> metric, metrics -> metrics, traces -> metrics
  • Graph validation / cycle detection
    • Users should not be able to connect pipelines in such as way where data flow is cyclical.
  • Start/Stop ordering. Ideally, the overall graph is drained in such a way that signals are not stranded during shutdown.

@djaglowski djaglowski force-pushed the connectors-prototype branch 3 times, most recently from 3e75330 to eb13d1b Compare September 22, 2022 21:07

func newCountMetric(signalType string, count int) pmetric.Metrics {
ms := pmetric.NewMetrics()
rms := ms.ResourceMetrics().AppendEmpty()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this copy the resource and scope from the source data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the count connector is very rudimentary, mostly meant for illustrative purposes. If the general concept moves forward, it will probably do so initially without this component.

@djaglowski djaglowski force-pushed the connectors-prototype branch 2 times, most recently from 9440ca1 to 4b4ddbb Compare September 23, 2022 15:09
@codecov
Copy link

codecov bot commented Sep 23, 2022

Codecov Report

Base: 91.89% // Head: 91.23% // Decreases project coverage by -0.66% ⚠️

Coverage data is based on head (d8a0663) compared to base (96e9af3).
Patch coverage: 24.74% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6140      +/-   ##
==========================================
- Coverage   91.89%   91.23%   -0.67%     
==========================================
  Files         217      219       +2     
  Lines       13319    13413      +94     
==========================================
- Hits        12240    12237       -3     
- Misses        850      949      +99     
+ Partials      229      227       -2     
Impacted Files Coverage Δ
component/connector.go 0.00% <0.00%> (ø)
component/factories.go 80.00% <0.00%> (-20.00%) ⬇️
config/connector.go 0.00% <0.00%> (ø)
service/service.go 65.65% <0.00%> (-2.77%) ⬇️
...e/internal/configunmarshaler/defaultunmarshaler.go 89.05% <21.42%> (-10.95%) ⬇️
config/moved_config.go 93.75% <33.33%> (-6.25%) ⬇️
cmd/builder/internal/builder/config.go 67.90% <72.72%> (+0.29%) ⬆️
cmd/otelcorecol/components.go 65.90% <72.72%> (+2.27%) ⬆️
cmd/builder/internal/builder/main.go 26.19% <0.00%> (-28.58%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

return NewReceiverFactory(f.cfgType, f.createDefaultReceiverConfig, f.receiverFactoryOptions...)
}

// TODO Implement and enforce ConnectorFactoryOptions that enumerate valid signal combos.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having thought about this some more, I believe it will be necessary to provide the full matrix of Create<Type1>To<Type2>Func's, because each combination may have different behavior.

@bogdandrutu
Copy link
Member

@djaglowski thanks for doing this, I have couple of questions about the design:

  • Does this fit the collector starting order? The reason I am asking is because we will have to create 2 "instances" (one receiver and one exporter) and call start twice? I want to see if you see any problems with that.
  • Also this causes a bit of confusion to me that there are 2 default configs. Are we requiring to have 2 default configs?

I am struggling a bit with the design choice, I think we can add a proper new component component.Connector with a config.Connector and the factory very similar with any other factory (probably we need a way to configure next component later). The component.Connector can implement both interfaces component.Exporter and component.Receiver.

The late configuration for the "next consumer" can also be hacked, by passing a "lazy initialized consumer" which allows us to "swap" the next consumer there, we can check couple of designs there.

@djaglowski
Copy link
Member Author

Thanks for the feedback @bogdandrutu.

Does this fit the collector starting order? The reason I am asking is because we will have to create 2 "instances" (one receiver and one exporter) and call start twice? I want to see if you see any problems with that.
Also this causes a bit of confusion to me that there are 2 default configs. Are we requiring to have 2 default configs?

I am struggling a bit with the design choice, I think we can add a proper new component component.Connector with a config.Connector and the factory very similar with any other factory (probably we need a way to configure next component later). The component.Connector can implement both interfaces component.Exporter and component.Receiver.

I agree this implementation is quite awkward for all the reasons you mentioned. Technically, I was able to work with around the dual constructor/configs issue, but authoring the individual components required some assumptions about the order of component creation, which should not be a concern of the component author. I thought perhaps there might be a way to provide a connectorhelper package that would abstract all that from the author, but I am still not sure quite what that would look like yet. In any case, I think you are right and I will take another pass at this where connectors are treated a proper first-class component.

The late configuration for the "next consumer" can also be hacked, by passing a "lazy initialized consumer" which allows us to "swap" the next consumer there, we can check couple of designs there.

Good idea - I'll keep this in mind. At some point I need to take a closer look at start/stop order across the collector. I am not very familiar with this code yet but am hoping to build a topological ordering that ensures data is never blocked or stranded. If this works, I think we can pass next consumers in directly during construction, since we would build from the end of the graph backwards to the beginning in order to ensure data flow during startup.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Oct 12, 2022
@djaglowski
Copy link
Member Author

Quick update: I'm continuing to work on this and will open another PR that solidifies the new component type and also includes a graph-based implementation of service/internal/pipelines.Pipelines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants