Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[xds] refactor config provider framework #7704

Merged
merged 29 commits into from
Aug 1, 2019
Merged
Changes from 1 commit
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
6a3f13b
add scope key-builder impl and test
stevenzzzz Jun 11, 2019
cc806d7
Merge branch 'master' into add-key-builder
stevenzzzz Jun 11, 2019
bb55a44
fix typo
stevenzzzz Jun 11, 2019
8d7b4da
move impl code from header to cc file.
stevenzzzz Jun 12, 2019
8bcddce
add review fixes and unit test case for keybuidler
stevenzzzz Jun 12, 2019
b1d0d5c
clang-tidy fix
stevenzzzz Jun 12, 2019
1007ee0
use expect_debug_death for assertion test
stevenzzzz Jun 12, 2019
184ff2f
fix failed test in opt mode
stevenzzzz Jun 12, 2019
fe6b643
add assert around addFragment and test, and a few review fix
stevenzzzz Jun 13, 2019
fc98e56
Merge branch 'master' into add-key-builder
stevenzzzz Jun 19, 2019
951bd28
Merge branch 'master' into add-key-builder
stevenzzzz Jun 20, 2019
75718d9
fix based on feedbacks
stevenzzzz Jun 21, 2019
d84839d
fix nits
stevenzzzz Jun 25, 2019
27ff0a3
switch from constructing proto memeber by value(copy-elision) to more…
stevenzzzz Jun 25, 2019
467a6a1
refactor config provider framework
stevenzzzz Jul 24, 2019
7dea1a0
some renames and stale comment fix
stevenzzzz Jul 24, 2019
2b5f572
Merge branch 'master' of https://github.com/envoyproxy/envoy into add…
stevenzzzz Jul 25, 2019
86c455e
fixes for review feedbacks
stevenzzzz Jul 30, 2019
60dd433
fix comment
stevenzzzz Jul 30, 2019
c31de28
Merge branch 'master' of https://github.com/envoyproxy/envoy into CLEAN
stevenzzzz Jul 30, 2019
dad2d59
snowp comment, merge with upstream master
stevenzzzz Jul 30, 2019
6b6a0ad
Merge branch 'master' into config-provider-update
stevenzzzz Jul 30, 2019
fb28731
fix tidy format error
stevenzzzz Jul 31, 2019
1bd43be
fixes per htuch review feedbacks
stevenzzzz Jul 31, 2019
f074e9d
improve the comment per htuch's suggestion.
stevenzzzz Jul 31, 2019
7cbb6ae
fix comment nit
stevenzzzz Jul 31, 2019
fb7867b
move updateConfigCb and add a description for it.
stevenzzzz Jul 31, 2019
47a417d
Merge branch 'master' of https://github.com/envoyproxy/envoy
stevenzzzz Aug 1, 2019
e328d46
Merge branch 'master' into config-provider-update
stevenzzzz Aug 1, 2019
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 7 additions & 4 deletions source/common/config/config_provider_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -219,18 +219,21 @@ class ConfigSubscriptionCommonBase
using ConfigUpdateCb =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this top-level and commented, rather than part of the function signature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to do this one :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.
Sorry I missed this one earlier.

std::function<ConfigProvider::ConfigConstSharedPtr(ConfigProvider::ConfigConstSharedPtr)>;
void applyConfigUpdate(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the simplification introduced by generalizing this function. It avoids delta vs non-delta code bloat which was a pending TODO.

With that said, after thinking about this quite a bit, I think the "ConfigImpl" and the thread local storage on which it is stored are a better fit to be owned by the MutableConfigProviderCommonBase. This maintains the existing boundary between the *ConfigSubscription* base classes which manage the underlying xDS subscription, and the *ConfigProvider* base classes which manage the ConfigConstSharedPtr resulting from the config update. As it stands now with these changes, the *ConfigProvider* classes are just a thin shim to provide access to the ConfigSubscription and could simply be removed.

The proposal above would reintroduce the notion of bound config providers to a subscription, but I propose a new simplification here: instead of having multiple config providers bound to a subscription, have a single bound provider which owns the tls_ and associated thread local ConfigConstSharedPtrs. Instead of ConfigProviderManager::createXdsConfigProvider returning a new instance of a ConfigProviderPtr on every call, it would instead return a reference to the instance bound to the subscription (allocating a new instance if a bound one is not found).

Effectively, this would translate into moving applyConfigUpdate to MutableConfigProviderCommonBase along with tls_. MutableConfigProviderBase and DeltaMutableConfigProvider are still removed in this proposal.

The modified approach would achieve several goals:

  1. Maintains the boundary between *ConfigSubscription* classes managing the xDS subscription and the *ConfigProvider* classes managing the resulting config objects.
  2. Retains most of the simplification introduced by these changes and doesn't fundamentally change the approach, allowing most of the code to simply be moved.
  3. Creates an API that enforces users of the ConfigProvider framework to think about the desired lifetime of the ConfigConstSharedPtr returned by the update_fn. This is beneficial since one of the key goals of introducing this framework is to minimize proliferation of copies of the config across consumers of the config.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we had an offline discussion, the way proposed in above comment has a problem around the ownership of subscription&Providers, it can't guarantee a subscription gets teared down when there are no users(providers).

We agreed that having a subscription maintains/owns the Config instance(s) make more sense for:
- Simpler update propagation: providers no need to worry about Config impl update at all.
- enforced x-providers Config sharing (which is the design goal of the provider framework)
- Keeps the same ownership mode as the existing RDS does, a subscription is shared between providers to ensure its life circle align with users'.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • enforced x-providers Config sharing (which is the design goal of the provider framework)

To provide some more context, the design goal is to minimize duplication of config across consumers. The tradeoff that was made with the initial implementation of this framework was to provide more flexibility in how "config implementations" are stored (by having the ConfigProvider own the ConfigConstSharedPtr as opposed to the ConfigSubscription). After reflecting on the simplification introduced by this PR, I agree with Xin that it's preferable to eliminate that flexibility and instead enforce that a config implementation can at most be instantiated per ConfigSubscription and per worker thread; I can't come up with any use cases where duplication beyond that would make sense (i.e., per ConfigProvider as well).

const ConfigUpdateFn& update_fn, const Event::PostCb& complete_cb = []() {}) {
const ConfigUpdateCb& update_fn, const Event::PostCb& complete_cb = []() {}) {
// It is safe to call shared_from_this here as this is in main thread, and destruction of a
// ConfigSubscriptionCommonBase owner (i.e., a provider) happens in main thread as well.
auto shared_this = shared_from_this();
tls_->runOnAllThreads(
[this, update_fn]() {
tls_->getTyped<ThreadLocalConfig>().config_ = update_fn(this->getConfig());
},
/*During the update propagation, a subscription may get teared down in main thread due to
/* During the update propagation, a subscription may get teared down in main thread due to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit: please use // comments here for the block comment.

Other than these tiny nits, LGTM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx!

all owners/providers destructed in a xDS update (e.g. LDS demolishes a RouteConfigProvider
and its subscription). Hold a reference to the shared subscription instance to make sure the
update can be safely pushed to workers in such an event.*/
and its subscription).
If such a race condition happens, hold a reference to the "*this" subscription instance in
this cb will ensure the shared "*this" gets posted back to main thread, after all the
workers finish calling the update_fn, at which point it's safe to destruct "*this"
instance. */
[shared_this, complete_cb]() { complete_cb(); });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the capture be in the first lambda above, where this is being referenced, rather than here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. If the race ever happens, we need to make sure "this" gets destructed on the main thread, as the subscription contains a tls_ instance. Capturing "this" in the complete_cb of runOnAllThreads makes sure that "*this" is alive after every first-lambda is called, and the held "*this" is posted back to main thread.

If we capture it in the first lambda, the shared_this may get destructed in a worker thread, cause an assertion failure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this more explicit in the comment? This is a subtle correctness condition, but I think you are right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Comment updated!

}

Expand Down