Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[coordinator] Add mapping rule config support for keeping metrics at different resolutions #2036

Merged
merged 27 commits into from
Jan 3, 2020

Conversation

robskillington
Copy link
Collaborator

What this PR does / why we need it:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:

NONE

Does this PR require updating code package or user-facing documentation?:

NONE

@codecov
Copy link

codecov bot commented Nov 4, 2019

Codecov Report

Merging #2036 into master will decrease coverage by 11.3%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #2036      +/-   ##
=========================================
- Coverage    72.4%   61.1%   -11.4%     
=========================================
  Files        1007     995      -12     
  Lines       86647   85473    -1174     
=========================================
- Hits        62805   52243   -10562     
- Misses      19609   29272    +9663     
+ Partials     4233    3958     -275
Flag Coverage Δ
#aggregator 71.3% <ø> (+5.1%) ⬆️
#cluster 58.7% <ø> (+7%) ⬆️
#collector 57.4% <ø> (-5.6%) ⬇️
#dbnode 74.7% <ø> (+25.5%) ⬆️
#m3em 11.8% <ø> (-48.5%) ⬇️
#m3ninx 30.7% <0%> (-34.1%) ⬇️
#m3nsch 17.7% <0%> (-28.1%) ⬇️
#metrics 17.7% <0%> (-0.1%) ⬇️
#msg 74.7% <ø> (+12.6%) ⬆️
#query 56.3% <0%> (+5.7%) ⬆️
#x 59.8% <ø> (+9.8%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5d49308...83727d9. Read the comment docs.

@codecov
Copy link

codecov bot commented Nov 4, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@ec5ee86). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##             master   #2036   +/-   ##
========================================
  Coverage          ?   72.2%           
========================================
  Files             ?    1015           
  Lines             ?   87399           
  Branches          ?       0           
========================================
  Hits              ?   63186           
  Misses            ?   19979           
  Partials          ?    4234
Flag Coverage Δ
#aggregator 82% <ø> (?)
#cluster 85.6% <ø> (?)
#collector 64.8% <ø> (?)
#dbnode 79.7% <ø> (?)
#m3em 73.2% <ø> (?)
#m3ninx 73.9% <ø> (?)
#m3nsch 51.1% <ø> (?)
#metrics 17.6% <ø> (?)
#msg 74.9% <ø> (?)
#query 68.2% <ø> (?)
#x 83.1% <ø> (?)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ec5ee86...c623742. Read the comment docs.

}

// Rule returns the mapping rule for the mapping rule configuration.
func (r MappingRuleConfiguration) Rule() (view.MappingRule, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: pull this out into separate file rather than in options?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah fair.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still might be a good idea (it may already be moved and my gh is cached or something though)

@@ -191,6 +207,287 @@ type Configuration struct {
BufferPastLimits []BufferPastLimitConfiguration `yaml:"bufferPastLimits"`
}

// RulesConfiguration is a set of rules configuration to use for downsampling.
type RulesConfiguration struct {
// MappingRules are the mapping rules that sets retention and resolution
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...that set retention

// for metrics given a filter to match metrics against.
MappingRules []MappingRuleConfiguration `yaml:"mappingRules"`

// RollupRules are the rollup rules that sets specific aggregations for sets
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RollupRules are rollup rules...

// - "P9999"
Aggregations []aggregation.Type `yaml:"aggregations"`

// StoragePolicies is the retention/resolution storage policies to keep the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StoragePolicies are the retention/resolution storage policies at which to keep matched metrics.

// keeping them with a storage policy.
Drop bool `yaml:"drop"`

// Optional fields follows.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional fields follow.


// StoragePolicy returns the corresponding storage policy.
func (p StoragePolicyConfiguration) StoragePolicy() (policy.StoragePolicy, error) {
return policy.ParseStoragePolicy(p.Resolution.String() + ":" + p.Retention.String())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe Sprintf instead of concat here?


// StoragePolicies returns storage policies.
func (p StoragePolicyConfigurations) StoragePolicies() (policy.StoragePolicies, error) {
var storagePolicies policy.StoragePolicies
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: init with size


// RollupRuleConfiguration is a rollup rule configuration.
type RollupRuleConfiguration struct {
// Filter is a string separated filter of labe name to label value
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filter is a space separated filter of label name to label value glob patterns to which to filter the mapping rule.

// Transforms are a set of of rollup rule transforms.
Transforms []TransformConfiguration `yaml:"transforms"`

// StoragePolicies is the retention/resolution storage policies to keep the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StoragePolicies are the retention/resolution storage policies at which to keep the matched metrics

// matched metrics at.
StoragePolicies []StoragePolicyConfiguration `yaml:"storagePolicies"`

// Optional fields follows.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional fields follow.

// RollupOperationConfiguration is a rollup operation.
type RollupOperationConfiguration struct {
// MetricName is the name of the new metric that is emitted after
// the rollup is applied with it's aggregations and group by's.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with its aggregations and group by's.

return view.RollupRule{}, err
}

var ops []pipeline.OpUnion
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Init with size, also is it worth doing a validation step to ensure that there's not multiple things set?

Copy link
Collaborator

@arnikola arnikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bunch of nits but nothing worth blocking for, approving if you want to address async

@robskillington robskillington force-pushed the r/mapping-rule-rollup-rule-config-support branch from 5f7bc6b to 2da1b2f Compare January 2, 2020 15:39
@anandsinghkunwar
Copy link

@robskillington is this supposed to make the next release?

@robskillington robskillington changed the title Add mapping rule and rollup rule config support [coordinator] Add mapping rule config support for keeping metrics at different resolutions Jan 2, 2020
@robskillington
Copy link
Collaborator Author

@anandsinghkunwar yes, however we may end up only supporting mapping rules at first. Rollup rules may take some more time as there is some intricacies in performing rollups on metrics with pre-existing timestamps.

Mapping rules allows for selecting metrics to store in different namespaces selectively however and should be easy to merge and have supported for next release. e.g.:

downsample:
  rules:
    mappingRules:
      - name: "nginx metrics"
        filter: "app:nginx*"
        aggregations: ["Last"]
        storagePolicies:
          - resolution: 10m
            retention: 4320h # 180 days

# scripts/docker-integration-tests/repair/test.sh
# scripts/docker-integration-tests/replication/test.sh
# scripts/docker-integration-tests/repair_and_replication/test.sh
# scripts/docker-integration-tests/multi_cluster_write/test.sh
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should uncomment these

@@ -114,6 +114,9 @@ func (s *store) WriteAll(nss *rules.Namespaces, rs rules.MutableRuleSet) error {
if err != nil {
return err
}

r, _ := rs.RollupRules()
fmt.Println("ruleset rollup rules: ", r)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this? Or actually might be better to replace with a debug level log?

Comment on lines 50 to 64
# rollupRules:
# - filter: "foo:bar"
# transforms:
# - rollup:
# metricName: "new_metric"
# aggregations:
# - Sum
# groupBy:
# - foo
# # aggregate:
# # type: Sum
# storagePolicies:
# - retention: 10h
# resolution: 5s
# name: "testRollup"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be uncommented for the test?

@@ -170,6 +183,10 @@ type agg struct {

// Configuration configurates a downsampler.
type Configuration struct {
// Rules is a set of downsample rules, if this rules configuration is set
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might read a little clearer as // Rules is a set of downsample rules. If set, this overrides any rules set in the KV store.

status=$(echo $out | grep -v promremotecli_log | docker run --rm -i $JQ_IMAGE jq .statusCode)
if [[ "$success" != "$expect_success" ]]; then
echo $expect_success_err
sleep 10000000
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be sleep 1? Otherwise it's ~115 days


ops := make([]pipeline.OpUnion, 0, len(r.Transforms))
for _, elem := range r.Transforms {
// TODO: make sure only one of "Rollup" or "Aggregate" or "Transform" is not nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit; This may be a bit cleaner as a method on TransformConfiguration?

Comment on lines +384 to +385
NewName: cfg.MetricName,
Tags: cfg.GroupBy,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this ensure that MetricName is not in GroupBy?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be ok.


// TransformOperationConfiguration is a transform operation.
type TransformOperationConfiguration struct {
// Type is an transformation operation type.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a transformation...

Comment on lines 1018 to 1020
func (c *aggregatorLocalAdminClient) SetAggregator(agg aggregator.Aggregator) {
c.agg = agg
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can probably roll this into newAggregatorLocalAdminClient ? Also doesn't seem to need exporting

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I unexported it but kept it since // NB: Can't do this at construction time since needs to be passed as an // option to the aggregator constructor.

SetRuleSetOptions(ruleSetOpts).
SetKVStore(rulesKVStore)

// NB(r): If rules are being explicitlly set in config then we are
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit; explicitlly -> explicitly

true "Expected request to succeed" \
200 "Expected request to return status code 200"

start=$(expr $(date +"%s") - 3600)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: define end first, then subtract from that, just to avoid any potential issues with crossing any boundaries?

@arnikola
Copy link
Collaborator

arnikola commented Jan 2, 2020

BTW does this need to set namespace ClusterNamespaceDownsampleOptions.All to false for any not matching namespaces? Otherwise query fanout may miss points

@robskillington robskillington merged commit a24c337 into master Jan 3, 2020
@robskillington robskillington deleted the r/mapping-rule-rollup-rule-config-support branch January 3, 2020 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants