Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[query] Add query ID generation scheme sections to yaml files and user guides #1381

Merged
merged 5 commits into from
Feb 16, 2019

Conversation

arnikola
Copy link
Collaborator

No description provided.

@@ -51,6 +51,25 @@ You will notice that in the setup linked above, M3DB has just one unaggregated n
all: false
```

## ID generation

The default generation scheme for IDs is unfortunately prone to collisions, but is left as is for back-compat reasons. It is suggested to set the ID generation scheme to one of either "quoted" or "prepend_meta". Quoted generation scheme yields the most human-readable IDs, whereas Prepend_Meta is better for more compact IDS, or if tags are expected to contain non-ASCII characters. To set the ID generation scheme, add the following to your coordinator configuration yaml file:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we change the default to not be fucked? What kind of issues could it cause?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that would be the ideal case, unfortunately any current users (especially those with a ton of series) will see double incoming metrics

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels wrong to have the default be fucked for b/w compat. Would be better to break compat but expect users to change the default to non-sane because we'll publish guides/release notes/etc.

Or we'll always be stuck in this place.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, the default case is not too bad unless there's weird cases that collide like:

[{t1:v1},{t2:v2}] -> t1=v1,t2=v2,
[{t1:v1,t2:v2}] -> t1=v1,t2=v2,

I doubt users are jamming , and = characters into their tag name/values, but the alternative generation schemes will cover that case

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's kinda lame at the moment; I guess with the number of users we currently have, it's still manageable to spread awareness that their metric count will double and let them know they can explicitly set the default case to legacy if that presents an untenable situation

Copy link
Contributor

@andrewmains12 andrewmains12 Feb 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's still manageable to spread awareness that their metric count will double and let them know they can explicitly set the default case to legacy if that presents an untenable situation

Yeah, if we're going to bite that bullet, it's not going to get better if we wait. Could we provide any data rewrite tooling to upgrade folks' existing data to the new scheme? Not sure if it's worth it (and/or if we have a good way to do that--map/reduce style primitives would be nice here), but that would be the way to go if we wanted to ease the pain a bit. Then again, I'm assuming any data rewrite would probably imply either downtime or some tricky code on our part to avoid downtime so... maybe best to just have double the metrics for a bit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to go a way; explicitly setting scheme in config is now necessary in the new version, with no default values. If users are happy to take on the cost of doubled cardinality, they can elect to choose a different scheme than legacy.

After some time we'll relax the condition that scheme is specified, and default to quoted (probably). This way, current users who do not want to take on the extra cardinality are able to set their configs correctly for any future updates.


nil: t1=v1,t2=v2,t3=v3,t4=v4,
prepend_meta: 2,2,2,2,2,2,2,2!t1v1t2v2t3v3t4v4
quoted: {t1="v1",t2="v2",t3="v3",t4="v4"}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you do if tag name/value includes quote characters themselves? i.e. is there escaping too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good call will call out escaping specifically

@codecov
Copy link

codecov bot commented Feb 14, 2019

Codecov Report

Merging #1381 into master will decrease coverage by 0.1%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #1381     +/-   ##
========================================
- Coverage    70.7%   70.6%   -0.2%     
========================================
  Files         827     826      -1     
  Lines       71243   71047    -196     
========================================
- Hits        50401   50178    -223     
- Misses      17540   17571     +31     
+ Partials     3302    3298      -4
Flag Coverage Δ
#aggregator 82.3% <ø> (-0.1%) ⬇️
#cluster 85.9% <ø> (ø) ⬆️
#collector 63.7% <ø> (ø) ⬆️
#dbnode 80.6% <ø> (-0.1%) ⬇️
#m3em 73.2% <ø> (ø) ⬆️
#m3ninx 74.2% <ø> (ø) ⬆️
#m3nsch 51.1% <ø> (ø) ⬆️
#metrics 17.6% <ø> (ø) ⬆️
#msg 74.9% <ø> (ø) ⬆️
#query 64.7% <ø> (-0.5%) ⬇️
#x 76% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b5d849...9c2bd3a. Read the comment docs.

3 similar comments
@codecov
Copy link

codecov bot commented Feb 14, 2019

Codecov Report

Merging #1381 into master will decrease coverage by 0.1%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #1381     +/-   ##
========================================
- Coverage    70.7%   70.6%   -0.2%     
========================================
  Files         827     826      -1     
  Lines       71243   71047    -196     
========================================
- Hits        50401   50178    -223     
- Misses      17540   17571     +31     
+ Partials     3302    3298      -4
Flag Coverage Δ
#aggregator 82.3% <ø> (-0.1%) ⬇️
#cluster 85.9% <ø> (ø) ⬆️
#collector 63.7% <ø> (ø) ⬆️
#dbnode 80.6% <ø> (-0.1%) ⬇️
#m3em 73.2% <ø> (ø) ⬆️
#m3ninx 74.2% <ø> (ø) ⬆️
#m3nsch 51.1% <ø> (ø) ⬆️
#metrics 17.6% <ø> (ø) ⬆️
#msg 74.9% <ø> (ø) ⬆️
#query 64.7% <ø> (-0.5%) ⬇️
#x 76% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b5d849...9c2bd3a. Read the comment docs.

@codecov
Copy link

codecov bot commented Feb 14, 2019

Codecov Report

Merging #1381 into master will decrease coverage by 0.1%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #1381     +/-   ##
========================================
- Coverage    70.7%   70.6%   -0.2%     
========================================
  Files         827     826      -1     
  Lines       71243   71047    -196     
========================================
- Hits        50401   50178    -223     
- Misses      17540   17571     +31     
+ Partials     3302    3298      -4
Flag Coverage Δ
#aggregator 82.3% <ø> (-0.1%) ⬇️
#cluster 85.9% <ø> (ø) ⬆️
#collector 63.7% <ø> (ø) ⬆️
#dbnode 80.6% <ø> (-0.1%) ⬇️
#m3em 73.2% <ø> (ø) ⬆️
#m3ninx 74.2% <ø> (ø) ⬆️
#m3nsch 51.1% <ø> (ø) ⬆️
#metrics 17.6% <ø> (ø) ⬆️
#msg 74.9% <ø> (ø) ⬆️
#query 64.7% <ø> (-0.5%) ⬇️
#x 76% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b5d849...9c2bd3a. Read the comment docs.

@codecov
Copy link

codecov bot commented Feb 14, 2019

Codecov Report

Merging #1381 into master will decrease coverage by 0.1%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #1381     +/-   ##
========================================
- Coverage    70.7%   70.6%   -0.2%     
========================================
  Files         827     826      -1     
  Lines       71243   71047    -196     
========================================
- Hits        50401   50178    -223     
- Misses      17540   17571     +31     
+ Partials     3302    3298      -4
Flag Coverage Δ
#aggregator 82.3% <ø> (-0.1%) ⬇️
#cluster 85.9% <ø> (ø) ⬆️
#collector 63.7% <ø> (ø) ⬆️
#dbnode 80.6% <ø> (-0.1%) ⬇️
#m3em 73.2% <ø> (ø) ⬆️
#m3ninx 74.2% <ø> (ø) ⬆️
#m3nsch 51.1% <ø> (ø) ⬆️
#metrics 17.6% <ø> (ø) ⬆️
#msg 74.9% <ø> (ø) ⬆️
#query 64.7% <ø> (-0.5%) ⬇️
#x 76% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b5d849...9c2bd3a. Read the comment docs.

@codecov
Copy link

codecov bot commented Feb 14, 2019

Codecov Report

Merging #1381 into master will decrease coverage by 0.1%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #1381     +/-   ##
========================================
- Coverage    70.7%   70.6%   -0.2%     
========================================
  Files         827     826      -1     
  Lines       71243   71047    -196     
========================================
- Hits        50401   50178    -223     
- Misses      17540   17571     +31     
+ Partials     3302    3298      -4
Flag Coverage Δ
#aggregator 82.3% <ø> (-0.1%) ⬇️
#cluster 85.9% <ø> (ø) ⬆️
#collector 63.7% <ø> (ø) ⬆️
#dbnode 80.6% <ø> (-0.1%) ⬇️
#m3em 73.2% <ø> (ø) ⬆️
#m3ninx 74.2% <ø> (ø) ⬆️
#m3nsch 51.1% <ø> (ø) ⬆️
#metrics 17.6% <ø> (ø) ⬆️
#msg 74.9% <ø> (ø) ⬆️
#query 64.7% <ø> (-0.5%) ⬇️
#x 76% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b5d849...38bd2a6. Read the comment docs.

Now config fails validation if a scheme has not been set, and added
a migration guide to account for this.
@@ -51,6 +51,38 @@ You will notice that in the setup linked above, M3DB has just one unaggregated n
all: false
```

## ID generation

The default generation scheme for IDs is unfortunately prone to collisions, but is left as is for back-compat reasons. It is suggested to set the ID generation scheme to one of either "quoted" or "prepend_meta". Quoted generation scheme yields the most human-readable IDs, whereas Prepend_Meta is better for more compact IDS, or if tags are expected to contain non-ASCII characters. To set the ID generation scheme, add the following to your coordinator configuration yaml file:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"but remains the default for backwards compatibility reasons"

also in one case you capitalize prepend_meta and in the other you don't. Use something like prepend_meta everywhere and spell it exactly how it should be in the config YAML

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call 👍

idScheme: <name>
```

As an example of how these schemes generate tags, consider a series with the following 4 tags,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generate tags -> generate ids

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch

quoted: {\"t1\"="v1",t2="\"v2\"",t3="v3",t4="v4"}
```

If there is a chance that your metric tags will contain "control" characters, specifically `,` and `=`, it is highly recommended that one of either the `quoted` or `prepend_meta` schemes are specified, as the `legacy` scheme will be prone to ID collisions. As a general guideline, we suggest `quoted`, as it mirrors the more familiar Prometheus style IDs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be prone to -> may cause


We technically have a fourth ID generation scheme that is used for Graphite IDs, but it is exclusive to the Graphite ingestion path and is not selectable as a general scheme.

**WARNING:** Once a scheme is selected and used, be very careful about switching it, as any incoming metrics will get an ID matching the new scheme instead, effectively doubling the metric count until any of the older metric IDs rotate out of their retention period.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Once a scheme is selected, be very careful about changing it. If changed, all incoming metrics will resolved to a new ID, effectively doubling the metric cardinality until all of the older-style metric IDs fall out of retention"


### Migration

We have recently updated our ID generation scheme in coordinator, as we wanted to avoid the collision issues as mentioned above. Unfortunately, that can cause issues along the upgrade path, and to account for this, we're enforcing an explicitly required ID scheme generation option to be added to coordinator configuration files.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coordinator -> M3Coordinator or m3coordinator

"We recently updated our ID generation schema in M3Coordinator to avoid the collision issues discussed above. To ease migration, we're temporarily enforcing that an ID generate scheme be explicitly provided in the M3Coordinator configuration files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also maybe add, "In a future release we will lift this restriction and make the quoted scheme the default if none is provided in the YAML configuration"


If you have been running m3query or m3coordinator already, you may want to counterintuitively select the collision-prone `quoted` scheme, as all the IDs for all of your current metrics would have already been generated with this scheme, and choosing another will effectively double your index size. If the twofold increase in cardinality is an acceptable increase (and unfortunately, this is likely to mean doubled cardinality until your longest retention cluster rotates out), it's suggested to choose a collision-resistant scheme instead.

If none of these options work for you, or you would like further clarification, please stop by our [gitter channel](https://gitter.im/m3db/Lobby) and we should be able to help you.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should -> we'll be happy to


We have recently updated our ID generation scheme in coordinator, as we wanted to avoid the collision issues as mentioned above. Unfortunately, that can cause issues along the upgrade path, and to account for this, we're enforcing an explicitly required ID scheme generation option to be added to coordinator configuration files.

If you have been running m3query or m3coordinator already, you may want to counterintuitively select the collision-prone `quoted` scheme, as all the IDs for all of your current metrics would have already been generated with this scheme, and choosing another will effectively double your index size. If the twofold increase in cardinality is an acceptable increase (and unfortunately, this is likely to mean doubled cardinality until your longest retention cluster rotates out), it's suggested to choose a collision-resistant scheme instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean legacy instead of quoted yeah?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, yeah

Copy link
Contributor

@richardartoul richardartoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -340,7 +345,9 @@ func TagOptionsFromConfig(cfg TagOptionsConfiguration) (models.TagOptions, error
}

if cfg.Scheme == models.TypeDefault {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: Would it be cleaner to just check if this was nil?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think when we take it to step 2 and make this optional would definitely check for nil instead

@arnikola arnikola merged commit 41c5bce into master Feb 16, 2019
@arnikola arnikola deleted the arnikola/config-id-schemes branch February 16, 2019 01:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants