From e40191601ddb490a5f2b0738b4854e74ff0aee12 Mon Sep 17 00:00:00 2001 From: Josh MacDonald Date: Tue, 27 Aug 2019 01:29:57 -0700 Subject: [PATCH 01/24] Updates to 0003 following work session 8/21/2019 --- text/0003-measure-metric-type.md | 132 ++++++++++++++++++++++++++----- 1 file changed, 114 insertions(+), 18 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 916e67dd4..3262f4161 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -1,8 +1,8 @@ # Consolidate pre-aggregated and raw metrics APIs -**Status:** `proposed` +**Status:** `accepted` -## Forward +# Forward This propsal was originally split into three semi-related parts. Based on the feedback, they are now combined here into a single proposal. The original proposals were: @@ -10,39 +10,135 @@ This propsal was originally split into three semi-related parts. Based on the fe 000x-metric-measure 000x-eliminate-stats-record -## Overview +### Updated 8/23/2019 -Introduce a `Measure` type of metric object that supports a `Record` API. Like existing `Gauge` and `Cumulative` metrics, the new `Measure` metric supports pre-defined labels. A new measurement batch API is introduced for recording multiple metric observations simultaneously. +A working group convened on 8/21/2019 to discuss and debate the two metrics RFCs (0003 and 0004) and several surrounding concerns. This document has been revised with related updates that were agreed upon during this working session. See the (meeting notes)[https://docs.google.com/document/d/1d0afxe3J6bQT-I6UbRXeIYNcTIyBQv4axfjKF4yvAPA/edit#]. -## Motivation +# Overview -In the current `Metric.GetOrCreateTimeSeries` API for Gauges and Cumulatives, the caller obtains a `TimeSeries` handle for repeatedly recording metrics with certain pre-defined label values set. This is an important optimization, especially for exporting aggregated metrics. 
+Introduce a `Measure` kind of metric object that supports a `Record` API method. Like existing `Gauge` and `Cumulative` metrics, the new `Measure` metric supports pre-defined labels. A new measurement batch API is introduced for recording multiple metric observations simultaneously. -The use of pre-defined labels improves usability too, for working with metrics in code. Application programs with long-lived objects and associated Metrics can compute predefined label values once (e.g., in a constructor), rather than once per call site. +## Terminology -The current raw statistics API does not support pre-defined labels. This RFC replaces the raw statistics API by a new, general-purpose type of metric, `MeasureMetric`, generally intended for recording individual measurements the way raw statistics did, with added support for pre-defined labels. +This RFC changes how "Measure" is used in the OpenTelemetry metrics specification. Before, "Measure" was the name of a series of raw measurements. After, "Measure" is the kind of a metric object used for recording a series raw measurements. -The former raw statistics API supported all-or-none recording for interdependent measurements. This RFC introduces a `MeasurementBatch` to support recording batches of metric observations. +Since this document will be read in the future after the proposal has been written, uses of the word "current" lead to confusion. For this document, the term "preceding" refers to the state that was current prior to these changes. -## Explanation +# Motivation -In the current proposal, Metrics are used for pre-aggregated metric types, whereas Raw statistics are used for uncommon and vendor-specific aggregations. The optimization and the usability advantages gained with pre-defined labels should be extended to Raw statistics because they are equally important and equally applicable. This is a new requirement. 
+In the preceding `Metric.GetOrCreateTimeSeries` API for Gauges and Cumulatives, the caller obtains a `TimeSeries` handle for repeatedly recording metrics with certain pre-defined label values set. This enables an important optimization for exporting pre-aggregated metrics, since the implementation is able to compute the aggregate summary "entry" using a pointer or fast table lookup. The efficiency gain requires that the aggregation keys be a subset of the pre-defined labels. -For example, where the application wants to compute a histogram of some value (e.g., latency), there's good reason to pre-aggregate such information. In this example, it allows an implementation to effienctly export the histogram of latencies "grouped" into individual results by label value(s). +Application programs with long-lived objects and associated Metrics can take advantage of pre-defined labels by computing label values once per object (e.g., in a constructor), rather than once per call site. In this way, the use of pre-defined labels improves the usability of the API as well as makes an important optimization possible to the implementation. -The new `MeasureMetric` API satisfies the requirements of a single-argument call to record raw statistics, but the raw statistics API had secondary purpose, that of supporting recording multiple observed values simultaneously. This proposal introduces a `MeasurementBatch` API to record multiple metric observations in a single call. +The preceding raw statistics API did not specify support for pre-defined labels. This RFC replaces the raw statistics API by a new, general-purpose kind of metric, `MeasureMetric`, generally intended for recording individual measurements like the preceding raw statistics API, with explicit support for pre-defined labels. -## Internal details +The preceding raw statistics API supported all-or-none recording for interdependent measurements. 
This RFC introduces a `RecordBatch` API to support recording batches of measurements in a single API call, where a `Measurement` is now defined as a tuple of `MeasureMetric`, `Value` (integer or floating point), and `Labels`. -The type known as `MeasureMetric` is a direct replacement for the raw statistics `Measure` type. The `MeasureMetric.Record` method records a single observation of the metric. The `MeasureMetric.GetOrCreateTimeSeries` supports pre-defined labels, just the same as `Gauge` and `Cumulative` metrics. +# Explanation -## Trade-offs and mitigations +The common use for `MeasureMetric`, like the preceding raw statistics API, is for reporting information about rates and distributions over structured, numerical event data. Measure metrics are the most general-purpose of metrics. Informally, the individual metric event has a logical format expressed as one primary key=value (the metric name and a numerical value) and any number of secondary key=values (the labels, resources, and other context). -This Measure Metric API is conceptually close to the Prometheus [Histogram, Summary, and Untyped metric types](https://prometheus.io/docs/concepts/metric_types/), but there is no way in OpenTelemetry to distinguish these cases at the declaration site, in code. This topic is covered in 0004-metric-configurable-aggregation. + metric_name=_number_ + pre_defined1=_any_value_ + pre_defined2=_any_value_ + ... + resource1=_any_value_ + resource2=_any_value_ + ... + context_tag1=_any_value_ + context_tag2=_any_value_ + ... + +Events of this form can logically capture a single update to a named metric, whether a cumulative, gauge, or measure kind of metric. This logical structure defines a _low-level encoding_ of any metric event, across the three kinds of metric. This establishes the separation between the metrics API and implementation required for OpenTelemetry. 
An SDK could simply encode a stream of these events and the consumer, provided access to the metric definition, should be able to interpret these events according to the semantics prescribed for each kind of metric. + +## Metrics API concepts + +The `Meter` interface represents the metrics portion of the OpenTelemetry API. + +There are three kinds of metric, `CumulativeMetric`, `GaugeMetric`, and `MeasureMetric`. + +Metric objects are declared and defined independently of the SDK. They may be statically defined, as opposed to allocated through the SDK in any way. To define a new metric, use one of the `NewCumulativeMetric`, `NewGaugeMetric`, or `NewMeasureMetric` methods. + +Each metric is declared with a list (possibly empty) of pre-defined label keys. These pre-defined label keys declare the set of keys that are available as dimensions for efficient pre-aggregation. + +To obtain a metric _handle_ from a metric object, call `getHandle` with the pre-defined label values. There are two ways to pass the pre-defined label values: + +1. As an ordered list of values. In this case, the number of arguments in the list must match the number of pre-defined label keys. When the number of arguments disagrees with the metric definition, the implementation may return an error or thrown an exception to synchronously indicate this condition. +2. As a list of key:value pairs. In this case, the application is free to provide the list of label values in arbitrary order. Values that are not passed when constructing handles in this way are marked as "not present". Values that are not part of the pre-defined label keys are ignored when constructing handles. + +Metric handles, thusly obtained with one of the `getHandle` variations, may be used to `Set()`, `Add()`, and `Record()` metrics according to their kind. Context tags that apply when calling `Set()`, `Add()`, and `Record()` may not override values that were set in the handle as pre-defined labels. 
+ +## Selecting Metric Kind + +By the "separation clause" of OpenTelemetry, we know that an implementation is free to do _anything_ in response to a metric API call. By the low-level interpretation defined above, all metric events have the same structural representation, only their logical interpretation varies according to the metric definition. Therefore, we select metric kinds based on two primary concerns: + +1. What should be the default implementation behavior? Unless configured otherwise, how should the implementation treat this metric variable? +1. How will the program read? Each metric uses a different verb, which helps convey meaning and describe default behavior. Cumulatives have an `Add()` method. Gauges have a `Set()` method. Measures have a `Record()` method. + +To guide the user in selecting the right kind of metric for an application, we'll consider the following questions about the primary intent of reporting given data. We use "of primary interest" here to mean information that is almost certainly useful in understanding system behavior. Consider these questions: + +- Does the measurement represent a quantity of something? Is it also non-negative? +- Is the sum a matter of primary interest? +- Is the event count of primary interest? +- Is the distribution (p50, p99, etc.) a matter of primary interest? + +The specification will be updated with the following guidance. + +### Cumulative metric + +Likely to be the most common kind of metric, cumulative metric events express the computation of a sum. Choose this kind of metric when the value is a quantity, the sum is of primary interest, and the event count and distribution are not of primary interest. To raise (or lower) a cumulative metric, call the `Add()` method. + +If the quantity in question is always non-negative, it implies that the sum is strictly ascending. When this is the case, the cumulative metric also serves to define a rate. 
For this reason, cumulative metrics have an option to be declared as non-negative. The API will reject negative updates to non-negative cumulative metrics, instead submitting an SDK error event, which helps ensure meaningful rate calculations. + +For cumulative metrics, the default OpenTelemetry implementation exports the sum of event values taken over an interval of time. + +### Gauge metric + +Gauge metrics express a pre-calculated value that is either `Set()` by explicit instrumentation or observed through a callback. Generally, this kind of metric should be used when the metric cannot be expressed as a sum or a rate because the measurement interval is arbitrary. Use this kind of metric when the measurement is not a quantity, and the sum and event count are not of interest. + +Only the gauge kind of metric supports observing the metric via a callback (as an option). Semantically, there is an important difference between explicitly setting a gauge and observing it through a callback. In case of setting the gauge explicitly, the call happens inside of an implicit or explicit distributed context. The implementation is free to associate the explicit `Set()` event with a span context, for example. When observing gauge metrics via a callback, there is no distributed context associated with the event. + +As a special case, to support existing metrics infrastructure, a gauge metric may be declared as a precomputed sum, in which case it is defined as strictly ascending. The API will reject descending updates to strictly-ascending gauges, instead submitting an SDK error event. + +For gauge metrics, the default OpenTelemetry implementation exports the last value that was `Set()`. If configured for an observer callback instead, the default OpenTelemetry implementation exports Observed at the time of metrics collection. + +### Measure metric + +Measure metrics express a distribution of values. 
This kind of metric should be used when the count of events is meaningful and either: + +1. The sum is of interest in addition to the count +1. Quantiles information is of interest. + +The key property of a measure metric event is that two events cannot be trivially reduced into one, as a step in pre-aggregation. For cumulatives and gauges, two `Add()` or `Set()` events can be replaced by a single event (for default behavior, i.e., unless the implementation is configured differently), whereas two `Record()` events must by reflected in two events. + +Like cumulative metrics, non-negative measures are an important case because they support rate calculations. As an option, measure metrics may be declared as non-negative. The API will reject negative metric events for non-negative measures, instead submitting an SDK error event. + +For measure metrics, the default OpenTelemetry implementation is left up to the implementation. The default interpretation is that the distribution should be summarized, somehow, but the specific technique used belongs to the implementation. A low-cost policy is selected as the default behavior for export OpenTelemetry measures: + +- For non-negative measure metrics, unless otherwise configured, the default implementation exports the sum, the count, and the maximum value as three separate summary variables. +- For arbitrary measure metrics, unless otherwise configured, the default implementation exports the sum, the count, the minimum, and the maximum value as four separate summary variables. + +### Disable selected metrics by default + +All OpenTelemetry metrics may be disabled by default, as an option. Use this option to indicate that the default implementation should be to do nothing for events about this metric. + +### RecordBatch API + +Applications sometimes want to record multiple metrics in a single API call, either becase the values are inter-related or because it lowers overhead. 
We agree that recording batch measurements will be restricted to measure metrics, although this support could be extended to all kinds of metric in the future. + +Logically, a measurement is defined as: + +- Measure metric: which metric is being updated +- Value: a floating point or integer +- Pre-defined label values: associated via metrics API handle + +The batch measurement API shall be named `RecordBatch`. The entire batch of measurements takes place within some (implicit or explicit) distributed context. ## Prior art and alternatives -Prometheus supports the notion of vector metrics, which are those which support pre-defined labels. The vector-metric API supports a variety of methods like `WithLabelValues` to associate labels with a metric handle, similar to `GetOrCreateTimeSeries` in OpenTelemetry. As in this proposal, Prometheus supports a vector API for all metric types. +Prometheus supports the notion of vector metrics, which are those that support pre-defined labels. The vector-metric API supports a variety of methods like `WithLabelValues` to associate labels with a metric handle, similar to `GetOrCreateTimeSeries` in OpenTelemetry. As in this proposal, Prometheus supports a vector API for all metric types. + +Statsd libraries generally report metric events individually. To implement statsd reporting from the OpenTelemetry, a `Meter` SDK would be installed that converts metric events into statsd updates. ## Open questions From c4a7836225e9e10cee0427a44f6f9b7905a6d75f Mon Sep 17 00:00:00 2001 From: Josh MacDonald Date: Tue, 27 Aug 2019 01:32:25 -0700 Subject: [PATCH 02/24] Update date --- text/0003-measure-metric-type.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 3262f4161..5a5ddde2b 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -10,7 +10,7 @@ This propsal was originally split into three semi-related parts. 
Based on the fe 000x-metric-measure 000x-eliminate-stats-record -### Updated 8/23/2019 +### Updated 8/27/2019 A working group convened on 8/21/2019 to discuss and debate the two metrics RFCs (0003 and 0004) and several surrounding concerns. This document has been revised with related updates that were agreed upon during this working session. See the (meeting notes)[https://docs.google.com/document/d/1d0afxe3J6bQT-I6UbRXeIYNcTIyBQv4axfjKF4yvAPA/edit#]. From beff7cf078bf6ebebca7676b500f07669e0b3b32 Mon Sep 17 00:00:00 2001 From: Josh MacDonald Date: Tue, 27 Aug 2019 15:02:35 -0700 Subject: [PATCH 03/24] Feedback applied --- text/0003-measure-metric-type.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 5a5ddde2b..005c8b746 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -1,8 +1,8 @@ # Consolidate pre-aggregated and raw metrics APIs -**Status:** `accepted` +**Status:** `proposed` -# Forward +# Foreward This propsal was originally split into three semi-related parts. Based on the feedback, they are now combined here into a single proposal. The original proposals were: @@ -36,7 +36,7 @@ The preceding raw statistics API supported all-or-none recording for interdepend # Explanation -The common use for `MeasureMetric`, like the preceding raw statistics API, is for reporting information about rates and distributions over structured, numerical event data. Measure metrics are the most general-purpose of metrics. Informally, the individual metric event has a logical format expressed as one primary key=value (the metric name and a numerical value) and any number of secondary key=values (the labels, resources, and other context). +The common use for `MeasureMetric`, like the preceding raw statistics API, is for reporting information about rates and distributions over structured, numerical event data. 
Measure metrics are the most general-purpose of metrics. Informally, the individual metric event has a logical format expressed as one primary key=value (the metric name and a numerical value) and any number of secondary key=values (the labels, resources, and context). metric_name=_number_ pre_defined1=_any_value_ @@ -49,6 +49,8 @@ The common use for `MeasureMetric`, like the preceding raw statistics API, is fo context_tag2=_any_value_ ... +Here, "pre_defined" keys are those captured in the metrics handle, "resource" keys are those configured when the SDK was initialized, and "context_tag" keys are those propagated via context. + Events of this form can logically capture a single update to a named metric, whether a cumulative, gauge, or measure kind of metric. This logical structure defines a _low-level encoding_ of any metric event, across the three kinds of metric. This establishes the separation between the metrics API and implementation required for OpenTelemetry. An SDK could simply encode a stream of these events and the consumer, provided access to the metric definition, should be able to interpret these events according to the semantics prescribed for each kind of metric. ## Metrics API concepts @@ -96,7 +98,7 @@ For cumulative metrics, the default OpenTelemetry implementation exports the sum Gauge metrics express a pre-calculated value that is either `Set()` by explicit instrumentation or observed through a callback. Generally, this kind of metric should be used when the metric cannot be expressed as a sum or a rate because the measurement interval is arbitrary. Use this kind of metric when the measurement is not a quantity, and the sum and event count are not of interest. -Only the gauge kind of metric supports observing the metric via a callback (as an option). Semantically, there is an important difference between explicitly setting a gauge and observing it through a callback. 
In case of setting the gauge explicitly, the call happens inside of an implicit or explicit distributed context. The implementation is free to associate the explicit `Set()` event with a span context, for example. When observing gauge metrics via a callback, there is no distributed context associated with the event. +Only the gauge kind of metric supports observing the metric via a callback (as an option). Semantically, there is an important difference between explicitly setting a gauge and observing it through a callback. In case of setting the gauge explicitly, the call happens inside of an implicit or explicit context. The implementation is free to associate the explicit `Set()` event with a context, for example. When observing gauge metrics via a callback, there is no context associated with the event. As a special case, to support existing metrics infrastructure, a gauge metric may be declared as a precomputed sum, in which case it is defined as strictly ascending. The API will reject descending updates to strictly-ascending gauges, instead submitting an SDK error event. @@ -132,7 +134,7 @@ Logically, a measurement is defined as: - Value: a floating point or integer - Pre-defined label values: associated via metrics API handle -The batch measurement API shall be named `RecordBatch`. The entire batch of measurements takes place within some (implicit or explicit) distributed context. +The batch measurement API shall be named `RecordBatch`. The entire batch of measurements takes place within some (implicit or explicit) context. 
## Prior art and alternatives From 0aa96656d9b65a9cb328aa22f05e833e258c3572 Mon Sep 17 00:00:00 2001 From: Josh MacDonald Date: Tue, 27 Aug 2019 15:04:06 -0700 Subject: [PATCH 04/24] Feedback applied --- text/0003-measure-metric-type.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 005c8b746..7507f86bd 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -59,7 +59,7 @@ The `Meter` interface represents the metrics portion of the OpenTelemetry API. There are three kinds of metric, `CumulativeMetric`, `GaugeMetric`, and `MeasureMetric`. -Metric objects are declared and defined independently of the SDK. They may be statically defined, as opposed to allocated through the SDK in any way. To define a new metric, use one of the `NewCumulativeMetric`, `NewGaugeMetric`, or `NewMeasureMetric` methods. +Metric objects are declared and defined independently of the SDK. They may be statically defined, as opposed to allocated through the SDK in any way. To define a new metric, use one of the language-specific API methods (e.g., with names like `NewCumulativeMetric`, `NewGaugeMetric`, or `NewMeasureMetric`). Each metric is declared with a list (possibly empty) of pre-defined label keys. These pre-defined label keys declare the set of keys that are available as dimensions for efficient pre-aggregation. 
From 7d5bee59e76094e8297c93ac979e8e3544e36a98 Mon Sep 17 00:00:00 2001 From: jmacd Date: Sun, 1 Sep 2019 20:52:46 -0700 Subject: [PATCH 05/24] Remove handle specification, will create another RFC --- text/0003-measure-metric-type.md | 34 +++++++++++++++----------------- 1 file changed, 16 insertions(+), 18 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 7507f86bd..8092339b2 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -12,7 +12,7 @@ This propsal was originally split into three semi-related parts. Based on the fe ### Updated 8/27/2019 -A working group convened on 8/21/2019 to discuss and debate the two metrics RFCs (0003 and 0004) and several surrounding concerns. This document has been revised with related updates that were agreed upon during this working session. See the (meeting notes)[https://docs.google.com/document/d/1d0afxe3J6bQT-I6UbRXeIYNcTIyBQv4axfjKF4yvAPA/edit#]. +A working group convened on 8/21/2019 to discuss and debate the two metrics RFCs (0003 and 0004) and several surrounding concerns. This document has been revised with related updates that were agreed upon during this working session. See the [meeting notes](https://docs.google.com/document/d/1d0afxe3J6bQT-I6UbRXeIYNcTIyBQv4axfjKF4yvAPA/edit#). # Overview @@ -51,24 +51,25 @@ The common use for `MeasureMetric`, like the preceding raw statistics API, is fo Here, "pre_defined" keys are those captured in the metrics handle, "resource" keys are those configured when the SDK was initialized, and "context_tag" keys are those propagated via context. -Events of this form can logically capture a single update to a named metric, whether a cumulative, gauge, or measure kind of metric. This logical structure defines a _low-level encoding_ of any metric event, across the three kinds of metric. This establishes the separation between the metrics API and implementation required for OpenTelemetry. 
An SDK could simply encode a stream of these events and the consumer, provided access to the metric definition, should be able to interpret these events according to the semantics prescribed for each kind of metric. +Events of this form can logically capture a single update to a named metric, whether a cumulative, gauge, or measure kind of metric. This logical structure defines a _low-level encoding_ of any metric event, across the three kinds of metric. An SDK could simply encode a stream of these events and the consumer, provided access to the metric definition, should be able to interpret these events according to the semantics prescribed for each kind of metric. ## Metrics API concepts The `Meter` interface represents the metrics portion of the OpenTelemetry API. -There are three kinds of metric, `CumulativeMetric`, `GaugeMetric`, and `MeasureMetric`. +There are three kinds of metric instrument, `CumulativeMetric`, `GaugeMetric`, and `MeasureMetric`. -Metric objects are declared and defined independently of the SDK. They may be statically defined, as opposed to allocated through the SDK in any way. To define a new metric, use one of the language-specific API methods (e.g., with names like `NewCumulativeMetric`, `NewGaugeMetric`, or `NewMeasureMetric`). +Metric instruments are constructed by the API, they are not constructed by any specific SDK. -Each metric is declared with a list (possibly empty) of pre-defined label keys. These pre-defined label keys declare the set of keys that are available as dimensions for efficient pre-aggregation. +| Name | A string. | +| Kind | One of Cumulative, Gauge, or Measure. | +| Keys | List of always-defined keys in handles for this metric. | +| Unit | The unit of measurement being recorded. | +| Description | Information about this metric. | -To obtain a metric _handle_ from a metric object, call `getHandle` with the pre-defined label values. 
There are two ways to pass the pre-defined label values: +See the specification for more information on these fields, including formatting and uniqueness requirements. To define a new metric, use one of the language-specific API methods (e.g., with names like `NewCumulativeMetric`, `NewGaugeMetric`, or `NewMeasureMetric`). -1. As an ordered list of values. In this case, the number of arguments in the list must match the number of pre-defined label keys. When the number of arguments disagrees with the metric definition, the implementation may return an error or thrown an exception to synchronously indicate this condition. -2. As a list of key:value pairs. In this case, the application is free to provide the list of label values in arbitrary order. Values that are not passed when constructing handles in this way are marked as "not present". Values that are not part of the pre-defined label keys are ignored when constructing handles. - -Metric handles, thusly obtained with one of the `getHandle` variations, may be used to `Set()`, `Add()`, and `Record()` metrics according to their kind. Context tags that apply when calling `Set()`, `Add()`, and `Record()` may not override values that were set in the handle as pre-defined labels. +Metric instrument Handles are SDK-provided objects that combine a metric instrument with a set of pre-defined labels. Handles are obtained by calling a language-specific API method (e.g., `GetHandle`) on the metric instrument with its label values. Handles may be used to `Set()`, `Add()`, or `Record()` metrics according to their kind. The `Set()`, `Add()`, and `Record()` ## Selecting Metric Kind @@ -81,7 +82,7 @@ To guide the user in selecting the right kind of metric for an application, we'l - Does the measurement represent a quantity of something? Is it also non-negative? - Is the sum a matter of primary interest? -- Is the event count of primary interest? +- Is the event count a matter of primary interest? 
- Is the distribution (p50, p99, etc.) a matter of primary interest? The specification will be updated with the following guidance. @@ -98,11 +99,11 @@ For cumulative metrics, the default OpenTelemetry implementation exports the sum Gauge metrics express a pre-calculated value that is either `Set()` by explicit instrumentation or observed through a callback. Generally, this kind of metric should be used when the metric cannot be expressed as a sum or a rate because the measurement interval is arbitrary. Use this kind of metric when the measurement is not a quantity, and the sum and event count are not of interest. -Only the gauge kind of metric supports observing the metric via a callback (as an option). Semantically, there is an important difference between explicitly setting a gauge and observing it through a callback. In case of setting the gauge explicitly, the call happens inside of an implicit or explicit context. The implementation is free to associate the explicit `Set()` event with a context, for example. When observing gauge metrics via a callback, there is no context associated with the event. +Only the gauge kind of metric supports observing the metric via a gauge `Observer` callback (as an option). Semantically, there is an important difference between explicitly setting a gauge and observing it through a callback. In case of setting the gauge explicitly, the call happens inside of an implicit or explicit context. The implementation is free to associate the explicit `Set()` event with a context, for example. When observing gauge metrics via a callback, there is no context associated with the event. As a special case, to support existing metrics infrastructure, a gauge metric may be declared as a precomputed sum, in which case it is defined as strictly ascending. The API will reject descending updates to strictly-ascending gauges, instead submitting an SDK error event. 
-For gauge metrics, the default OpenTelemetry implementation exports the last value that was `Set()`. If configured for an observer callback instead, the default OpenTelemetry implementation exports Observed at the time of metrics collection. +For gauge metrics, the default OpenTelemetry implementation exports the last value that was explicitly `Set()`, or if using a callback, the current value from the Observer. ### Measure metric @@ -111,14 +112,11 @@ Measure metrics express a distribution of values. This kind of metric should be 1. The sum is of interest in addition to the count 1. Quantiles information is of interest. -The key property of a measure metric event is that two events cannot be trivially reduced into one, as a step in pre-aggregation. For cumulatives and gauges, two `Add()` or `Set()` events can be replaced by a single event (for default behavior, i.e., unless the implementation is configured differently), whereas two `Record()` events must by reflected in two events. +The key property of a measure metric event is that two events cannot be trivially reduced into one, unlike cumulative and gauge metrics. Two `Record()` events must by reflected in two events because both the event count and the individual values are significant. Like cumulative metrics, non-negative measures are an important case because they support rate calculations. As an option, measure metrics may be declared as non-negative. The API will reject negative metric events for non-negative measures, instead submitting an SDK error event. -For measure metrics, the default OpenTelemetry implementation is left up to the implementation. The default interpretation is that the distribution should be summarized, somehow, but the specific technique used belongs to the implementation. 
A low-cost policy is selected as the default behavior for export OpenTelemetry measures: - -- For non-negative measure metrics, unless otherwise configured, the default implementation exports the sum, the count, and the maximum value as three separate summary variables. -- For arbitrary measure metrics, unless otherwise configured, the default implementation exports the sum, the count, the minimum, and the maximum value as four separate summary variables. +For measure metrics, the default OpenTelemetry implementation is left up to the implementation. The default interpretation is that the distribution should be summarized, somehow, but the specific technique used belongs to the implementation and the exporter semantics. A low-cost policy is selected as the default behavior for export OpenTelemetry measures: export the sum, the count, the minimum, and the maximum value in the form of a summary. ### Disable selected metrics by default From 2e541db2811c6de25e3d128f0d4c086234322772 Mon Sep 17 00:00:00 2001 From: jmacd Date: Sun, 1 Sep 2019 23:16:29 -0700 Subject: [PATCH 06/24] More typing --- text/0003-measure-metric-type.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 8092339b2..f0b257713 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -73,10 +73,10 @@ Metric instrument Handles are SDK-provided objects that combine a metric instrum ## Selecting Metric Kind -By the "separation clause" of OpenTelemetry, we know that an implementation is free to do _anything_ in response to a metric API call. By the low-level interpretation defined above, all metric events have the same structural representation, only their logical interpretation varies according to the metric definition. 
Therefore, we select metric kinds based on two primary concerns: +By separation of API and implementation in OpenTelemetry, we know that an implementation is free to do _anything_ in response to a metric API call. By the low-level interpretation defined above, all metric events have the same structural representation, only their logical interpretation varies according to the metric definition. Therefore, we select metric kinds based on two primary concerns: 1. What should be the default implementation behavior? Unless configured otherwise, how should the implementation treat this metric variable? -1. How will the program read? Each metric uses a different verb, which helps convey meaning and describe default behavior. Cumulatives have an `Add()` method. Gauges have a `Set()` method. Measures have a `Record()` method. +1. How will the program source code read? Each metric uses a different verb, which helps convey meaning and describe default behavior. Cumulatives have an `Add()` method. Gauges have a `Set()` method. Measures have a `Record()` method. To guide the user in selecting the right kind of metric for an application, we'll consider the following questions about the primary intent of reporting given data. We use "of primary interest" here to mean information that is almost certainly useful in understanding system behavior. Consider these questions: @@ -107,16 +107,16 @@ For gauge metrics, the default OpenTelemetry implementation exports the last val ### Measure metric -Measure metrics express a distribution of values. This kind of metric should be used when the count of events is meaningful and either: +Measure metrics express a distribution of values. This kind of metric should be used when the count or rate of events is meaningful and either: -1. The sum is of interest in addition to the count -1. Quantiles information is of interest. +1. The sum is of interest in addition to the count (rate) +1. Quantile information is of interest. 
-The key property of a measure metric event is that two events cannot be trivially reduced into one, unlike cumulative and gauge metrics. Two `Record()` events must by reflected in two events because both the event count and the individual values are significant. +The key property of a measure metric event is that computing quantiles and/or summarizing a distribution (e.g., via a histogram) may be expensive. Not only will implementations have various capabilities and algorithms for this task, users may wish to control the quality and cost of aggregating measure metrics. Like cumulative metrics, non-negative measures are an important case because they support rate calculations. As an option, measure metrics may be declared as non-negative. The API will reject negative metric events for non-negative measures, instead submitting an SDK error event. -For measure metrics, the default OpenTelemetry implementation is left up to the implementation. The default interpretation is that the distribution should be summarized, somehow, but the specific technique used belongs to the implementation and the exporter semantics. A low-cost policy is selected as the default behavior for export OpenTelemetry measures: export the sum, the count, the minimum, and the maximum value in the form of a summary. +Because measure metrics have such wide application, implementations are likely to provide configurable behavior. OpenTelemetry may provide such a facility in its standard SDK, but in case no configuration is provided by the application, a low-cost policy is specified as the default behavior, whic is to export the sum, the count (rate), the minimum value, and the maximum value. ### Disable selected metrics by default @@ -132,16 +132,16 @@ Logically, a measurement is defined as: - Value: a floating point or integer - Pre-defined label values: associated via metrics API handle -The batch measurement API shall be named `RecordBatch`. 
The entire batch of measurements takes place within some (implicit or explicit) context. +The batch measurement API uses a language-specific method name (e.g., `RecordBatch`). The entire batch of measurements takes place within some (implicit or explicit) context. ## Prior art and alternatives -Prometheus supports the notion of vector metrics, which are those that support pre-defined labels. The vector-metric API supports a variety of methods like `WithLabelValues` to associate labels with a metric handle, similar to `GetOrCreateTimeSeries` in OpenTelemetry. As in this proposal, Prometheus supports a vector API for all metric types. +Prometheus supports the notion of vector metrics, which are those that support pre-defined labels for a specific set of Keys. The vector-metric API supports a variety of methods like `WithLabelValues` to associate labels with a metric handle, similar to `GetHandle` in OpenTelemetry. As in this proposal, Prometheus supports a vector API for all metric types. Statsd libraries generally report metric events individually. To implement statsd reporting from the OpenTelemetry, a `Meter` SDK would be installed that converts metric events into statsd updates. ## Open questions -Argument ordering has been proposed as the way to pass pre-defined label values in `GetOrCreateTimeseries`. The argument list must match the parameter list exactly, and if it doesn't we generally find out at runtime or not at all. This model has more optimization potential, but is easier to misuse, than the alternative. The alternative approach is to always pass label:value pairs to `GetOrCreateTimeseries`, as opposed to an ordered list of values. +Argument ordering has been proposed as the way to pass pre-defined label values in `GetHandle`. The argument list must match the parameter list exactly, and if it doesn't we generally find out at runtime or not at all. This model has more optimization potential, but is easier to misuse than the alternative. 
The alternative approach is to always pass label:value pairs to `GetOrCreateTimeseries`, as opposed to an ordered list of values. The same discussion can be had for the `MeasurementBatch` type described here. It can be declared with an ordered list of metrics, then the `Record` API takes only an ordered list of numbers. Alternatively, and less prone to misuse, the `MeasurementBatch.Record` API could be declared with a list of metric:number pairs. From 8e9fcc38e38969931a500e99e2dbcea6d92930d7 Mon Sep 17 00:00:00 2001 From: jmacd Date: Tue, 3 Sep 2019 00:34:49 -0700 Subject: [PATCH 07/24] Add metrics handles RFC --- text/0003-measure-metric-type.md | 4 +- text/0007-metric-handles.md | 76 ++++++++++++++++++++++++++++++++ 2 files changed, 78 insertions(+), 2 deletions(-) create mode 100644 text/0007-metric-handles.md diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index f0b257713..a85fb8317 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -63,7 +63,7 @@ Metric instruments are constructed by the API, they are not constructed by any s | Name | A string. | | Kind | One of Cumulative, Gauge, or Measure. | -| Keys | List of always-defined keys in handles for this metric. | +| Required Keys | List of always-defined keys in handles for this metric. | | Unit | The unit of measurement being recorded. | | Description | Information about this metric. | @@ -136,7 +136,7 @@ The batch measurement API uses a language-specific method name (e.g., `RecordBat ## Prior art and alternatives -Prometheus supports the notion of vector metrics, which are those that support pre-defined labels for a specific set of Keys. The vector-metric API supports a variety of methods like `WithLabelValues` to associate labels with a metric handle, similar to `GetHandle` in OpenTelemetry. As in this proposal, Prometheus supports a vector API for all metric types. 
+Prometheus supports the notion of vector metrics, which are those that support pre-defined labels for a specific set of required keys. The vector-metric API supports a variety of methods like `WithLabelValues` to associate labels with a metric handle, similar to `GetHandle` in OpenTelemetry. As in this proposal, Prometheus supports a vector API for all metric types. Statsd libraries generally report metric events individually. To implement statsd reporting from the OpenTelemetry, a `Meter` SDK would be installed that converts metric events into statsd updates. diff --git a/text/0007-metric-handles.md b/text/0007-metric-handles.md new file mode 100644 index 000000000..d45cf088c --- /dev/null +++ b/text/0007-metric-handles.md @@ -0,0 +1,76 @@ +# Metric Handle API specification + +**Status:** `proposed` + +Specify the behavior of the Metrics API `Handle` type, for efficient repeated-use of metric instruments. + +## Motivation + +The specification currently names this concept `TimeSeries`, the object returned by `GetOrCreateTimeseries`, which supports binding a metric to a pre-defined set of labels for repeated use. This proposal renames these `Handle` and `GetHandle`, respectively, and adds further detail to the API specification for handles. + +## Explanation + +The `TimeSeries` is renamed to `Handle` as the former name suggests an implementation, not an API concept. `Handle`, we feel, is more descriptive of the intended use. Likewise with `GetOrCreateTimeSeries` to `GetHandle` and `GetDefaultTimeSeries` to `GetDefaultHandle`, these names suggest an implementation and not the intended use. Applications are encouraged to re-use metric handles for efficiency. + +Handles are useful to reduce the cost of repeatedly recording a metric instrument (cumulative, gauge, or measure) with a pre-defined set of label values. All metric kinds support declaring a set of required label keys. These label keys, by definition, must be specified in every metric `Handle`. 
We permit "unspecified" label values in cases where a handle is requested but a value was not provided. The default metric handle has all its required keys unspecified. We presume that fast pre-aggregation of metrics data is only possible, in general, when the pre-aggregation keys are a subset of the required keys on the metric. + +`GetHandle` specifies two calling patterns that may be supported: one with ordered label values and one without. The facility for ordered label values is provided as a potential optimization that facilitates a simple lookup for the SDK; in this form, the API is permitted to throw an exception or return an error when there is a mismatch in the arguments to `GetHandle`. When label values are accepted in any order, some SDKs may perform an expensive lookup to find an existing metrics handle, but they must not throw exceptions. + +`GetHandle` and `GetDefaultHandle` support additional label values not required in the definition of the metric instrument. These optional labels act the same as pre-defined labels in the low-level metrics data representation, except that they are not required. Some SDKs may elect to use additional label values as the "attachment" data on metrics. + +## Internal details + +The names (`Handle`, `GetHandle`, ...) are just language-neutral recommendations. Because each of the metric kinds supports a different operation (`Add()`, `Set()`, and `Record()`), there are logically distinct kinds of handle. Language APIs should feel free to choose type and method names with attention to the language's style. + +An implementation of `GetHandle` may elect to return a unique object to multiple callers for its own purposes, but implementations are not required to do so. When unique objects are guaranteed, implementations should consider additional label values in the uniqueness of the handle, to maintain the low-level metric event representation discussed in RFC [0003-measure-metric-type](./0003-measure-metric-type.md).
+ +The `Observer` API for observing gauge metrics on demand via a callback does not support handles. + +## Trade-offs and mitigations + +Support for additional label values on handles is not essential for pre-aggregation purposes. However, API support for pre-defined labels also benefits program readability because it allows metric handles to be defined once in the source, rather than once per call site. + +This benefit could be extended even further, as a potential future improvement. Instead of creating one handle per instance of a metric with pre-defined values, it may be even more efficient to support pre-defining a set of label values for use in constructing multiple metric handles. Consider the code for declaring three metrics: + +``` + var gauge = metric.NewFloat64Gauge("example.com/gauge", metric.WithKeys("a", "b", "c")) + var counter = metric.NewFloat64Cumulative("example.com/counter", metric.WithKeys("a", "b", "c")) + var measure = metric.NewFloat64Measure("example.com/measure", metric.WithKeys("a", "b", "c")) +``` + +and three handles: + +``` + gaugeHandle := gauge.GetHandleOrdered(1, 2, 3) // values for a, b, c + counterHandle := counter.GetHandleOrdered(1, 2, 3) // values for a, b, c + measureHandle := measure.GetHandleOrdered(1, 2, 3) // values for a, b, c +``` + +This can potentially be improved by making the label set a first-class concept.
This has the potential to further reduce the cost of getting a group of handles with the same set of labels: + +``` + var commonKeys = metric.DefineKeys("a", "b", "c") + var gauge = metric.NewFloat64Gauge("example.com/gauge", metric.WithKeys(commonKeys)) + var counter = metric.NewFloat64Cumulative("example.com/counter", metric.WithKeys(commonKeys)) + var measure = metric.NewFloat64Measure("example.com/measure", metric.WithKeys(commonKeys)) + + labelSet := commonKeys.Values(1, 2, 3) // values for a, b, c + gaugeHandle := gauge.GetHandleOrdered(labelSet) + counterHandle := counter.GetHandleOrdered(labelSet) + measureHandle := measure.GetHandleOrdered(labelSet) +``` + +## Open questions + +Should the additional scope concept shown above be implemented? + + + + + + + + + + + From 5ae1df5ad64bfb6acea4fbaef09a4cb2640c384e Mon Sep 17 00:00:00 2001 From: jmacd Date: Tue, 3 Sep 2019 00:35:34 -0700 Subject: [PATCH 08/24] Rename 0000 --- ...{0007-metric-handles.md => 0000-metric-handles.md} | 11 ----------- 1 file changed, 11 deletions(-) rename text/{0007-metric-handles.md => 0000-metric-handles.md} (99%) diff --git a/text/0007-metric-handles.md b/text/0000-metric-handles.md similarity index 99% rename from text/0007-metric-handles.md rename to text/0000-metric-handles.md index d45cf088c..5e57f79ba 100644 --- a/text/0007-metric-handles.md +++ b/text/0000-metric-handles.md @@ -63,14 +63,3 @@ This can be potential improved as by making the label set a first-class concept. ## Open questions Should the additional scope concept shown above be implemented? 
- - - - - - - - - - - From 3bebb9233b9fa455d8e01d71e8f74d68d66d86f2 Mon Sep 17 00:00:00 2001 From: jmacd Date: Tue, 3 Sep 2019 00:38:28 -0700 Subject: [PATCH 09/24] Remove 0004 --- text/0004-metric-configurable-aggregation.md | 75 -------------------- 1 file changed, 75 deletions(-) delete mode 100644 text/0004-metric-configurable-aggregation.md diff --git a/text/0004-metric-configurable-aggregation.md b/text/0004-metric-configurable-aggregation.md deleted file mode 100644 index 9c9dc8e91..000000000 --- a/text/0004-metric-configurable-aggregation.md +++ /dev/null @@ -1,75 +0,0 @@ -# Let Metrics support configurable, recommended aggregations - -**Status:** `proposed` - -Let the user configure recommended Metric aggregations (SUM, COUNT, MIN, MAX, LAST_VALUE, HISTOGRAM, SUMMARY). - -## Motivation - -In the current API proposal, Metric types like Gauge and Cumulative are mapped into specific aggregations: Gauge:LAST_VALUE and Cumulative:SUM. Depending on RFC 0003-measure-metric-type, which creates a new MeasureMetric type, this proposal introduces the ability to configure alternative, potentially multiple aggregations for Metrics. This allows the MeasureMetric type to support HISTOGRAM and SUMMARY aggregations, as an alternative to raw statistics. - -## Explanation - -This proposal completes the elimination of Raw statistics by recognizing that aggregations should be independent of metric type. This recognizes that _sometimes_ we have a cumulative but want to compute a histogram of increment values, and _sometimes_ we have a measure that has multiple interesting aggregations. - -Following this change, we should think of the _Metric type_ as: - -1. Indicating something about what kind of numbers are being recorded (i.e., the input domain, e.g., restricted to values >= 0?) - 1. For Gauges: Something pre-computed where rate or count is not relevant - 1. For Cumulatives: Something where rate or count is relevant - 1. 
For Measures: Something where individual values are relevant -1. Indicating something about the default interpretation, based on the action verb (Set, Inc, Record, etc.) - 1. For Gauges: the action is Set() - 1. For Cumulatives: the action is Inc() - 1. For Measures: the action is Record() -1. Unless the programmer declares otherwise, suggesting a default aggregation - 1. For Gauges: LAST_VALUE is interesting, SUM is likely not interesting - 1. For Cumulatives: SUM is interesting, LAST_VALUE is likely not interesting - 1. For Measures: all aggregations apply, default is MIN, MAX, SUM, COUNT. - -## Internal details - -Metric constructors should take an optional list of aggregations, to override the default behavior. When constructed with an explicit list of aggregations, the implementation may use this as a hint about which aggregations should be exported by default. However, the implementation is not bound by these recommendations in any way and is free to control which aggregations that are applied. - -The standard defined aggregations are broken into two groups, those which are "decomposable" (i.e., inexpensive) and those which are not. - -The decomposable aggregations are simple to define: - -1. SUM: The sum of observed values. -1. COUNT: The number of observations. -1. MIN: The smallest value. -1. MAX: The largest value. -1. LAST_VALUE: The latest value. - -The non-decomposable aggregations do not have standard definitions, they are purely advisory. The intention behind these are: - -1. HISTOGRAM: The intended output is a distribution summary, specifically summarizing counts into non-overlapping ranges. -1. SUMMARY: This is a more generic way to request information about a distribution, perhaps represented in some vendor-specific way / not a histogram. 
- -## Example - -To declare a MeasureMetric, - -``` - myMetric := metric.NewMeasureMetric( - "ex.com/mymetric", - metric.WithAggregations(metric.SUM, metric.COUNT), - metric.WithLabelKeys(aKey, bKey)) -) -``` - -Here, we have declared a Measure-type metric with recommended SUM and COUNT aggregations (allowing to compute the average) with `aKey` and `bKey` as recommended aggregation dimensions. While the SDK has full control over which aggregations are actually performed, the programmer has specified a good default behavior for the implementation to use. - -## Trade-offs and mitigations - -This avoids requiring programmers to use the `view` API, which is an SDK API, not a user-facing instrumentation API. Letting the application programmer recommend aggregations directly gives the implementation more information about the raw statistics. Letting programmers declare their intent has few downsides, since there is a well-defined default behavior. - -## Prior art and alternatives - -Existing systems generally declare separate Metric types according to the desired aggregation. Raw statistics were invented to overcome this, and the present proposal brings back the ability to specify an Aggregation at the point where a Metric is defined. - -## Open questions - -There are questions about the value of the MIN and MAX aggregations. While they are simple to compute, they are difficult to use in practice. - -There are questions about the interpretation of HISTOGRAM and SUMMARY. The point of Raw statistics was that we shouldn't specify these aggregations because they are expensive and many implementations are possible. This is still true. What is the value in specifying HISTOGRAM as opposed to SUMMARY? How is SUMMARY different from MIN/MAX/COUNT/SUM, does it imply implementation-defined quantiles? 
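The "decomposable" aggregations named in the removed 0004 text (SUM, COUNT, MIN, MAX) are the same four values that 0003 specifies as the low-cost default export for measure metrics. The following is a minimal sketch of why they are inexpensive, assuming a hypothetical `Summary` type with `Record` and `Merge` methods; none of these names come from the OpenTelemetry API or SDK, they exist only for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// Summary is a hypothetical accumulator for the decomposable aggregations
// SUM, COUNT, MIN, and MAX. It is an illustration, not an OpenTelemetry type.
type Summary struct {
	Sum   float64
	Count int64
	Min   float64
	Max   float64
}

// NewSummary returns an empty summary. Min and Max start at the identity
// values for min and max, so the first Record call sets both.
func NewSummary() Summary {
	return Summary{Min: math.Inf(1), Max: math.Inf(-1)}
}

// Record folds one measurement into the summary in constant time and space.
func (s *Summary) Record(v float64) {
	s.Sum += v
	s.Count++
	s.Min = math.Min(s.Min, v)
	s.Max = math.Max(s.Max, v)
}

// Merge combines another summary into this one without access to the
// original measurements, which is the property "decomposable" refers to.
func (s *Summary) Merge(o Summary) {
	s.Sum += o.Sum
	s.Count += o.Count
	s.Min = math.Min(s.Min, o.Min)
	s.Max = math.Max(s.Max, o.Max)
}

func main() {
	a, b := NewSummary(), NewSummary()
	for _, v := range []float64{3, 1, 4} {
		a.Record(v)
	}
	for _, v := range []float64{1, 5} {
		b.Record(v)
	}
	a.Merge(b)
	fmt.Printf("sum=%v count=%d min=%v max=%v\n", a.Sum, a.Count, a.Min, a.Max)
}
```

Quantiles and histograms do not share this property in general: merging two partial results needs either the raw values or an approximation scheme, which is why the text treats HISTOGRAM and SUMMARY as advisory and leaves their cost to the implementation.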
From 7b08d60a4ecfcba5493f67979b9f6741c70ea277 Mon Sep 17 00:00:00 2001 From: jmacd Date: Tue, 3 Sep 2019 11:19:23 -0700 Subject: [PATCH 10/24] Add an open question from python PR87 --- text/0003-measure-metric-type.md | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index a85fb8317..260e15037 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -16,7 +16,7 @@ A working group convened on 8/21/2019 to discuss and debate the two metrics RFCs # Overview -Introduce a `Measure` kind of metric object that supports a `Record` API method. Like existing `Gauge` and `Cumulative` metrics, the new `Measure` metric supports pre-defined labels. A new measurement batch API is introduced for recording multiple metric observations simultaneously. +Introduce a `Measure` kind of metric object that supports a `Record` API method. Like existing `Gauge` and `Cumulative` metrics, the new `Measure` metric supports pre-defined labels. A new `RecordBatch` measurement API is introduced for recording multiple metric observations simultaneously. ## Terminology @@ -142,6 +142,13 @@ Statsd libraries generally report metric events individually. To implement stat ## Open questions +### `GetHandle` argument ordering Argument ordering has been proposed as the way to pass pre-defined label values in `GetHandle`. The argument list must match the parameter list exactly, and if it doesn't we generally find out at runtime or not at all. This model has more optimization potential, but is easier to misuse than the alternative. The alternative approach is to always pass label:value pairs to `GetOrCreateTimeseries`, as opposed to an ordered list of values. -The same discussion can be had for the `MeasurementBatch` type described here. It can be declared with an ordered list of metrics, then the `Record` API takes only an ordered list of numbers. 
Alternatively, and less prone to misuse, the `MeasurementBatch.Record` API could be declared with a list of metric:number pairs. +### `RecordBatch` argument ordering + +The discussion above can be had for the proposed `RecordBatch` method. It can be declared with an ordered list of metrics, then the `Record` API takes only an ordered list of numbers. Alternatively, and less prone to misuse, the `MeasurementBatch.Record` API could be declared with a list of metric:number pairs. + +### Eliminate `GetDefaultHandle()` + +Instead of a mechanism to obtain a default handle, some languages may prefer to simply operate on the metric instrument directly in this case. Should OpenTelemetry eliminate `GetDefaultHandle` and instead specify that cumulative, gauge, and measure metric instruments implement `Add()`, `Set()`, and `Record()` with the same interpretation? \ No newline at end of file From b93d931c49e6a15e427a5b8d40eaeb8c7e4c65fa Mon Sep 17 00:00:00 2001 From: jmacd Date: Tue, 3 Sep 2019 12:22:27 -0700 Subject: [PATCH 11/24] Add an open question about RecordBatch --- text/0003-measure-metric-type.md | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 260e15037..9ad361ad5 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -151,4 +151,16 @@ The discussion above can be had for the proposed `RecordBatch` method. It can b ### Eliminate `GetDefaultHandle()` -Instead of a mechanism to obtain a default handle, some languages may prefer to simply operate on the metric instrument directly in this case. Should OpenTelemetry eliminate `GetDefaultHandle` and instead specify that cumulative, gauge, and measure metric instruments implement `Add()`, `Set()`, and `Record()` with the same interpretation? 
\ No newline at end of file +Instead of a mechanism to obtain a default handle, some languages may prefer to simply operate on the metric instrument directly in this case. Should OpenTelemetry eliminate `GetDefaultHandle` and instead specify that cumulative, gauge, and measure metric instruments implement `Add()`, `Set()`, and `Record()` with the same interpretation? + +### `RecordBatch` support for all metrics + +In the 8/21 working session, we agreed to limit `RecordBatch` to recording of simultaneous measure metrics, meaning to exclude cumulatives and gauges from batch recording. There are arguments in favor of supporting batch recording for all metric instruments. + +- If atomicity (i.e., the all-or-none property) is the reason for batch reporting, it makes sense to include all the metric instruments in the API +- `RecordBatch` support for cumulatives and gauges will be natural for SDKs that act as forwarders for metric events . The natural implementation for `Add()` and `Set()` methods will be `RecordBatch` with a single event. +- Likewise, it is simple for an SDK that acts as an aggregator (vs. forwarder) to redirect `Add()` and `Set()` APIs to the handle-specific `Add()` and `Set()` methods; while the SDK, as the implementation, can ensure these cumulative and gauge updates are atomic with the measure updates. + +Arguments against batch recording for all metric instruments: + +- The `Record` in `RecordBatch` suggests it is to be applied to measure metrics. This is due to measure metrics being the most general-purpose of metric instruments. 
\ No newline at end of file From a8b7baf3d8a0514a70b770543e010e4dff6260d7 Mon Sep 17 00:00:00 2001 From: jmacd Date: Tue, 3 Sep 2019 12:52:48 -0700 Subject: [PATCH 12/24] Clarify the open questions --- text/0003-measure-metric-type.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 9ad361ad5..53e2befbb 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -153,6 +153,10 @@ The discussion above can be had for the proposed `RecordBatch` method. It can b Instead of a mechanism to obtain a default handle, some languages may prefer to simply operate on the metric instrument directly in this case. Should OpenTelemetry eliminate `GetDefaultHandle` and instead specify that cumulative, gauge, and measure metric instruments implement `Add()`, `Set()`, and `Record()` with the same interpretation? +The argument against this is that metric instruments are meant to be pure API objects; they are not constructed through an SDK. Therefore, the default Meter (SDK) will have to be located from the context, meaning there is a question about whether this is as efficient as storing a re-usable handle for the default case. For metric instruments with no required keys, this will be a real question: what is the benefit of a handle if it specifies no information other than the SDK? + +If we eliminate `GetDefaultHandle()`, the SDK may keep a map of metric instrument to default handle on its own. + ### `RecordBatch` support for all metrics In the 8/21 working session, we agreed to limit `RecordBatch` to recording of simultaneous measure metrics, meaning to exclude cumulatives and gauges from batch recording. There are arguments in favor of supporting batch recording for all metric instruments.
@@ -163,4 +167,4 @@ In the 8/21 working session, we agreed to limit `RecordBatch` to recording of si Arguments against batch recording for all metric instruments: -- The `Record` in `RecordBatch` suggests it is to be applied to measure metrics. This is due to measure metrics being the most general-purpose of metric instruments. \ No newline at end of file +- The `Record` in `RecordBatch` suggests it is to be applied to measure metrics. This is due to measure metrics being the most general-purpose of metric instruments. From f5e5c39a1119a5d6ac0923723368c33e78e9a4a7 Mon Sep 17 00:00:00 2001 From: jmacd Date: Tue, 3 Sep 2019 22:43:56 -0700 Subject: [PATCH 13/24] Name NonNegative and NonDescending options --- text/0003-measure-metric-type.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 53e2befbb..6cdff78e2 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -91,7 +91,7 @@ The specification will be updated with the following guidance. Likely to be the most common kind of metric, cumulative metric events express the computation of a sum. Choose this kind of metric when the value is a quantity, the sum is of primary interest, and the event count and distribution are not of primary interest. To raise (or lower) a cumulative metric, call the `Add()` method. -If the quantity in question is always non-negative, it implies that the sum is strictly ascending. When this is the case, the cumulative metric also serves to define a rate. For this reason, cumulative metrics have an option to be declared as non-negative. The API will reject negative updates to non-negative cumulative metrics, instead submitting an SDK error event, which helps ensure meaningful rate calculations. +If the quantity in question is always non-negative, it implies that the sum is strictly ascending. 
When this is the case, the cumulative metric also serves to define a rate. For this reason, cumulative metrics have a `NonNegative` option to be declared as non-negative. The API will reject negative updates to non-negative cumulative metrics, instead submitting an SDK error event, which helps ensure meaningful rate calculations. For cumulative metrics, the default OpenTelemetry implementation exports the sum of event values taken over an interval of time. @@ -101,7 +101,7 @@ Gauge metrics express a pre-calculated value that is either `Set()` by explicit Only the gauge kind of metric supports observing the metric via a gauge `Observer` callback (as an option). Semantically, there is an important difference between explicitly setting a gauge and observing it through a callback. In case of setting the gauge explicitly, the call happens inside of an implicit or explicit context. The implementation is free to associate the explicit `Set()` event with a context, for example. When observing gauge metrics via a callback, there is no context associated with the event. -As a special case, to support existing metrics infrastructure, a gauge metric may be declared as a precomputed sum, in which case it is defined as strictly ascending. The API will reject descending updates to strictly-ascending gauges, instead submitting an SDK error event. +As a special case, to support existing metrics infrastructure, a gauge metric may be declared as a precomputed cumulative sum using the `NonDescending` option, in which case it is defined as a strictly ascending. The API will reject descending updates to non-descending gauges, instead submitting an SDK error event. For gauge metrics, the default OpenTelemetry implementation exports the last value that was explicitly `Set()`, or if using a callback, the current value from the Observer. @@ -114,7 +114,7 @@ Measure metrics express a distribution of values. 
This kind of metric should be The key property of a measure metric event is that computing quantiles and/or summarizing a distribution (e.g., via a histogram) may be expensive. Not only will implementations have various capabilities and algorithms for this task, users may wish to control the quality and cost of aggregating measure metrics. -Like cumulative metrics, non-negative measures are an important case because they support rate calculations. As an option, measure metrics may be declared as non-negative. The API will reject negative metric events for non-negative measures, instead submitting an SDK error event. +Like cumulative metrics, non-negative measures are an important case because they support rate calculations. As an option, measure metrics may be declared as `NonNegative`. The API will reject negative metric events for non-negative measures, instead submitting an SDK error event. Because measure metrics have such wide application, implementations are likely to provide configurable behavior. OpenTelemetry may provide such a facility in its standard SDK, but in case no configuration is provided by the application, a low-cost policy is specified as the default behavior, whic is to export the sum, the count (rate), the minimum value, and the maximum value. @@ -163,7 +163,7 @@ In the 8/21 working session, we agreed to limit `RecordBatch` to recording of si - If atomicity (i.e., the all-or-none property) is the reason for batch reporting, it makes sense to include all the metric instruments in the API - `RecordBatch` support for cumulatives and gauges will be natural for SDKs that act as forwarders for metric events . The natural implementation for `Add()` and `Set()` methods will be `RecordBatch` with a single event. -- Likewise, it is simple for an SDK that acts as an aggregator (vs. 
forwarder) to redirect `Add()` and `Set()` APIs to the handle-specific `Add()` and `Set()` methods; while the SDK, as the implementation, can ensure these cumulative and gauge updates are atomic with the measure updates. +- Likewise, it is simple for an SDK that acts as an aggregator (not a forwarder) to redirect `Add()` and `Set()` APIs to the handle-specific `Add()` and `Set()` methods; while the SDK, as the implementation, still may (not must) treat these cumulative and gauge updates as atomic. Arguments against batch recording for all metric instruments: From d97868fa226a8046b6c384a150d4f351147a3a1f Mon Sep 17 00:00:00 2001 From: jmacd Date: Wed, 4 Sep 2019 10:51:55 -0700 Subject: [PATCH 14/24] Clarify the Measurement unit for RecordBatch --- text/0003-measure-metric-type.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 6cdff78e2..2c11a380f 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -124,13 +124,12 @@ All OpenTelemetry metrics may be disabled by default, as an option. Use this op ### RecordBatch API -Applications sometimes want to record multiple metrics in a single API call, either becase the values are inter-related or because it lowers overhead. We agree that recording batch measurements will be restricted to measure metrics, although this support could be extended to all kinds of metric in the future. +Applications sometimes want to act upon multiple metric handles in a single API call, either because the values are inter-related to each other, or because it lowers overhead. We agree that recording batch measurements will be restricted to measure metrics, although this support could be extended to all kinds of metric in the future. 
-Logically, a measurement is defined as: +A single measurement is defined as: -- Measure metric: which metric is being updated -- Value: a floating point or integer -- Pre-defined label values: associated via metrics API handle +- Handle: the measure instrument and pre-defined label values +- Value: the recorded floating point or integer data The batch measurement API uses a language-specific method name (e.g., `RecordBatch`). The entire batch of measurements takes place within some (implicit or explicit) context. From 54a778f4310f07e1b28bf856724819a8166f67e7 Mon Sep 17 00:00:00 2001 From: jmacd Date: Wed, 4 Sep 2019 14:46:22 -0700 Subject: [PATCH 15/24] Add issues addressed --- text/0003-measure-metric-type.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 2c11a380f..1066800e0 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -167,3 +167,14 @@ In the 8/21 working session, we agreed to limit `RecordBatch` to recording of si Arguments against batch recording for all metric instruments: - The `Record` in `RecordBatch` suggests it is to be applied to measure metrics. This is due to measure metrics being the most general-purpose of metric instruments. + +### Metric "attachments" support + +OpenCensus has the notion of a metric attachment, allowing the application to include additional information associated with the event, for sampling purposes. The position taken here is that additional label values in the metric handle (specified in 0000-metric-handles.md) or the context are a suitable replacement. 
+ +## Issues addressed + +https://github.com/open-telemetry/opentelemetry-specification/issues/83 +https://github.com/open-telemetry/opentelemetry-specification/issues/144 +https://github.com/open-telemetry/opentelemetry-specification/issues/145 +https://github.com/open-telemetry/opentelemetry-specification/issues/146 From fd1c02777ad622d1e902735a4f6bec8581ac9009 Mon Sep 17 00:00:00 2001 From: jmacd Date: Wed, 4 Sep 2019 14:49:15 -0700 Subject: [PATCH 16/24] Linkify --- text/0003-measure-metric-type.md | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 1066800e0..e1884200e 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -174,7 +174,10 @@ OpenCensus has the notion of a metric attachment, allowing the application to in ## Issues addressed -https://github.com/open-telemetry/opentelemetry-specification/issues/83 -https://github.com/open-telemetry/opentelemetry-specification/issues/144 -https://github.com/open-telemetry/opentelemetry-specification/issues/145 -https://github.com/open-telemetry/opentelemetry-specification/issues/146 +(Raw vs. 
other metrics / measurements are unclear)[https://github.com/open-telemetry/opentelemetry-specification/issues/83] + +(`record` should take a generic `Attachment` class instead of having tracing dependency)[https://github.com/open-telemetry/opentelemetry-specification/issues/144] + +(Eliminate Measurement class to save on allocations)[https://github.com/open-telemetry/opentelemetry-specification/issues/145] + +(Implement three more types of Metric)[https://github.com/open-telemetry/opentelemetry-specification/issues/146] From deb6facfbb4426fba35417e0a58c623c522b71c9 Mon Sep 17 00:00:00 2001 From: jmacd Date: Wed, 4 Sep 2019 14:50:36 -0700 Subject: [PATCH 17/24] Linkify --- text/0003-measure-metric-type.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index e1884200e..0efcd7e4d 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -174,10 +174,10 @@ OpenCensus has the notion of a metric attachment, allowing the application to in ## Issues addressed -(Raw vs. other metrics / measurements are unclear)[https://github.com/open-telemetry/opentelemetry-specification/issues/83] +[Raw vs. 
other metrics / measurements are unclear])https://github.com/open-telemetry/opentelemetry-specification/issues/83) -(`record` should take a generic `Attachment` class instead of having tracing dependency)[https://github.com/open-telemetry/opentelemetry-specification/issues/144] +[`record` should take a generic `Attachment` class instead of having tracing dependency](https://github.com/open-telemetry/opentelemetry-specification/issues/144) -(Eliminate Measurement class to save on allocations)[https://github.com/open-telemetry/opentelemetry-specification/issues/145] +[Eliminate Measurement class to save on allocations](https://github.com/open-telemetry/opentelemetry-specification/issues/145) -(Implement three more types of Metric)[https://github.com/open-telemetry/opentelemetry-specification/issues/146] +[Implement three more types of Metric](https://github.com/open-telemetry/opentelemetry-specification/issues/146) From a1e526d08a14f1d45055e615e69f9a59fa99e371 Mon Sep 17 00:00:00 2001 From: jmacd Date: Wed, 4 Sep 2019 15:02:18 -0700 Subject: [PATCH 18/24] Format --- text/0000-metric-handles.md | 15 ++++++++++----- text/0003-measure-metric-type.md | 16 +++------------- 2 files changed, 13 insertions(+), 18 deletions(-) diff --git a/text/0000-metric-handles.md b/text/0000-metric-handles.md index 5e57f79ba..684112cb2 100644 --- a/text/0000-metric-handles.md +++ b/text/0000-metric-handles.md @@ -46,7 +46,7 @@ and three handles: measureHandle := measure.GetHandleOrdered(1, 2, 3) // values for a, b, c ``` -This can be potential improved as by making the label set a first-class concept. This has the potential to further reduce the cost of getting a group of handles with the same set of labels: +This can be potentially improved as by making the label map a first-class concept. 
This has the potential to further reduce the cost of getting a group of handles with the same map of labels: ``` var commonKeys = metric.DefineKeys("a", "b", "c") @@ -54,12 +54,17 @@ This can be potential improved as by making the label set a first-class concept. var counter = metric.NewFloat64Cumulative("example.com/counter", metric.WithKeys(commonKeys)) var measure = metric.NewFloat64Measure("example.com/measure", metric.WithKeys(commonKeys)) - labelSet := commonKeys.Values(1, 2, 3) // values for a, b, c - gaugeHandle := gauge.GetHandleOrdered(labelSet) - counterHandle := counter.GetHandleOrdered(labelSet) - measureHandle := measure.GetHandleOrdered(labelSet) + labelMap := commonKeys.Values(1, 2, 3) // values for a, b, c + gaugeHandle := gauge.GetHandle(labelMap) + counterHandle := counter.GetHandle(labelMap) + measureHandle := measure.GetHandle(labelMap) ``` ## Open questions Should the additional scope concept shown above be implemented? + +### Metric `Attachment` support + +OpenCensus has the notion of a metric attachment, allowing the application to include additional information associated with the event, for sampling purposes. The position taken here is that additional label values on the metric handle (specified here) or the context are a suitable replacement. + diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 0efcd7e4d..4ca921bb8 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -4,14 +4,6 @@ # Foreward -This propsal was originally split into three semi-related parts. Based on the feedback, they are now combined here into a single proposal. The original proposals were: - - 000x-metric-pre-defined-labels - 000x-metric-measure - 000x-eliminate-stats-record - -### Updated 8/27/2019 - A working group convened on 8/21/2019 to discuss and debate the two metrics RFCs (0003 and 0004) and several surrounding concerns. 
This document has been revised with related updates that were agreed upon during this working session. See the [meeting notes](https://docs.google.com/document/d/1d0afxe3J6bQT-I6UbRXeIYNcTIyBQv4axfjKF4yvAPA/edit#). # Overview @@ -61,6 +53,8 @@ There are three kinds of metric instrument, `CumulativeMetric`, `GaugeMetric`, a Metric instruments are constructed by the API, they are not constructed by any specific SDK. +| Field | Description | +|------|-----------| | Name | A string. | | Kind | One of Cumulative, Gauge, or Measure. | | Required Keys | List of always-defined keys in handles for this metric. | @@ -168,13 +162,9 @@ Arguments against batch recording for all metric instruments: - The `Record` in `RecordBatch` suggests it is to be applied to measure metrics. This is due to measure metrics being the most general-purpose of metric instruments. -### Metric "attachments" support - -OpenCensus has the notion of a metric attachment, allowing the application to include additional information associated with the event, for sampling purposes. The position taken here is that additional label values in the metric handle (specified in 0000-metric-handles.md) or the context are a suitable replacement. - ## Issues addressed -[Raw vs. other metrics / measurements are unclear])https://github.com/open-telemetry/opentelemetry-specification/issues/83) +[Raw vs. 
other metrics / measurements are unclear](https://github.com/open-telemetry/opentelemetry-specification/issues/83) [`record` should take a generic `Attachment` class instead of having tracing dependency](https://github.com/open-telemetry/opentelemetry-specification/issues/144) From 9aee892fce3360bbec920d7565727ae20d2650a1 Mon Sep 17 00:00:00 2001 From: jmacd Date: Wed, 4 Sep 2019 16:11:18 -0700 Subject: [PATCH 19/24] Address option names and default settings --- text/0000-metric-handles.md | 6 ++++++ text/0003-measure-metric-type.md | 23 ++++++++++++++++------- 2 files changed, 22 insertions(+), 7 deletions(-) diff --git a/text/0000-metric-handles.md b/text/0000-metric-handles.md index 684112cb2..d64ee5b0e 100644 --- a/text/0000-metric-handles.md +++ b/text/0000-metric-handles.md @@ -68,3 +68,9 @@ Should the additional scope concept shown above be implemented? OpenCensus has the notion of a metric attachment, allowing the application to include additional information associated with the event, for sampling purposes. The position taken here is that additional label values on the metric handle (specified here) or the context are a suitable replacement. +## Issues addressed + +[Agreements reached on handles and naming in the working group convened on 8/21/2019](https://docs.google.com/document/d/1d0afxe3J6bQT-I6UbRXeIYNcTIyBQv4axfjKF4yvAPA/edit#). + +[`record` should take a generic `Attachment` class instead of having tracing dependency](https://github.com/open-telemetry/opentelemetry-specification/issues/144) + diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 4ca921bb8..2a18f690e 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -63,7 +63,7 @@ Metric instruments are constructed by the API, they are not constructed by any s See the specification for more information on these fields, including formatting and uniqueness requirements. 
To define a new metric, use one of the language-specific API methods (e.g., with names like `NewCumulativeMetric`, `NewGaugeMetric`, or `NewMeasureMetric`). -Metric instrument Handles are SDK-provided objects that combine a metric instrument with a set of pre-defined labels. Handles are obtained by calling a language-specific API method (e.g., `GetHandle`) on the metric instrument with its label values. Handles may be used to `Set()`, `Add()`, or `Record()` metrics according to their kind. The `Set()`, `Add()`, and `Record()` +Metric instrument Handles are SDK-provided objects that combine a metric instrument with a set of pre-defined labels. Handles are obtained by calling a language-specific API method (e.g., `GetHandle`) on the metric instrument with certain label values. Handles may be used to `Set()`, `Add()`, or `Record()` metrics according to their kind. ## Selecting Metric Kind @@ -85,7 +85,7 @@ The specification will be updated with the following guidance. Likely to be the most common kind of metric, cumulative metric events express the computation of a sum. Choose this kind of metric when the value is a quantity, the sum is of primary interest, and the event count and distribution are not of primary interest. To raise (or lower) a cumulative metric, call the `Add()` method. -If the quantity in question is always non-negative, it implies that the sum is strictly ascending. When this is the case, the cumulative metric also serves to define a rate. For this reason, cumulative metrics have a `NonNegative` option to be declared as non-negative. The API will reject negative updates to non-negative cumulative metrics, instead submitting an SDK error event, which helps ensure meaningful rate calculations. +If the quantity in question is always non-negative, it implies that the sum never descends. This is the common case, where cumulative metrics only go up, and these _unidirectional_ cumulative metric instruments serve to compute a rate. 
For this reason, cumulative metrics have a `Bidirectional` option to be declared as allowing negative inputs, the uncommon case. The API will reject negative inputs to (default) unidirectional cumulative metrics, instead submitting an SDK error event, which helps ensure meaningful rate calculations. For cumulative metrics, the default OpenTelemetry implementation exports the sum of event values taken over an interval of time. @@ -93,11 +93,11 @@ For cumulative metrics, the default OpenTelemetry implementation exports the sum Gauge metrics express a pre-calculated value that is either `Set()` by explicit instrumentation or observed through a callback. Generally, this kind of metric should be used when the metric cannot be expressed as a sum or a rate because the measurement interval is arbitrary. Use this kind of metric when the measurement is not a quantity, and the sum and event count are not of interest. -Only the gauge kind of metric supports observing the metric via a gauge `Observer` callback (as an option). Semantically, there is an important difference between explicitly setting a gauge and observing it through a callback. In case of setting the gauge explicitly, the call happens inside of an implicit or explicit context. The implementation is free to associate the explicit `Set()` event with a context, for example. When observing gauge metrics via a callback, there is no context associated with the event. +Only the gauge kind of metric supports observing the metric via a gauge `Observer` callback (as an option, see `0000-metric-observer.md`). Semantically, there is an important difference between explicitly setting a gauge and observing it through a callback. In case of setting the gauge explicitly, the call happens inside of an implicit or explicit context. The implementation is free to associate the explicit `Set()` event with a context, for example. When observing gauge metrics via a callback, there is no context associated with the event. 
-As a special case, to support existing metrics infrastructure, a gauge metric may be declared as a precomputed cumulative sum using the `NonDescending` option, in which case it is defined as a strictly ascending. The API will reject descending updates to non-descending gauges, instead submitting an SDK error event. +As a special case, to support existing metrics infrastructure and the `Observer` pattern, a gauge metric may be declared as a precomputed, unidirectional sum using the `Unidirectional` option, in which case it is may be used to define a rate. The initial value is presumed to be zero. The API will reject descending updates to non-descending gauges, instead submitting an SDK error event. -For gauge metrics, the default OpenTelemetry implementation exports the last value that was explicitly `Set()`, or if using a callback, the current value from the Observer. +For gauge metrics, the default OpenTelemetry implementation exports the last value that was explicitly `Set()`, or if using a callback, the current value from the `Observer`. ### Measure metric @@ -116,6 +116,17 @@ Because measure metrics have such wide application, implementations are likely t All OpenTelemetry metrics may be disabled by default, as an option. Use this option to indicate that the default implementation should be to do nothing for events about this metric. 
+### Option summary + +The optional properties of a metric instrument are: + +| Property | Description | Metric kind | +|----------|-------------|-------------| +| Required Keys | Determines labels that are always set on metric handles | All kinds | +| Bidirectional | Indicates a cumulative metric instrument that goes up and down | Cumulative | +| Unidirectional | Indicate a gauge that only ascends, for rate calculation | Gauge | +| NonNegative | Indicates a measure that is never negative, for rate calculation | Measure | + ### RecordBatch API Applications sometimes want to act upon multiple metric handles in a single API call, either because the values are inter-related to each other, or because it lowers overhead. We agree that recording batch measurements will be restricted to measure metrics, although this support could be extended to all kinds of metric in the future. @@ -166,8 +177,6 @@ Arguments against batch recording for all metric instruments: [Raw vs. other metrics / measurements are unclear](https://github.com/open-telemetry/opentelemetry-specification/issues/83) -[`record` should take a generic `Attachment` class instead of having tracing dependency](https://github.com/open-telemetry/opentelemetry-specification/issues/144) - [Eliminate Measurement class to save on allocations](https://github.com/open-telemetry/opentelemetry-specification/issues/145) [Implement three more types of Metric](https://github.com/open-telemetry/opentelemetry-specification/issues/146) From 7ae4bd8e2798c6a69fa2f61a04bc05be63c388f3 Mon Sep 17 00:00:00 2001 From: jmacd Date: Wed, 4 Sep 2019 16:25:33 -0700 Subject: [PATCH 20/24] Answer questions --- text/0003-measure-metric-type.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 2a18f690e..b0091456e 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -112,9 +112,9 @@ Like cumulative metrics, 
non-negative measures are an important case because the Because measure metrics have such wide application, implementations are likely to provide configurable behavior. OpenTelemetry may provide such a facility in its standard SDK, but in case no configuration is provided by the application, a low-cost policy is specified as the default behavior, whic is to export the sum, the count (rate), the minimum value, and the maximum value. -### Disable selected metrics by default +### Option to disable metrics by default -All OpenTelemetry metrics may be disabled by default, as an option. Use this option to indicate that the default implementation should be to do nothing for events about this metric. +Metric instruments are enabled by default, meaning that SDKs will export metric data for this instrument without configuration. Metric instruments support a `Disabled` option, marking them as verbose sources of information that may be configured on an as-needed basis to control cost (e.g., using a "views" API). 
### Option summary @@ -123,6 +123,7 @@ The optional properties of a metric instrument are: | Property | Description | Metric kind | |----------|-------------|-------------| | Required Keys | Determines labels that are always set on metric handles | All kinds | +| Disabled | Indicates a verbose metric that does not report by default | All kinds | | Bidirectional | Indicates a cumulative metric instrument that goes up and down | Cumulative | | Unidirectional | Indicate a gauge that only ascends, for rate calculation | Gauge | | NonNegative | Indicates a measure that is never negative, for rate calculation | Measure | From 18587f6d4a86b88ed93fb3f385c6c6c068dad0dd Mon Sep 17 00:00:00 2001 From: jmacd Date: Thu, 5 Sep 2019 07:56:01 -0700 Subject: [PATCH 21/24] Spelling --- text/0003-measure-metric-type.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index b0091456e..7e47e577a 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -2,7 +2,7 @@ **Status:** `proposed` -# Foreward +# Foreword A working group convened on 8/21/2019 to discuss and debate the two metrics RFCs (0003 and 0004) and several surrounding concerns. This document has been revised with related updates that were agreed upon during this working session. See the [meeting notes](https://docs.google.com/document/d/1d0afxe3J6bQT-I6UbRXeIYNcTIyBQv4axfjKF4yvAPA/edit#). @@ -110,7 +110,7 @@ The key property of a measure metric event is that computing quantiles and/or su Like cumulative metrics, non-negative measures are an important case because they support rate calculations. As an option, measure metrics may be declared as `NonNegative`. The API will reject negative metric events for non-negative measures, instead submitting an SDK error event. -Because measure metrics have such wide application, implementations are likely to provide configurable behavior. 
OpenTelemetry may provide such a facility in its standard SDK, but in case no configuration is provided by the application, a low-cost policy is specified as the default behavior, whic is to export the sum, the count (rate), the minimum value, and the maximum value. +Because measure metrics have such wide application, implementations are likely to provide configurable behavior. OpenTelemetry may provide such a facility in its standard SDK, but in case no configuration is provided by the application, a low-cost policy is specified as the default behavior, which is to export the sum, the count (rate), the minimum value, and the maximum value. ### Option to disable metrics by default From c3225a258ca3d6c487cf71ce81f49c1fa016f0eb Mon Sep 17 00:00:00 2001 From: jmacd Date: Tue, 10 Sep 2019 22:04:21 -0700 Subject: [PATCH 22/24] Use 0007 --- text/{0000-metric-handles.md => 0007-metric-handles.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-metric-handles.md => 0007-metric-handles.md} (100%) diff --git a/text/0000-metric-handles.md b/text/0007-metric-handles.md similarity index 100% rename from text/0000-metric-handles.md rename to text/0007-metric-handles.md From f1e5972a08efb3f2beb04f430d2f6b9bbbd2b147 Mon Sep 17 00:00:00 2001 From: jmacd Date: Tue, 10 Sep 2019 22:04:47 -0700 Subject: [PATCH 23/24] Refer to 0008 --- text/0003-measure-metric-type.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0003-measure-metric-type.md b/text/0003-measure-metric-type.md index 7e47e577a..6a17671f6 100644 --- a/text/0003-measure-metric-type.md +++ b/text/0003-measure-metric-type.md @@ -93,7 +93,7 @@ For cumulative metrics, the default OpenTelemetry implementation exports the sum Gauge metrics express a pre-calculated value that is either `Set()` by explicit instrumentation or observed through a callback. 
Generally, this kind of metric should be used when the metric cannot be expressed as a sum or a rate because the measurement interval is arbitrary. Use this kind of metric when the measurement is not a quantity, and the sum and event count are not of interest. -Only the gauge kind of metric supports observing the metric via a gauge `Observer` callback (as an option, see `0000-metric-observer.md`). Semantically, there is an important difference between explicitly setting a gauge and observing it through a callback. In case of setting the gauge explicitly, the call happens inside of an implicit or explicit context. The implementation is free to associate the explicit `Set()` event with a context, for example. When observing gauge metrics via a callback, there is no context associated with the event. +Only the gauge kind of metric supports observing the metric via a gauge `Observer` callback (as an option, see `0008-metric-observer.md`). Semantically, there is an important difference between explicitly setting a gauge and observing it through a callback. In case of setting the gauge explicitly, the call happens inside of an implicit or explicit context. The implementation is free to associate the explicit `Set()` event with a context, for example. When observing gauge metrics via a callback, there is no context associated with the event. As a special case, to support existing metrics infrastructure and the `Observer` pattern, a gauge metric may be declared as a precomputed, unidirectional sum using the `Unidirectional` option, in which case it is may be used to define a rate. The initial value is presumed to be zero. The API will reject descending updates to non-descending gauges, instead submitting an SDK error event. 
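The `Unidirectional` gauge behavior described in the context lines above (initial value presumed zero, descending updates rejected in favor of an SDK error event) might look roughly like the following Go sketch. The `Gauge` type, its fields, and the use of a string slice to stand in for SDK error events are assumptions made for illustration; the RFC does not define these names.

```go
package main

import "fmt"

// Gauge is a hypothetical, simplified gauge handle.
type Gauge struct {
	Name           string
	Unidirectional bool     // mirrors the Unidirectional option named in this RFC
	last           float64  // last accepted value; the zero value is the presumed start
	Errors         []string // stands in for SDK error events (an assumption)
}

// Set stores v as the last value. For a unidirectional gauge, a descending
// update is rejected and reported instead of being stored.
func (g *Gauge) Set(v float64) {
	if g.Unidirectional && v < g.last {
		g.Errors = append(g.Errors,
			fmt.Sprintf("%s: rejected descending update %v < %v", g.Name, v, g.last))
		return
	}
	g.last = v
}

// Last returns the value a last-value exporter would report.
func (g *Gauge) Last() float64 { return g.last }

func main() {
	sent := &Gauge{Name: "example.com/bytes_sent", Unidirectional: true}
	sent.Set(10)
	sent.Set(25)
	sent.Set(7) // descending, so it is rejected and recorded as an error event
	fmt.Println(sent.Last(), len(sent.Errors))
}
```

Because the last accepted value never decreases, the difference between two exported values over an interval can be read as a rate, which is the motivation the RFC gives for the option.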
From ec5388531273c8ed87e7d129f7745e1da89d98eb Mon Sep 17 00:00:00 2001
From: jmacd
Date: Tue, 10 Sep 2019 22:12:35 -0700
Subject: [PATCH 24/24] Take suggestion

---
 text/0007-metric-handles.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/text/0007-metric-handles.md b/text/0007-metric-handles.md
index d64ee5b0e..a584b7cd6 100644
--- a/text/0007-metric-handles.md
+++ b/text/0007-metric-handles.md
@@ -10,7 +10,9 @@ The specification currently names this concept `TimeSeries`, the object returned

 ## Explanation

-The `TimeSeries` is renamed to `Handle` as the former name suggests an implementation, not an API concept. `Handle`, we feel, is more descriptive of the intended use. Likewise with `GetOrCreateTimeSeries` to `GetHandle` and `GetDefaultTimeSeries` to `GetDefaultHandle`, these names suggest an implementation and not the intended use. Applications are encouraged to re-use metric handles for efficiency.
+The `TimeSeries` is renamed to `Handle` as the former name suggests an implementation, not an API concept. `Handle`, we feel, is more descriptive of the intended use. Likewise with `GetOrCreateTimeSeries` to `GetHandle` and `GetDefaultTimeSeries` to `GetDefaultHandle`, these names suggest an implementation and not the intended use.
+
+Applications are encouraged to re-use metric handles for efficiency.

 Handles are useful to reduce the cost of repeatedly recording a metric instrument (cumulative, gauge, or measure) with a pre-defined set of label values. All metric kinds support declaring a set of required label keys. These label keys, by definition, must be specified in every metric `Handle`. We permit "unspecified" label values in cases where a handle is requested but a value was not provided. The default metric handle has all its required keys unspecified. We presume that fast pre-aggregation of metrics data is only possible, in general, when the pre-aggregation keys are a subset of the required keys on the metric.
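The rule in the final context paragraph, that every handle carries all required keys and that missing values are treated as "unspecified", can be sketched in Go as follows. This is a toy model under stated assumptions: the `Instrument` and `Handle` types, the map-based `GetHandle` signature, and the literal `"unspecified"` placeholder string are all hypothetical and are not taken from the RFC or from any OpenTelemetry SDK.

```go
package main

import "fmt"

// Unspecified marks a required key whose value was not provided.
// The spelling of this placeholder is an assumption.
const Unspecified = "unspecified"

// Instrument is a hypothetical metric instrument with required label keys.
type Instrument struct {
	Name         string
	RequiredKeys []string
}

// Handle binds an instrument to a resolved set of label values.
type Handle struct {
	Instrument *Instrument
	Labels     map[string]string
}

// GetHandle copies the provided labels and fills in every required key
// that the caller left out, so each handle specifies all required keys.
func (i *Instrument) GetHandle(labels map[string]string) *Handle {
	resolved := make(map[string]string, len(labels)+len(i.RequiredKeys))
	for k, v := range labels {
		resolved[k] = v
	}
	for _, k := range i.RequiredKeys {
		if _, ok := resolved[k]; !ok {
			resolved[k] = Unspecified
		}
	}
	return &Handle{Instrument: i, Labels: resolved}
}

// GetDefaultHandle returns the handle with all required keys unspecified.
func (i *Instrument) GetDefaultHandle() *Handle {
	return i.GetHandle(nil)
}

func main() {
	measure := &Instrument{
		Name:         "example.com/measure",
		RequiredKeys: []string{"a", "b", "c"},
	}

	partial := measure.GetHandle(map[string]string{"a": "1", "b": "2"})
	fmt.Println(partial.Labels)

	fmt.Println(measure.GetDefaultHandle().Labels)
}
```

An application would typically obtain such a handle once, for example in a constructor, and reuse it for every subsequent update, which is the efficiency argument the patch makes for handles.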