open-telemetry · bogdandrutu · Aug 13, 2019 · Jun 27, 2019 · Jun 27, 2019 · Jun 27, 2019
diff --git a/0001-metric-pre-defined-labels.md b/0001-metric-pre-defined-labels.md
@@ -0,0 +1,37 @@
+# Pre-defined label support for all metric operations
+
+Let all Metric objects (Cumulative, Gauge, ...) and Raw statistics support pre-defined label values.
+
+## Motivation
+
+In the current `Metric.GetOrCreateTimeSeries` API for Gauges and Cumulatives, the caller obtains a `TimeSeries` handle for repeatedly recording metrics with certain pre-defined label values set.  This is an important optimization, especially for exporting aggregated metrics.
+
+Pre-defined labels also improve usability when working with metrics in code. Application programs with long-lived objects and associated Metrics can compute pre-defined label values once (e.g., in a constructor), rather than once per call site.
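
A minimal sketch of the handle pattern described above, assuming a simplified in-memory model. The `Gauge`, `TimeSeries`, and `Server` types and their fields are hypothetical and are not taken from the API proposal; only the method name `GetOrCreateTimeSeries` comes from the text. The sketch is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"strings"
)

// Gauge is a hypothetical metric with declared label keys.
type Gauge struct {
	name   string
	keys   []string
	series map[string]*TimeSeries
}

// TimeSeries is a handle bound to one combination of label values.
type TimeSeries struct {
	labelValues []string
	value       float64
}

// NewGauge declares a gauge and its label keys (illustrative constructor).
func NewGauge(name string, keys ...string) *Gauge {
	return &Gauge{name: name, keys: keys, series: make(map[string]*TimeSeries)}
}

// GetOrCreateTimeSeries resolves the label values once and returns a handle
// that can be reused for every later recording.
func (g *Gauge) GetOrCreateTimeSeries(labelValues ...string) *TimeSeries {
	if len(labelValues) != len(g.keys) {
		panic(fmt.Sprintf("%s: want %d label values, got %d", g.name, len(g.keys), len(labelValues)))
	}
	id := strings.Join(labelValues, "\x00")
	ts, ok := g.series[id]
	if !ok {
		ts = &TimeSeries{labelValues: append([]string(nil), labelValues...)}
		g.series[id] = ts
	}
	return ts
}

// Set records the current value on the pre-labeled handle.
func (ts *TimeSeries) Set(v float64) { ts.value = v }

// Server is a long-lived object that computes its label values once,
// in its constructor, rather than at every call site.
type Server struct {
	queueDepth *TimeSeries
}

func NewServer(g *Gauge, region, shard string) *Server {
	return &Server{queueDepth: g.GetOrCreateTimeSeries(region, shard)}
}

// Enqueue records without repeating any label lookup.
func (s *Server) Enqueue(depth float64) { s.queueDepth.Set(depth) }

func main() {
	g := NewGauge("ex.com/queue_depth", "region", "shard")
	s := NewServer(g, "us-east", "7")
	s.Enqueue(3)
	s.Enqueue(5)
	fmt.Println("series:", len(g.series), "latest:", s.queueDepth.value)
}
```

The label lookup happens once in `NewServer`; each `Enqueue` call then touches only the handle.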
+
+The current API for recording Raw statistics does not support the same optimization or usability advantage.  This RFC proposes to add support for pre-defined labels on all metrics.
+
+## Explanation
+
+In the current proposal, Metrics are used for pre-aggregated metric types, whereas Raw statistics are used for uncommon and vendor-specific aggregations.  The optimization and the usability advantages gained with pre-defined labels should be extended to Raw statistics because they are equally important and equally applicable. This is a new requirement.
+
+For example, where the application wants to compute a histogram of some value (e.g., latency), there's good reason to pre-aggregate such information.  In this example, it allows an implementation to efficiently export the histogram of latencies "grouped" into individual results by label value(s).
+
+## Internal details
+
+This RFC is accompanied by RFC 0002-metric-measure which proposes to create a new Metric type to replace Raw statistics.  The metric type, named "Measure", would replace the existing concept and type named "Measure" in the metrics API.  The new MeasureMetric object would support a `Record` method to record measurements.
+
+## Trade-offs and mitigations
+
+This is a refactoring of the existing proposal that covers more use cases and arguably reduces API complexity.
+
+## Prior art and alternatives
+
+Prometheus supports the notion of vector metrics, which are those with declared dimensions.  The vector-metric API supports a variety of methods like `WithLabelValues` to associate labels with a metric handle, similar to `GetOrCreateTimeSeries` in the existing proposal.  As in this proposal, Prometheus supports vector metrics for all metric types.
+
+## Open questions
+
+This RFC is co-dependent on several others; it's an open question how to address this concern if the other RFCs are not accepted.
+
+## Future possibilities
+
+This change will potentially help clarify the relationship between Metric types and Aggregation types.  In a future RFC, we will propose that MeasureMetrics can be used to support arbitrary "advanced" aggregations including histograms and distribution summaries.
diff --git a/0002-metric-measure.md b/0002-metric-measure.md
@@ -0,0 +1,41 @@
+# Replace Raw statistics with Measure-type Metric
+
+Define a new Metric type named "Measure" to cover existing "Raw" statistics uses.
+
+## Motivation
+
+The primary motivation is that Raw statistics should support the optimization and usability improvements associated with pre-defined label values (0001-metric-pre-defined-labels).  By elevating non-Cumulative, non-Gauge statistics to the same conceptual level as Metrics in the API, we effectively make the type of a metric independent from whether it supports pre-defined labels.
+
+This also makes it possible to eliminate the low-level `stats.Record` interface from the API specification entirely (0003-eliminate-stats-record).
+
+## Explanation
+
+This proposal suggests we think about which aggregations apply to a metric independently from its type.  A MeasureMetric could be used to aggregate a Histogram, or a Summary, or _both_ of these aggregations simultaneously.  This proposal makes metric type independent of aggregation type, whereas there is a precedent for combining these types into one.
+
+The proposal here suggests that we think of the metric type in terms of the _action performed_ (i.e., which _verb_ applies?).  Gauges support the `Set` action. Cumulatives support an `Inc` action. Measures support a `Record` action.
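
A minimal sketch of the "one verb per metric type" idea, assuming simple in-memory handles. The type names, fields, and method signatures (for example, `Inc` taking a delta) are hypothetical; the text above only names the three actions.

```go
package main

import "fmt"

// GaugeHandle supports the Set action: the caller reports a current value.
type GaugeHandle struct{ last float64 }

func (g *GaugeHandle) Set(v float64) { g.last = v }

// CumulativeHandle supports the Inc action: the caller reports an increment.
type CumulativeHandle struct{ total float64 }

func (c *CumulativeHandle) Inc(delta float64) { c.total += delta }

// MeasureHandle supports the Record action: the caller reports individual
// observations. Which aggregation is applied is left to the implementation;
// this sketch just keeps the raw values.
type MeasureHandle struct{ observations []float64 }

func (m *MeasureHandle) Record(v float64) {
	m.observations = append(m.observations, v)
}

func main() {
	var queueDepth GaugeHandle
	var bytesSent CumulativeHandle
	var latencyMs MeasureHandle

	queueDepth.Set(4)
	queueDepth.Set(2)
	bytesSent.Inc(100)
	bytesSent.Inc(50)
	latencyMs.Record(12.5)
	latencyMs.Record(7.0)

	fmt.Println(queueDepth.last, bytesSent.total, len(latencyMs.observations))
}
```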
+
+This extends the `GetOrCreateTimeSeries` (pre-defined labels) functionality supported by Metrics to what has been known as Raw statistics, satisfying the change in capability requested in RFC 0001-metric-pre-defined-labels.  This allows programmers to predefine labels for all metrics, which is both an important potential optimization and a usability improvement in the code.
+
+There are no new requirements stated in this RFC.
+
+## Internal details
+
+The type known as `MeasureMetric` is a direct replacement for Raw statistics.  The `MeasureMetric.Record` method records a single observation of the metric.  The `MeasureMetric.GetOrCreateTimeSeries` supports pre-defined keys as discussed in 0001-metric-pre-defined-labels.
+
+## Trade-offs and mitigations
+
+This change, while it eliminates the need for a Raw statistics concept, potentially introduces new required concepts.  Whereas Raw statistics have no directly-declared aggregations, introducing MeasureMetric raises the question of which aggregations apply.  We will propose how a programmer can declare recommended aggregations (and good defaults) in RFC 0004-metric-configurable-aggregation.
+
+## Prior art and alternatives
+
+This Measure Metric API is conceptually close to the Prometheus [Histogram, Summary, and Untyped metric types](https://prometheus.io/docs/concepts/metric_types/).
+
+## Open questions
+
+With this proposal accepted, there would be three Metric types: Gauge, Cumulative, and Measure.  This proposal does not directly address what to do over the existing, conflicting uses of "Measure".
+
+## Future possibilities
+
+This change enables metrics to support configurable aggregation types, which allows the programmer to provide recommended aggregations at the point where Metrics are defined.  This will allow support for good out-of-the-box behavior for metrics defined by third-party libraries, for example.
+
+Without Raw statistics in the API, it becomes possible to eliminate the low-level `stats.Record` API, which may also be desirable.
diff --git a/0003-eliminate-stats-record.md b/0003-eliminate-stats-record.md
@@ -0,0 +1,36 @@
+# Eliminate stats.Record functionality
+
+Remove `stats.Record` from the specification, following the MeasureMetric type (RFC 0002-metric-measure).
+
+## Motivation
+
+`stats.Record` is no longer a necessary interface. There are conceivable reasons to support it, but they are outweighed by the cost of implementing and supporting two interfaces for recording metrics and statistics.
+
+## Explanation
+
+In RFC 0002-metric-measure, a new MeasureMetric type is introduced to replace raw statistics, with support for pre-defined label values.  With the new type introduced, it's now possible to record formerly-raw statistics through a higher-level Metric interface.
+
+## Internal details
+
+This simply involves removing the low-level `stats.Record` API from the specification, as it is no longer required.
+
+## Trade-offs and mitigations
+
+There are two reasons to maintain a low-level API that we know of:
+
+1. For _generality_.  An application that forwards metrics from another source may need to handle metrics in generic code.  For these applications, having type-specific Metric handles could actually require more code to be written, whereas the low-level `stats.Record` API is more amenable to generic use.
+1. For _atomicity_.  An application that wishes to record multiple statistics in a single operation can feel confident computing formulas based on multiple metrics, without worrying about inconsistent views of the data.
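
The two reasons above can be sketched together, assuming a simplified recorder. `Recorder`, `Measurement`, and `Snapshot` are hypothetical names for illustration and do not reproduce the actual `stats.Record` signature. Metric names are plain data (generality), and a batch is applied under one lock (atomicity).

```go
package main

import (
	"fmt"
	"sync"
)

// Measurement pairs a metric name with a value. The name is plain data, so
// generic forwarding code does not need a typed handle per metric.
type Measurement struct {
	Name  string
	Value float64
}

// Recorder is a hypothetical low-level recording interface.
type Recorder struct {
	mu     sync.Mutex
	totals map[string]float64
}

func NewRecorder() *Recorder {
	return &Recorder{totals: make(map[string]float64)}
}

// Record applies every measurement under one lock acquisition, so a
// concurrent Snapshot sees either all of the batch or none of it.
func (r *Recorder) Record(ms ...Measurement) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for _, m := range ms {
		r.totals[m.Name] += m.Value
	}
}

// Snapshot returns a consistent copy of the running totals.
func (r *Recorder) Snapshot() map[string]float64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := make(map[string]float64, len(r.totals))
	for k, v := range r.totals {
		out[k] = v
	}
	return out
}

func main() {
	r := NewRecorder()
	// One failed request: both statistics change in a single operation.
	r.Record(
		Measurement{Name: "requests", Value: 1},
		Measurement{Name: "errors", Value: 1},
	)
	r.Record(Measurement{Name: "requests", Value: 1})

	snap := r.Snapshot()
	fmt.Printf("error ratio: %.2f\n", snap["errors"]/snap["requests"])
}
```

With separate typed handles, a reader could observe the `errors` update without the matching `requests` update; whether that matters in practice is the open question this RFC raises.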
+
+## Prior art and alternatives
+
+Raw statistics were a solution to confusion found in existing metrics APIs over Metric types vs. Aggregation types.  This proposal accompanies RFC 0001-metric-pre-defined-labels and RFC 0002-metric-measure in proposing that we think about Metric _type_ as independent of which aggregations apply.  Once we have a Metric to support histogram and summary aggregations, we no longer need raw statistics, and we no longer need `stats.Record`.  This avoids introducing a new concept (Raw statistics), but it also departs from prior art by letting one Metric type support both Histogram and Summary aggregations.
+
+## Open questions
+
+Is either of the trade-offs described above important enough to keep the low-level `stats.Record` API?
+
+## Future possibilities
+
+This restricts future possibilities for the benefit of a smaller, simpler specification.
+
+This leaves open the possibility of adding `stats.Record` functionality later, when the need is more clearly recognized.
diff --git a/0004-metric-configurable-aggregation.md b/0004-metric-configurable-aggregation.md
@@ -0,0 +1,73 @@
+# Let Metrics support configurable, recommended aggregations
+
+Let the user configure recommended Metric aggregations (SUM, COUNT, MIN, MAX, LAST_VALUE, HISTOGRAM, SUMMARY).
+
+## Motivation
+
+In the current API proposal, Metric types like Gauge and Cumulative are mapped into specific aggregations: Gauge:LAST_VALUE and Cumulative:SUM.  Depending on RFC 0002-metric-measure, which creates a new MeasureMetric type, this proposal introduces the ability to configure alternative, potentially multiple aggregations for Metrics.  This allows the MeasureMetric type to support HISTOGRAM and SUMMARY aggregations, as an alternative to raw statistics.
+
+## Explanation
+
+This proposal completes the elimination of Raw statistics by recognizing that aggregations should be independent of metric type.  This recognizes that _sometimes_ we have a cumulative but want to compute a histogram of increment values, and _sometimes_ we have a measure that has multiple interesting aggregations.
+
+Following this change, we should think of the _Metric type_ as:
+
+1. Indicating something about what kind of numbers are being recorded (i.e., the input domain, e.g., restricted to values >= 0?)
+   1. For Gauges: Something pre-computed where rate or count is not relevant
+   1. For Cumulatives: Something where rate or count is relevant
+   1. For Measures: Something where individual values are relevant
+1. Indicating something about the default interpretation, based on the action verb (Set, Inc, Record, etc.)
+   1. For Gauges: the action is Set()
+   1. For Cumulatives: the action is Inc()
+   1. For Measures: the action is Record()
+1. Unless the programmer declares otherwise, suggesting a default aggregation
+   1. For Gauges: LAST_VALUE is interesting, SUM is likely not interesting
+   1. For Cumulatives: SUM is interesting, LAST_VALUE is likely not interesting
+   1. For Measures: all aggregations apply, default is MIN, MAX, SUM, COUNT.
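
The default-aggregation rule in the list above can be sketched as a lookup with an override, assuming string-valued enums. The `MetricType` and `Aggregation` types and the `EffectiveAggregations` helper are hypothetical; only the aggregation names and the per-type defaults come from the text.

```go
package main

import "fmt"

// Aggregation names mirror the ones listed in this RFC.
type Aggregation string

const (
	SUM        Aggregation = "SUM"
	COUNT      Aggregation = "COUNT"
	MIN        Aggregation = "MIN"
	MAX        Aggregation = "MAX"
	LAST_VALUE Aggregation = "LAST_VALUE"
	HISTOGRAM  Aggregation = "HISTOGRAM"
	SUMMARY    Aggregation = "SUMMARY"
)

// MetricType identifies the three metric types by name.
type MetricType string

const (
	Gauge      MetricType = "Gauge"
	Cumulative MetricType = "Cumulative"
	Measure    MetricType = "Measure"
)

// defaultAggregations encodes the defaults suggested for each metric type.
var defaultAggregations = map[MetricType][]Aggregation{
	Gauge:      {LAST_VALUE},
	Cumulative: {SUM},
	Measure:    {MIN, MAX, SUM, COUNT},
}

// EffectiveAggregations returns the programmer's declared aggregations if
// any were given, and the type's defaults otherwise.
func EffectiveAggregations(t MetricType, declared ...Aggregation) []Aggregation {
	if len(declared) > 0 {
		return declared
	}
	return defaultAggregations[t]
}

func main() {
	fmt.Println(EffectiveAggregations(Measure))
	// A cumulative for which a histogram of increments is wanted.
	fmt.Println(EffectiveAggregations(Cumulative, HISTOGRAM))
}
```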
+
+## Internal details
+
+Metric constructors should take an optional list of aggregations, to override the default behavior.  When constructed with an explicit list of aggregations, the implementation may use this as a hint about which aggregations should be exported by default.  However, the implementation is not bound by these recommendations in any way and is free to control which aggregations are applied.
+
+The standard defined aggregations are broken into two groups, those which are "decomposable" (i.e., inexpensive) and those which are not.
+
+The decomposable aggregations are simple to define:
+
+1. SUM: The sum of observed values.
+1. COUNT: The number of observations.
+1. MIN: The smallest value.
+1. MAX: The largest value.
+1. LAST_VALUE: The latest value.
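
One way to read "decomposable" is that partial results computed over separate batches can be merged into the result for the whole. The sketch below illustrates that reading under stated assumptions: the `Aggregate` type, `Observe`, and `Merge` are hypothetical, and the sequence number used to order LAST_VALUE is an assumption that the text does not specify.

```go
package main

import "fmt"

// Aggregate holds the five decomposable aggregations for one time series.
type Aggregate struct {
	Sum     float64
	Count   int64
	Min     float64
	Max     float64
	Last    float64
	LastSeq uint64 // assumed ordering key for LAST_VALUE
}

// Observe folds one value into the aggregate in constant time and space.
func (a *Aggregate) Observe(seq uint64, v float64) {
	if a.Count == 0 || v < a.Min {
		a.Min = v
	}
	if a.Count == 0 || v > a.Max {
		a.Max = v
	}
	if a.Count == 0 || seq >= a.LastSeq {
		a.Last, a.LastSeq = v, seq
	}
	a.Sum += v
	a.Count++
}

// Merge combines two partial aggregates without revisiting raw values.
func Merge(a, b Aggregate) Aggregate {
	if a.Count == 0 {
		return b
	}
	if b.Count == 0 {
		return a
	}
	out := a
	out.Sum += b.Sum
	out.Count += b.Count
	if b.Min < out.Min {
		out.Min = b.Min
	}
	if b.Max > out.Max {
		out.Max = b.Max
	}
	if b.LastSeq >= out.LastSeq {
		out.Last, out.LastSeq = b.Last, b.LastSeq
	}
	return out
}

func main() {
	var left, right Aggregate
	left.Observe(1, 4)
	left.Observe(2, 9)
	right.Observe(3, 1)
	right.Observe(4, 6)
	fmt.Printf("%+v\n", Merge(left, right))
}
```

A histogram with implementation-chosen buckets or a quantile summary does not generally merge this cheaply, which is consistent with treating those aggregations as advisory.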
+
+The non-decomposable aggregations do not have standard definitions; they are purely advisory.  The intentions behind them are:
+
+1. HISTOGRAM: The intended output is a distribution summary, specifically summarizing counts into non-overlapping ranges.
+1. SUMMARY: This is a more generic way to request information about a distribution, perhaps represented in some vendor-specific way that is not a histogram.
+
+## Example
+
+To declare a MeasureMetric,
+
+```
+// Declare a Measure-type metric with recommended aggregations and label keys.
+myMetric := metric.NewMeasureMetric(
+	"ex.com/mymetric",
+	metric.WithAggregations(metric.SUM, metric.COUNT),
+	metric.WithLabelKeys(aKey, bKey),
+)
+```
+
+Here, we have declared a Measure-type metric with recommended SUM and COUNT aggregations (allowing the average to be computed) with `aKey` and `bKey` as recommended aggregation dimensions.  While the SDK has full control over which aggregations are actually performed, the programmer has specified a good default behavior for the implementation to use.
+
+## Trade-offs and mitigations
+
+This avoids requiring programmers to use the `view` API, which is an SDK API, not a user-facing instrumentation API. Letting the application programmer recommend aggregations directly gives the implementation more information about the raw statistics. Letting programmers declare their intent has few downsides, since there is a well-defined default behavior.
+
+## Prior art and alternatives
+
+Existing systems generally declare separate Metric types according to the desired aggregation.  Raw statistics were invented to overcome this, and the present proposal brings back the ability to specify an Aggregation at the point where a Metric is defined.
+
+## Open questions
+
+There are questions about the value of the MIN and MAX aggregations.  While they are simple to compute, they are difficult to use in practice.
+
+There are questions about the interpretation of HISTOGRAM and SUMMARY. The point of Raw statistics was that we shouldn't specify these aggregations because they are expensive and many implementations are possible.  This is still true. What is the value in specifying HISTOGRAM as opposed to SUMMARY?  How is SUMMARY different from MIN/MAX/COUNT/SUM?  Does it imply implementation-defined quantiles?