Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error flagging with status codes #136

Merged
merged 33 commits into from
Sep 17, 2020
Merged
Changes from 1 commit
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
bd0177d
Add error flagging proposal
tedsuo Sep 9, 2020
54121cd
removed half finished sentence
tedsuo Sep 9, 2020
725cbc0
Remove error.message, it is redundant with the status message.
tedsuo Sep 9, 2020
ae86379
whitespace
tedsuo Sep 9, 2020
74dd7f5
whitespace
tedsuo Sep 9, 2020
0d34ea1
clarify convenience methods are in a seprate package
tedsuo Sep 9, 2020
d951674
whitespace
tedsuo Sep 9, 2020
c814974
whitespace
tedsuo Sep 9, 2020
e787f94
use correct RFC language
tedsuo Sep 9, 2020
3a25b4c
spelling
tedsuo Sep 9, 2020
4a42c3a
spellingUpdate text/trace/0000-error_flagging.md
tedsuo Sep 9, 2020
d204bc7
capitalization
tedsuo Sep 9, 2020
f551758
Capitalization
tedsuo Sep 9, 2020
44b1fa4
error mapping -> status mapping
tedsuo Sep 9, 2020
bb1bfb6
whitespace
tedsuo Sep 9, 2020
ccc0f5e
replace override status codes with user_override boolean
tedsuo Sep 9, 2020
241e2aa
rewrite based on error WG feedback
tedsuo Sep 10, 2020
db506d6
indicate that status source is a new field.
tedsuo Sep 10, 2020
7e5ef2b
remove some garbage
tedsuo Sep 10, 2020
6fb4bcb
Consolidate OPERATOR and APPLICATION into USER
tedsuo Sep 10, 2020
639e661
spelling
tedsuo Sep 10, 2020
0c8899b
nominal -> normal
tedsuo Sep 10, 2020
ec77065
markdownlint
tedsuo Sep 11, 2020
788844d
Add PR number to file name
tedsuo Sep 11, 2020
ed50989
more lint
tedsuo Sep 11, 2020
a6de9cc
clarify end users
tedsuo Sep 15, 2020
19a3d23
clarify end user
tedsuo Sep 15, 2020
cc8f305
clarify terms, update old intro
tedsuo Sep 15, 2020
2532107
fix intro
tedsuo Sep 15, 2020
7de13e9
clarify the meaning of normal
tedsuo Sep 15, 2020
1537cc3
clarify status mapping
tedsuo Sep 15, 2020
2b9e943
Update text/trace/0136-error_flagging.md
tedsuo Sep 16, 2020
76f0597
final nits
tedsuo Sep 16, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
error mapping -> status mapping
  • Loading branch information
tedsuo committed Sep 9, 2020
commit 44b1fa4bf1712d761a79e6fd46f87f9f8c1246e8
14 changes: 7 additions & 7 deletions text/trace/0000-error_flagging.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,11 +27,11 @@ The following status codes are added to our current schema.
Analysis tools MAY disregard status codes, in favor of their own approach to error analysis. However, it is strongly suggested that analysis tools SHOULD handle `OK_OVERRIDE` and `ERROR_OVERRIDE`, as these are explicitly set by the end-user and contain valuable information.


### Error Mapping Schema
### Status Mapping Schema
As part of the specification, OpenTelemetry provides a canonical mapping of semantic conventions to status codes. This removes any ambiguity as to what OpenTelemetry ships with out of the box.
tedsuo marked this conversation as resolved.
Show resolved Hide resolved

### Error Processor
The collector will provide a processor and a configuration language to make adjustments to this error mapping schema. This provides the flexibility and customization needed for real world scenarios.
### Status Processor
The collector will provide a processor and a configuration language to make adjustments to this status mapping schema. This provides the flexibility and customization needed for real world scenarios.

### Convenience methods
tedsuo marked this conversation as resolved.
Show resolved Hide resolved
As a convenience, OpenTelemetry provides helper functions for adding semantic conventions and exceptions to a span. These helper functions will also set the correct status code. This simplifies the life of the instrumentation author, and helps ensure compliance and data quality.
Expand All @@ -43,7 +43,7 @@ Note that these convenience methods simply wire together multiple API calls. The
Except for the renaming of two status codes, this proposal is backwards compatible with existing code, protocols, and the OpenTracing bridge.
tedsuo marked this conversation as resolved.
Show resolved Hide resolved

**OK is renamed to DEFAULT**
Using the term “OK” as the default implies more meaning than we intend. The span is not necessarily OK - it simply has not triggered our standard error mapping. The default status code should be renamed to `DEFAULT` to avoid confusion.
Using the term “OK” as the default implies more meaning than we intend. The span is not necessarily OK - it simply has not triggered our standard status mapping. The default status code should be renamed to `DEFAULT` to avoid confusion.

Note: I intentionally avoided terms like "unset" as it may imply to users that they are required to set the status code, which is not the intention.
tedsuo marked this conversation as resolved.
Show resolved Hide resolved

Expand Down Expand Up @@ -75,12 +75,12 @@ If we add error processing to the Collector, it is unclear what the overhead wou
It is also unclear what the cost is for backends to scan for errors on every span, without a hint from instrumentation that an error might be present.

## Prior art and alternatives
In OpenTracing, the lack of a Collector and error mapping schema proved to be unwieldy. It placed a burden on instrumentation plugin authors to set the flag correctly, and led to an explosion of non-standardized configuration options in every plugin just to adjust the default error flagging. This in turn placed a configuration burden on application developers.
In OpenTracing, the lack of a Collector and status mapping schema proved to be unwieldy. It placed a burden on instrumentation plugin authors to set the error flag correctly, and led to an explosion of non-standardized configuration options in every plugin just to adjust the default error flagging. This in turn placed a configuration burden on application developers.

An alternative is the `error.hint` proposal, paired with the removal of status code. This would work, but essentially provides the same mechanism provided in this proposal, only with a large number of breaking changes. It also does not address the need for user overrides.

## Future Work

The inclusion of status codes and error mappings help the OpenTelemetry community speak the same language in terms of error reporting. It lifts the burden on future analysis tools, and (when respected) it allows users to employ multiple analysis tools without having to synchronize an important form of configuration across multiple tools.
The inclusion of status codes and status mappings help the OpenTelemetry community speak the same language in terms of error reporting. It lifts the burden on future analysis tools, and (when respected) it allows users to employ multiple analysis tools without having to synchronize an important form of configuration across multiple tools.

In the future, OpenTelemetry may add a control plane which allows dynamic configuration of the error mapping schema.
In the future, OpenTelemetry may add a control plane which allows dynamic configuration of the status mapping schema.