Naming issues for span structured data. #1423

najamelan · 2021-06-06T20:21:11Z

Bug Report

Version

test_tracing v0.1.0 (/data/doc/code/test_tracing)
├── tracing v0.1.26
│ ├── tracing-attributes v0.1.15 (proc-macro)
│ └── tracing-core v0.1.18
└── tracing-subscriber v0.2.18
├── tracing v0.1.26 ()
├── tracing-core v0.1.18 ()
├── tracing-log v0.1.2
│ └── tracing-core v0.1.18 ()
└── tracing-serde v0.1.2
└── tracing-core v0.1.18 ()

Platform

Linux desktop.home 5.12.8-arch1-1 #1 SMP PREEMPT Fri, 28 May 2021 15:10:20 +0000 x86_64 GNU/Linux

Description

Some of the first field names for structured data where type and name. I ran into some quirky issues with both, so I wanted to report those here and raise the question as to whether this is desired behavior.

Since `type` is a reserved keyword, my first attempt was `r#type`.

use tracing::* ;

fn main()
{
    let _sub = tracing_subscriber::fmt::Subscriber::builder()

        .json()
        .init()
   ;


   let _span = error_span!( "my_span", r#type = "i32", r#name = "Bobby" ).entered();

   info!( "shak yaving?" );
}

Produces:

{"timestamp":"Jun 06 21:39:20.690","level":"INFO","fields":{"message":"shak yaving?"},"target":"test_tracing","span":{"name":"Bobby","r#type":"i32","name":"my_span"},"spans":[{"name":"Bobby","r#type":"i32","name":"my_span"}]}

without json:

Jun 06 21:53:26.825  INFO my_span{type="i32" name="Bobby"}: test_tracing: shak yaving?

NOTES:

name field exists twice on the span. Turns out it's legal json, but what to make of it. Are there instructions for consumers of such data on what this means? It seems like the default formatter has some magic knowledge, as it figures out that my_span is the span name and Bobby is an actual name field.
r# get's removed from r#name, but not from r#type???

Using quotes

   let _span = error_span!( "my_span", "type" = "i32", "name" = "Bobby" ).entered();

Json:

{"timestamp":"Jun 06 22:03:06.993","level":"INFO","fields":{"message":"shak yaving?"},"target":"test_tracing","span":{"name":"Bobby","type":"i32","name":"my_span"},"spans":[{"name":"Bobby","type":"i32","name":"my_span"}]}

default fmt:

Jun 06 22:02:52.920  INFO my_span{type="i32" name="Bobby"}: test_tracing: shak yaving?

NOTES:

overall quotes seem better than r#, and it's what's suggested in the docs, fair play. It still has duplicate fields.

I would like to suggest:

for metadata, use reserved field names like __span_name instead of very common ones like name.
be consistent in removing r# from all keys the same.

The text was updated successfully, but these errors were encountered:

najamelan · 2021-06-07T09:21:23Z

Just looking into this a bit more, serde_json will only consider the last of duplicate keys. This means that in order to correctly extract information here we have to either find an alternative json deserialize which supports duplicate keys, or roll a manual one. ouch!

Would you take a pull request for __span_name? Would this be considered a breaking change? Are there other attributes but span name that get included in the json data with a very generic name?

hawkw · 2021-06-08T00:05:06Z

Would you take a pull request for __span_name? Would this be considered a breaking change?

Changing the JSON format is considered a breaking change. We could make a change like this now by making it an opt-in configuration option.

My two cents would be to use a naming scheme where names are prefixed with tracing_, like tracing_span, rather than __xxx --- tracing_xxx makes it clear where the fields are coming from, But, maybe what we should do is add a builder parameter to add an arbitrary user-defined prefix to metadata field names in the JSON output? That way, we can avoid bikeshedding over what naming scheme to use :)

najamelan · 2021-06-08T05:34:43Z

But, maybe what we should do is add a builder parameter to add an arbitrary user-defined prefix to metadata field names in the JSON output? That way, we can avoid bikeshedding over what naming scheme to use :)

Personally I would say less is more... People using structured data are using tracing, so can be told certain keys are reserved. I think right now what's missing is clear recommendation. For example I'm using tracing in a lib. What can I count on? Will subscribers print span data? I can't count on it. They have the freedom not to, and several do, which means right now I have found no way of having sensical logs on Wasm. And as a lib author I don't control what subscriber get's used even if there was one that worked on Wasm.

Freedom is great, but I think for something like this convention/recommendation would be greater. And I think the tracing API is already very complicated.

Changing the JSON format is considered a breaking change. We could make a change like this now by making it an opt-in configuration option.

If a conclusion would be reached, I'd be up for filing a PR. Would it be a config option on the subscriber, or a crate feature? Is there a milestone/date for breaking changes on tracing-subscriber?

jyn514 · 2021-06-15T23:51:57Z

I would find it useful to be able to make arbitrary changes to the field names instead of having them hard-coded. In particular, an internal tool expects the level to be called log_level instead of level; adding a tracing prefix doesn't help and adding a custom log_ prefix isn't what I want because message should stay the same.

jyn514 · 2021-06-16T00:00:15Z

That API could look like this:

impl SubscriberBuilder<JsonFields, Format<Json>> {
    fn with_renamed_field(original_name: &'static str, new_name: &'static str) -> Self;
}

najamelan · 2021-06-16T06:23:45Z

@jyn514 Thanks for sharing. I do think it is a bit orthogonal to this issue. They will still need to have a default name, and that default name should still be unique enough so it doesn't overlap with common choices by users.

Also the need to pre-process the json before it is serialized could be generalized better with an API that gives you &mut serde_json::Value or something like that. So you don't just get to change the names.

jyn514 · 2021-06-16T12:34:27Z

Also the need to pre-process the json before it is serialized could be generalized better with an API that gives you &mut serde_json::Value or something like that. So you don't just get to change the names.

This is a great idea! That makes it much more flexible, something like this :)

impl SubscriberBuilder<JsonFields, Format<Json>> {
    fn inspect(record: &mut serde_json::Value) -> Self;
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Naming issues for span structured data. #1423

Naming issues for span structured data. #1423

najamelan commented Jun 6, 2021

najamelan commented Jun 7, 2021

hawkw commented Jun 8, 2021

najamelan commented Jun 8, 2021 •

edited

Loading

jyn514 commented Jun 15, 2021

jyn514 commented Jun 16, 2021

najamelan commented Jun 16, 2021 •

edited

Loading

jyn514 commented Jun 16, 2021

Naming issues for span structured data. #1423

Naming issues for span structured data. #1423

Comments

najamelan commented Jun 6, 2021

Bug Report

Version

Platform

Description

Since type is a reserved keyword, my first attempt was r#type.

Using quotes

najamelan commented Jun 7, 2021

hawkw commented Jun 8, 2021

najamelan commented Jun 8, 2021 • edited Loading

jyn514 commented Jun 15, 2021

jyn514 commented Jun 16, 2021

najamelan commented Jun 16, 2021 • edited Loading

jyn514 commented Jun 16, 2021

Since `type` is a reserved keyword, my first attempt was `r#type`.

najamelan commented Jun 8, 2021 •

edited

Loading

najamelan commented Jun 16, 2021 •

edited

Loading