Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Naming issues for span structured data. #1423

Open
najamelan opened this issue Jun 6, 2021 · 7 comments
Open

Naming issues for span structured data. #1423

najamelan opened this issue Jun 6, 2021 · 7 comments

Comments

@najamelan
Copy link
Contributor

Bug Report

Version

test_tracing v0.1.0 (/data/doc/code/test_tracing)
├── tracing v0.1.26
│ ├── tracing-attributes v0.1.15 (proc-macro)
│ └── tracing-core v0.1.18
└── tracing-subscriber v0.2.18
├── tracing v0.1.26 ()
├── tracing-core v0.1.18 (
)
├── tracing-log v0.1.2
│ └── tracing-core v0.1.18 ()
└── tracing-serde v0.1.2
└── tracing-core v0.1.18 (
)

Platform

Linux desktop.home 5.12.8-arch1-1 #1 SMP PREEMPT Fri, 28 May 2021 15:10:20 +0000 x86_64 GNU/Linux

Description

Some of the first field names for structured data where type and name. I ran into some quirky issues with both, so I wanted to report those here and raise the question as to whether this is desired behavior.

Since type is a reserved keyword, my first attempt was r#type.

use tracing::* ;

fn main()
{
    let _sub = tracing_subscriber::fmt::Subscriber::builder()

        .json()
        .init()
   ;


   let _span = error_span!( "my_span", r#type = "i32", r#name = "Bobby" ).entered();

   info!( "shak yaving?" );
}

Produces:

{"timestamp":"Jun 06 21:39:20.690","level":"INFO","fields":{"message":"shak yaving?"},"target":"test_tracing","span":{"name":"Bobby","r#type":"i32","name":"my_span"},"spans":[{"name":"Bobby","r#type":"i32","name":"my_span"}]}

without json:

Jun 06 21:53:26.825  INFO my_span{type="i32" name="Bobby"}: test_tracing: shak yaving?

NOTES:

  • name field exists twice on the span. Turns out it's legal json, but what to make of it. Are there instructions for consumers of such data on what this means? It seems like the default formatter has some magic knowledge, as it figures out that my_span is the span name and Bobby is an actual name field.
  • r# get's removed from r#name, but not from r#type???

Using quotes

   let _span = error_span!( "my_span", "type" = "i32", "name" = "Bobby" ).entered();

Json:

{"timestamp":"Jun 06 22:03:06.993","level":"INFO","fields":{"message":"shak yaving?"},"target":"test_tracing","span":{"name":"Bobby","type":"i32","name":"my_span"},"spans":[{"name":"Bobby","type":"i32","name":"my_span"}]}

default fmt:

Jun 06 22:02:52.920  INFO my_span{type="i32" name="Bobby"}: test_tracing: shak yaving?

NOTES:

  • overall quotes seem better than r#, and it's what's suggested in the docs, fair play. It still has duplicate fields.

I would like to suggest:

  • for metadata, use reserved field names like __span_name instead of very common ones like name.
  • be consistent in removing r# from all keys the same.
@najamelan
Copy link
Contributor Author

Just looking into this a bit more, serde_json will only consider the last of duplicate keys. This means that in order to correctly extract information here we have to either find an alternative json deserialize which supports duplicate keys, or roll a manual one. ouch!

Would you take a pull request for __span_name? Would this be considered a breaking change? Are there other attributes but span name that get included in the json data with a very generic name?

@hawkw
Copy link
Member

hawkw commented Jun 8, 2021

Would you take a pull request for __span_name? Would this be considered a breaking change?

Changing the JSON format is considered a breaking change. We could make a change like this now by making it an opt-in configuration option.

My two cents would be to use a naming scheme where names are prefixed with tracing_, like tracing_span, rather than __xxx --- tracing_xxx makes it clear where the fields are coming from, But, maybe what we should do is add a builder parameter to add an arbitrary user-defined prefix to metadata field names in the JSON output? That way, we can avoid bikeshedding over what naming scheme to use :)

@najamelan
Copy link
Contributor Author

najamelan commented Jun 8, 2021

But, maybe what we should do is add a builder parameter to add an arbitrary user-defined prefix to metadata field names in the JSON output? That way, we can avoid bikeshedding over what naming scheme to use :)

Personally I would say less is more... People using structured data are using tracing, so can be told certain keys are reserved. I think right now what's missing is clear recommendation. For example I'm using tracing in a lib. What can I count on? Will subscribers print span data? I can't count on it. They have the freedom not to, and several do, which means right now I have found no way of having sensical logs on Wasm. And as a lib author I don't control what subscriber get's used even if there was one that worked on Wasm.

Freedom is great, but I think for something like this convention/recommendation would be greater. And I think the tracing API is already very complicated.

Changing the JSON format is considered a breaking change. We could make a change like this now by making it an opt-in configuration option.

If a conclusion would be reached, I'd be up for filing a PR. Would it be a config option on the subscriber, or a crate feature? Is there a milestone/date for breaking changes on tracing-subscriber?

@jyn514
Copy link
Contributor

jyn514 commented Jun 15, 2021

I would find it useful to be able to make arbitrary changes to the field names instead of having them hard-coded. In particular, an internal tool expects the level to be called log_level instead of level; adding a tracing prefix doesn't help and adding a custom log_ prefix isn't what I want because message should stay the same.

@jyn514
Copy link
Contributor

jyn514 commented Jun 16, 2021

That API could look like this:

impl SubscriberBuilder<JsonFields, Format<Json>> {
    fn with_renamed_field(original_name: &'static str, new_name: &'static str) -> Self;
}

@najamelan
Copy link
Contributor Author

najamelan commented Jun 16, 2021

@jyn514 Thanks for sharing. I do think it is a bit orthogonal to this issue. They will still need to have a default name, and that default name should still be unique enough so it doesn't overlap with common choices by users.

Also the need to pre-process the json before it is serialized could be generalized better with an API that gives you &mut serde_json::Value or something like that. So you don't just get to change the names.

@jyn514
Copy link
Contributor

jyn514 commented Jun 16, 2021

Also the need to pre-process the json before it is serialized could be generalized better with an API that gives you &mut serde_json::Value or something like that. So you don't just get to change the names.

This is a great idea! That makes it much more flexible, something like this :)

impl SubscriberBuilder<JsonFields, Format<Json>> {
    fn inspect(record: &mut serde_json::Value) -> Self;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants