Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve extensions telemetry API #160090

Closed
jrieken opened this issue Sep 5, 2022 · 12 comments
Closed

Improve extensions telemetry API #160090

jrieken opened this issue Sep 5, 2022 · 12 comments
Assignees
Labels
api api-finalization feature-request Request for new features or functionality on-release-notes Issue/pull request mentioned in release notes on-testplan telemetry Telemetry system issues
Milestone

Comments

@jrieken
Copy link
Member

jrieken commented Sep 5, 2022

In todays API there are some symbols that help extensions with telemetry, e.g machineId, sessionId, and most importantly isTelemetryEnabled. These are used by our telemetry module and by others.

The existing APIs help you to get the job done but offer little quality of life, aren't always easy to use, and lack hooks that we would like for transparency and cleansing. Our goal is to make it easy for extensions to do the right thing and to be transparent to our user what's happening.

Below is an alternative API that is around a TelemetryLogger and TelemetryAppender. The appender is what does the actual data collecting but it should be used through a logger. With that "indirection" we can do many nice things:

  • we can do extra PII cleansing for extensions, e.g remove everything that looks like path-name, URI, user-name etc
  • we can enable/disable telemetry per extension
  • we echo everything that is logged to the telemetry output channel
  • we check enablements: logger.logUsage() -> setting say NO ⛔ -> we don't call appender
  • we can call logError on provider-errors and unhandled extension errors (superseeding Allow extensions to provide a global error handler for errors in extension code #45264)
declare module 'vscode' {

	export interface TelemetryLogger {
		readonly onDidChangeEnableStates: Event<TelemetryLogger>;
		readonly isUsageEnabled: boolean;
		readonly isErrorsEnabled: boolean;

                // calls TelemetryAppender#logUsage iff usage telemetry is enabled, echos data to output channel
                // called by extension
		logUsage(...args: any[]): void;

                // calls TelemetryAppender#logError iff error telemetry is enabled, echos data to output channel
                // called by extension
                // ALSO called by VS Code when the owning extension throws, e.g provider or command errors
		logError(err: Error): void;
	}

	export interface TelemetryAppender {
		logUsage(...args: any[]): void;
		logError(err: Error): void;
	}

	export namespace env {
		// BYOTA
		export function createTelemetryLogger(appender: TelemetryAppender): TelemetryLogger;
	}
}

The vscode-extension-telemetry-module would become an instance of a telemetry appender.

@jrieken jrieken added telemetry Telemetry system issues under-discussion Issue is under discussion for relevance, priority, approach feature-request Request for new features or functionality labels Sep 5, 2022
@lramos15 lramos15 added this to the September 2022 milestone Sep 8, 2022
@lramos15
Copy link
Member

lramos15 commented Sep 8, 2022

Assigning to September milestone but definitely will not be complete for next release

This was referenced Sep 8, 2022
@isidorn
Copy link
Contributor

isidorn commented Sep 23, 2022

Once we get this API in a good proposed state let me know and I can reach out to all extension authors that report telemetry so they slowly start transitioning to this API.

@lramos15 lramos15 modified the milestones: September 2022, October 2022 Sep 26, 2022
@lramos15 lramos15 added the api label Oct 4, 2022
@lramos15
Copy link
Member

This API continues to look good and adapt. Hoping to have some in product testing in for debt week. Just missed getting that in this week.

@lramos15 lramos15 modified the milestones: October 2022, November 2022 Oct 24, 2022
@gjsjohnmurray
Copy link
Contributor

Is per-extension settings going to be part of the initial release?

@lramos15
Copy link
Member

Is per-extension settings going to be part of the initial release?

It is not. Just shared output channel, cleaning done by VS code core, and logging of ext host errors caused by your extension to your registered telemetry logger

@bwateratmsft
Copy link
Contributor

I really like this idea, especially unifying all of the PII masking into one better-maintained place. One thing I wanted to bring up, though--there will always be examples of PII that the default filters can't catch. These could be domain-specific things like resource names, or more generic things like an obscure string ID format that isn't well known--perhaps an ID that's unique to a particular country or region.

As such, would it be possible to include a means of adding custom PII filtering logic? e.g. a method or list of methods to pass things through that input a string and output an amended string (or the same string if no changes are needed)?

@lramos15
Copy link
Member

As such, would it be possible to include a means of adding custom PII filtering logic? e.g. a method or list of methods to pass things through that input a string and output an amended string (or the same string if no changes are needed)?

I don't think there are any plans now for anything like this. What would you expect this to look like? Is it a regex that can be run on all the values and you pass a regex and a replacement value sort of similar to the ReplacementOptions of the current module? I think right now our story for these advanced cases is that your TelemetryAppender (names subject to change) which you contribute and received a TelemetryLogger back is called last and therefore you can technically intercept and do anything with the data payload after it is returned to you in order to be sent. The downside being we log what is being sent before it goes off to your TelemetryAppender meaning the specialty PII that you're not sending is being logged to the output channel.

@bwateratmsft
Copy link
Contributor

A regex + replacement is one option but it might be more flexible to accept a function that does string in -> string out. I think the issue of it being logged before going to the TelemetryAppender is significant; what the user sees logged, they are going to think is uploaded--and that belief would be reinforced if some of the PII is masked (the stuff the core library knows about), but not the additional things supplied by the extension.

@lramos15
Copy link
Member

lramos15 commented Nov 22, 2022

@bwateratmsft We discussed this today at the API sync and still are having a bit of hard time understanding how this can help.

This would be the current flow. I don't see how moving the your cleaning piece into our portion of the flow helps at all. We don't clean common properties because they're static and therefore should always be void of PII. So therefore we would just be calling your function right after you call log anyways. This means it's better for you to just call your cleaning and then call log.
Your cleaning -> you call log -> our cleaning -> we mix-in common properties -> output channel -> logged to endpoint

@bwateratmsft
Copy link
Contributor

My main motivation is to have all the cleaning happen in one place, even if the logic for what is cleaned is somewhat dispersed. Your approach definitely works, though.

@jrieken
Copy link
Member Author

jrieken commented Jan 23, 2023

Done!

@jrieken jrieken closed this as completed Jan 23, 2023
@gjsjohnmurray
Copy link
Contributor

@lramos15 @jrieken Please see #171996 (comment)

Do extensions need to be fixed?

@lramos15 lramos15 removed the under-discussion Issue is under discussion for relevance, priority, approach label Jan 23, 2023
@lramos15 lramos15 mentioned this issue Jan 23, 2023
2 tasks
@lramos15 lramos15 added the on-release-notes Issue/pull request mentioned in release notes label Jan 28, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Mar 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api api-finalization feature-request Request for new features or functionality on-release-notes Issue/pull request mentioned in release notes on-testplan telemetry Telemetry system issues
Projects
None yet
Development

No branches or pull requests

6 participants