Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Match pattern matches #580

Open
Jack-Works opened this issue Mar 31, 2024 · 14 comments
Open

Proposal: Match pattern matches #580

Jack-Works opened this issue Mar 31, 2024 · 14 comments
Labels
needs-triage: firefox Firefox needs to assess this issue for the first time supportive: safari Supportive from Safari

Comments

@Jack-Works
Copy link

Sometimes I need to know if an origin matches a match pattern but there is no API so I need to write my own.

API Example:

browser.runtime.patternMatches("<all_urls>", "https://example.com/") // true
browser.runtime.patternMatches("https://*/foo*", "https://example.com/") // false
@github-actions github-actions bot added needs-triage: chrome Chrome needs to assess this issue for the first time needs-triage: firefox Firefox needs to assess this issue for the first time needs-triage: safari Safari needs to assess this issue for the first time labels Mar 31, 2024
@xeenon xeenon added supportive: safari Supportive from Safari and removed needs-triage: safari Safari needs to assess this issue for the first time labels Mar 31, 2024
@fregante
Copy link

Until this is accepted/merged, I've been maintaining a package that does exactly that: https://github.com/fregante/webext-patterns

The challenge has been that matching is browser- and context-dependent. Does <all_urls> match about:blank? What if about:blank is the source of a frame? How about data: and about:srcdoc? See also:

@tophf
Copy link

tophf commented Apr 1, 2024

  • isPatternMatchingUrl would be more self-evident
  • If it's synchronous the code to check the pattern should be duplicated from the browser's main process that currently handles the API into the process of the extension - is this code small enough to justify adding it to every extension's process memory footprint? For an asynchronous API it might make sense to accept an array to check a long list without incurring overhead.
  • Should it report an invalid pattern? With an asynchronous interface it's obviously a rejected Promise. With a synchronous interface it would be a) an exception but it's rather inconvenient to handle, b) null but it's non-explicit/vague, or c) the output should be an object like {result, error}.
  • A parameter to enable globs mode to test include_globs?

@xeenon
Copy link
Collaborator

xeenon commented Apr 2, 2024

I prefer @Jack-Works' proposed name of patternMatches or matchesPattern.

I think it would also be useful to expose some matching options, since sometimes things like the path or scheme should be ignored. WebKit has these options we could expose in an options dictionary to this API. We should also allow matching URL objects, and not just strings.

Another possible API around patterns would be creating a pattern from a URL, with options, so it is easy to make a pattern from a URL for things like registered content scripts or user scripts. WebKit has similar creation options that we could expose to this API, like matching exact schemes, or any path.

Going even further, you could have an API to check if a pattern is valid or supported. Safari does not support ftp or file patterns currently, and having an API to check this would be useful. (Even though ftp and file patterns are not supported, they are still valid (parsable).)

@dotproto
Copy link
Member

dotproto commented Apr 12, 2024

In yesterday's meeting browser engineers asked for use cases in order to better understand how and why this capability is necessary. @Jack-Works and @fregante, can you share more about your use cases?

@xeenon, in abstract I agree that it may make sense to expose these options, since different parts of the platform use match patterns in different ways, such as web accessable resources vs. content scripts. For ease of reference, the ‎WebExtensionMatchPattern::Options‎ enum @xeenon linked to currently has 3 fields: IgnoreSchemes, IgnorePaths, MatchBidirectionally ("matches two patterns in either direction (A matches B, or B matches A)"). I also agree that working against URLs would be ideal.

As I write this, a use case came to mind: checking whether or not the extension can inject a content script on a given domain based on current permission grants. Ostensibly permissions.contains() should answer this, but some browsers allow users to manually set which sites the extension can work on and the resulting host permission grant may not match what the developer declared in the manifest. For this case, it would be best if a developer could provide an array of match patterns to check against.

let {origins: matchPatterns} = await browser.permissions.getAll();

if (browser.runtime.patternMatches(matchPatterns, "https://example.com/page")) {
  browser.scripting.executeScript(/*...*/);
} else {
  showPermissionMissing();
}

While devs can work around not having array support, that approach it is notably less ergonomic.

let {origins} = await browser.permissions.getAll();

let hasPermission = false;
for (let matchPattern of origins) {
  if (browser.runtime.patternMatches(matchPattern, "https://example.com/page")) {
    hasPermission = true;
    break;
  }
}

if (hasPermission) {
  browser.scripting.executeScript(/*...*/);
} else {
  showPermissionMissing();
}

@dotproto
Copy link
Member

For the sake of making the shape of the API I'm currently imagining a bit more concrete, I took a pass at writing up an interface definition in WebIDL.

partial interface WebExtensionsRuntime {
  dictionary MatchesPatternSettings {
    boolean ignoreSchemes = false;
    boolean ignorePaths = false;
  };

  boolean matchesPattern((DOMString or sequence<DOMString>) pattern,
                         (DOMString or URL) url,
                         optional MatchesPatternSettings settings);
};

@Jack-Works
Copy link
Author

Yes, @dotproto your case is exactly what happened in our extension and that's why I opened this issue.

@xeenon
Copy link
Collaborator

xeenon commented Apr 13, 2024

@dotproto FWIW, Safari only reports active granted permissions in permissions.contains(), etc.

@fregante
Copy link

fregante commented Apr 15, 2024

Ostensibly permissions.contains() should answer this, [...] but the resulting host permission grant may not match what the developer declared in the manifest.

This sounds like a bug to me. I would expect permissions.contains() to be able to tell me if I have a specific permission, regardless of how the permission was granted/removed.

I generally use webext-patterns to answer the question: "does this glob match a tab?" The glob may come from anywhere, even provided by the final user, and can be used to filter tabs by URL. It doesn't have to be related to permissions.

A task could be "extract the page title from all the Wikipedia.org tabs currently open", so I'd probably query the list of open tabs, and filter them by glob (https://*.wikipedia.org/*)

@carlosjeurissen
Copy link
Contributor

Having such API would also mean authors are assured of the matching algorithm used by the browser. For example, Safari seems to not match on the query string. See: https://bugs.webkit.org/show_bug.cgi?id=273069

@tophf
Copy link

tophf commented Apr 22, 2024

A task could be "extract the page title from all the Wikipedia.org tabs currently open", so I'd probably query the list of open tabs, and filter them by glob (https://*.wikipedia.org/*)

FWIW, that's an anti-example because a more efficient solution is to use chrome.tabs.query({url: ['https://*.wikipedia.org/*']})

@carlosjeurissen
Copy link
Contributor

carlosjeurissen commented Apr 22, 2024

A task could be "extract the page title from all the Wikipedia.org tabs currently open", so I'd probably query the list of open tabs, and filter them by glob (https://*.wikipedia.org/*)

FWIW, that's an anti-example because a more efficient solution is to use chrome.tabs.query({url: ['https://*.wikipedia.org/*']})

Agreed! However one would still need to filter out tabs without an URL in some older Safari versions (Before Tech Preview 192).

The pattern could still be useful in situations where multiple match patterns are used and you want to know which pattern matched the URL. This could be part of this API design.

@xeenon
Copy link
Collaborator

xeenon commented Apr 22, 2024

@carlosjeurissen That bug is no longer an issue in WebKit and Safari Technology Preview 192.

@oliverdunk oliverdunk removed the needs-triage: chrome Chrome needs to assess this issue for the first time label Apr 23, 2024
@carlosjeurissen
Copy link
Contributor

@xeenon Wonderful! Updated the comment to reflect this.

@Sxderp
Copy link

Sxderp commented May 3, 2024

This has been requested before and denied. I'm glad that it is seeing new traction. It would be incredibly useful.

https://bugzilla.mozilla.org/show_bug.cgi?id=1395278
https://docs.google.com/document/d/1731b2kkN1wndNzVvo--5gwUcagbOSKGNYv4769r68NM/edit#heading=h.mkvz1txebobb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-triage: firefox Firefox needs to assess this issue for the first time supportive: safari Supportive from Safari
Projects
None yet
Development

No branches or pull requests

8 participants