Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linter: check consistency of dependent features #6571

Open
foolip opened this issue Aug 27, 2020 · 8 comments
Open

Linter: check consistency of dependent features #6571

foolip opened this issue Aug 27, 2020 · 8 comments
Labels
idle 🐌 Issues and pull requests with no recent activity linter 🏡 Issues or pull requests regarding the tests / linter of the JSON files.

Comments

@foolip
Copy link
Collaborator

foolip commented Aug 27, 2020

We currently have a 'subfeature_earlier_implementation' check, which would if for example the data claims that api.Document.title (document.title) was introduced before api.Document (document) itself.

This is an instance of one feature depending on another, but there are many more like this:

  • APIs that were introduced together, where shipping just one of them would make no sense, for example fetch() without a Response interface (however incomplete) would make no sense.
  • An interface that inherits from another, for example VTTCue inherits from TextTrackCue. api.VTTCue without api.TextTrackCue support is almost certainly an error.
  • An attribute of a certain type requires that type to exist. For example element.classList requires DOMTokenList.

More generalized, we have:

  • Implication: an entry/feature A that depends on another B, i.e., A ⇒ B and !B ⇒ !A (extremely common)
  • Equivalence: entries that must have matching support data, i.e., A ⇔ B (fairly rare)

A lot of this could be derived from Web IDL definitions. The lint for this could have a set of version dependencies, something like this:

[
  ["api.VTTCue", ">=", "api.TextTrackCue"],
  ["api.WindowOrWorkerGlobalScope.fetch", "==", "api.Response"]
]

This was inspired by #6564.

@foolip foolip added the linter 🏡 Issues or pull requests regarding the tests / linter of the JSON files. label Aug 27, 2020
@foolip
Copy link
Collaborator Author

foolip commented Aug 27, 2020

@ddbeck has something like this been discussed in the past?

@vinyldarkscratch FYI

@Elchi3
Copy link
Member

Elchi3 commented Aug 27, 2020

This came up in the context of HTML features that are reflected in the HTML DOM. (e.g. api.HTMLOListElement == html.elements.ol)

At the Mozilla Paris BCD hackathon, we then brainstormed this further. An interesting issue is #813 which saw two attempts to solve it: #1681 and #1799.

However, these previous PRs attempted to change the data model and proposed to introduce data pointers (or data imports). I decided that it was too much of a risky change at the time. Re-thinking this as a linter rule might be a better approach (to increase data quality, but to leave data structures alone [to not break bulk update tools, data consumers, and to avoid complicating reviews]).

@ddbeck
Copy link
Collaborator

ddbeck commented Aug 27, 2020

Florian has the right background story here.

I also like the idea of treating this as a linting problem rather than a schema problem, at least to start with. One thing I've learned is that it's much easier to add something to the schema than it is to back it out again (e.g., adding #3366, which never really took off).

I also like the declarative description you've come up for this. The dependency relationships could live in a JSON file even, so we could add or remove feature dependencies without having to modify the linter code itself. We could even experiment with additional tooling based on that configuration.

(An example: some things wouldn't necessarily map perfectly, such as notes. If a PR modified a note on feature with a dependency relationship, it'd be great to have GitHub Actions comment with a list of relevant feature dependencies, to make sure no additional changes are needed in those features. Building this into the linter feels impossible, but reading from a common config sounds much more tractable.)

All that said, I am a bit reluctant to suggest adding this kind of check directly the linter, as it's written. The linter is very much structured around walking files. This is really an interrogation of the meaning of the data itself though. I'm imagining writing this now and it sounds like a real opportunity to try making a more sophisticated set of APIs for querying the data, following relationships between features, and collecting output. I think this is a good candidate for building something that starts fresh to a certain degree. It'd be really exciting to see a high-level sketch of an implementation, as if requisite APIs (like querying arbitrary features) already existed.

@foolip
Copy link
Collaborator Author

foolip commented Aug 28, 2020

You're right, this couldn't work on files, but would be a consistency check of BCD as a whole. What kind of changes are you thinking of that would make this easier? Based on my own experiments in updating BCD from (external) scripts like mdn-confluence and mdn-bcd-collector, I have a few ideas that are sort of related.

Here's what I have in mind. We need a representation of BCD that preserves where each bit of data comes from, so that it's possible to generate the right error messages, but crucially for scripted updates, also to write exactly the right bits of the data back to disk. (The added javascript/builtins/Intl.json and javascript/builtins/WebAssembly.json in #6526 are an example of this going wrong when managed by the external script.)

So, parse all of the data files and keep their data separate. On top of that, build facilities to iterate all data in a way that's as close to what the BCD node module already exposes as possible. Also, as you say, a way to query the data, definitely by "path" like "api.Attr.localName", but possibly other ways too.

If it sounds like we're thinking of roughly similar things, then I'd love to spend some time collaborating on a design.

@ddbeck
Copy link
Collaborator

ddbeck commented Aug 28, 2020

Here's what I have in mind. We need a representation of BCD that preserves where each bit of data comes from, so that it's possible to generate the right error messages, but crucially for scripted updates, also to write exactly the right bits of the data back to disk.

Yeah, we're definitely on the same page here. But you summarized it better than I would have.

For me, this has been heavily informed by writing linting for MDN content. There's a really nice set libraries we've used, Unified, for working with HTML and Markdown. Something I cannot overrate is how nice it is to inspect the abstract document tree and log problems against the meaningful units in a document—a paragraph, a link, a section, etc.—without having to also reason about how to relate that to a line number.

If it sounds like we're thinking of roughly similar things, then I'd love to spend some time collaborating on a design.

👍 👍

@foolip
Copy link
Collaborator Author

foolip commented Aug 28, 2020

Cool, I've moved this discussion to #6585 and dumped more ideas than perhaps anyone wanted to read :) Let's continue there!

@foolip
Copy link
Collaborator Author

foolip commented Jun 10, 2021

Manually doing consistency checks comes up a lot when reviewing @vinyldarkscratch's PRs for #6369, I would say it's 80% of my time spend reviewing.

I think my proposal in #6571 (comment) of just pairing features with either a "must be equal" or "A implies B" rule is pretty good. Would there be appetite for introducing just that limited support, without worrying for now about how those pairs are generated?

@ddbeck
Copy link
Collaborator

ddbeck commented Jun 10, 2021

I'm happy to see something like this introduced. Seems to me that you could implement it as a standalone check (using the query() util to avoid traversing everything), like we did for the spec URL allowlist, defering the whole linter architecture problem in the short term.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
idle 🐌 Issues and pull requests with no recent activity linter 🏡 Issues or pull requests regarding the tests / linter of the JSON files.
Projects
None yet
Development

No branches or pull requests

4 participants
@ddbeck @Elchi3 @foolip and others