Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[build] Update makefiles and CI config to enable W3C autopublishing #1730

Merged
merged 4 commits into from
Feb 28, 2024

Conversation

sideshowbarker
Copy link
Collaborator

No description provided.

@sideshowbarker sideshowbarker force-pushed the sideshowbarker/enable-w3c-autopublishing branch from 7568764 to 311cea0 Compare February 17, 2024 11:48
@sideshowbarker sideshowbarker force-pushed the sideshowbarker/enable-w3c-autopublishing branch from 311cea0 to 9d2d273 Compare February 17, 2024 12:49
Copy link
Member

@rossberg rossberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thanks, this looks good. My only concern is that the new publish-to-w3c action redoes everything from scratch. Installing the prerequisite software and doing the Sphinx build already takes significant time on CI. To reduce turnaround time and save resources, wouldn't it be better to move the new build steps to the existing build targets, so that it reuses their work? That would also avoid failing twice in case the build has errors.

@sideshowbarker
Copy link
Collaborator Author

Short answer:

  • problem: we need to run Bikeshed twice on each spec anyway, due to the W3C TR versions needing different formatting
  • formatting: if we changed to having the github.io versions use W3C TR, we’d not need to run Bikeshed twice
  • time cost: the CI job this PR adds is run in parallel with the existing jobs, and doesn’t increase the overall CI time at all
  • high resource usage: it’s true that the CI job this PR adds, as currently written, does roughly double the resource usage
  • reducing resource usage: to reduce resource usage a bit, I’ll try some GH Actions changes, then maybe makefile tweaks
Longer answer

To reduce turnaround time and save resources, wouldn't it be better to move the new build steps to the existing build targets, so that it reuses their work?

So the problem there is: Bikeshed needs to be run twice on each of the specs anyway — at least if we want the https://webassembly.github.io/spec/ Bikeshed versions to retain the “Editor’s Draft” formatting they have now, rather than instead having the formatting of those change to being the same “Working Draft” formatting that the https://www.w3.org/TR versions have.

If the editors and the groups chairs are were OK with instead changing the https://webassembly.github.io/spec/ versions to have the same “Working Draft” formatting that the https://www.w3.org/TR/ versions have, then we’d need to run Bikeshed only once on each spec.

I’d personally be fine with the https://webassembly.github.io/spec/ versions having the “Working Draft” formatting — and as far as I’m aware, W3C policy doesn’t explicitly prohibit us from doing so. That said, I think that in general, W3C management would prefer that the “Working Draft” formatting only be used for the versions published at https://www.w3.org/TR/ URLs. But I think we could still try it if we wanted to, and see if anybody notices and asks us to stop doing it (in which case, we’d just flip it back).

But otherwise, to retain the “Editor’s Draft” formatting of the https://webassembly.github.io/spec/ versions means that we have to first run bikeshed once as it’s already being run — with Status: ED set in the .bs sources for each spec — and then to get the “Working Draft” formatting for the https://www.w3.org/TR/ versions, we need to run bikeshed again, but with the Status property instead set to WD.

My only concern is that the new publish-to-w3c action redoes everything from scratch. Installing the prerequisite software and doing the Sphinx build already takes significant time on CI.

As far as the time requirements, one thing to note is: the publish-to-w3c-TR CI job this PR adds is run in parallel with the existing CI jobs that do the builds and publishing for the existing https://webassembly.github.io/spec/ versions. And from watching all the jobs run for this PR branch, I can see that publish-to-w3c-TR job finishes are almost exactly the same time — roughly 6 minutes and 30 seconds — as the build-core-spec+publish-spec combination.

In other words, the existing CI takes about 6.5 minutes to complete, and the patch in this PR doesn’t increase that.

All that said, as far as resource usage, it’s true that the current patch in this PR as currently written does basically doubles the resource usage — and I guess there are few changes that could be made to decrease that.

As far as exactly what changes could be made: The opportunities to make changes to the makefiles to reduce the resource usage of the build itself are very few — because:

  • to get W3C autopublishing working, we’ll currently need to run bikeshed twice for each spec, regardless
  • the bikeshed target in the core/Makefile is what consumes most of the time and resources, and we must run it twice

However, I guess we could factor the Sphinx part of that bikeshed target into its own target — in which case, we could make the build run that part just once, only before the first time we run the Bikeshed part — and not re-run it a second time.

That’d probably cut 30 seconds out of the 6 minutes and 30 seconds the publish-to-w3c-TR job takes. So I’ll try it.

But the bigger savings would probably come from making changes to the workflows/ci-spec.yml file, to factor out the installing-the-prerequisite-software steps — Checkout repo, Setup Node.js, Setup Bikeshed, and Setup Sphinx —steps into their own prep job. And then we make the build-core-spec, build-js-api-spec, build-web-api-spec, and publish-to-w3c-TR jobs all require/depend on that prep job, so that it runs first, just once.

So I’ll try that first — both because it’s quicker and easier to try then refactoring the bikeshed makefile target is, and also because it should give a more-significant reduction in the overall resource usage (though not in the overall build time).

@sideshowbarker sideshowbarker force-pushed the sideshowbarker/enable-w3c-autopublishing branch from 9d2d273 to 9b4e7b7 Compare February 18, 2024 04:34
@sideshowbarker sideshowbarker force-pushed the sideshowbarker/enable-w3c-autopublishing branch 3 times, most recently from 4673795 to ddb1751 Compare February 18, 2024 04:55
@sideshowbarker
Copy link
Collaborator Author

But the bigger savings would probably come from making changes to the workflows/ci-spec.yml file, to factor out the installing-the-prerequisite-software steps — Checkout repo, Setup Node.js, Setup Bikeshed, and Setup Sphinx —steps into their own prep job. And then we make the build-core-spec, build-js-api-spec, build-web-api-spec, and publish-to-w3c-TR jobs all require/depend on that prep job, so that it runs first, just once.

OK, I tried that in ddb1751 but quickly discovered it doesn’t work — and won’t work. And for the record here, the reason is: each job runs on a different, new machine/runner. So there’s no way to preserve shared state/environment among jobs.

Apparently this is a fundamental characteristic of GitHub Actions that people generally know or are expected to know…

@sideshowbarker sideshowbarker force-pushed the sideshowbarker/enable-w3c-autopublishing branch from 2316c2a to f1f57b6 Compare February 18, 2024 05:47
@sideshowbarker
Copy link
Collaborator Author

sideshowbarker commented Feb 18, 2024

I guess we could factor the Sphinx part of that bikeshed target into its own target — in which case, we could make the build run that part just once, only before the first time we run the Bikeshed part — and not re-run it a second time.

So, given that for CI we can’t share any environment among jobs, that makefile change wouldn’t actually buy us anything useful for reducing CI resource usage. It’d instead just help someone locally running the W3C-echidna target. And in practice I’m the only person who’d ever being doing that. And so, seems to be no real value to doing that refactoring.

Copy link
Member

@rossberg rossberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the biggest time & resource sink in CI is setting up the prerequisite software. But you can't share that across actions, you can only share artefacts via "upload". That's why I suggested just moving the new build steps into the existing build jobs and upload the results for the publication job, like with publish-spec.

But don't worry about it, I can have a stab at refactoring that later.

FWIW, I'd actually be fine just having the WD version on the github.io page. In fact, that's probably less confusing to users than seeing two differently looking documents on both sites and having to wonder how they relate to each other.

(Btw, please never force-push to a branch under review, as that breaks GH's review process, e.g., reviewers won't be be able to see relative diffs, code comments become orphaned, etc.)

@sideshowbarker
Copy link
Collaborator Author

Yes, the biggest time & resource sink in CI is setting up the prerequisite software. But you can't share that across actions, you can only share artefacts via "upload".

Yeah, I’d vaguely recalled that being the case, but it didn’t sink back in until today when I tried the ci-spec.yml refactoring.

That's why I suggested just moving the new build steps into the existing build jobs and upload the results for the publication job, like with publish-spec.

But don't worry about it, I can have a stab at refactoring that later.

OK. When you try it, at https://github.com/w3c/echidna/wiki/How-to-use-Echidna#tar-file you can find the current documentation on how the W3C publishing HTTP API works. But essentially it’s just going to be this:

curl 'https://labs.w3.org/echidna/api/request' -F "tar=@WD.tar" -F "token=<token>" -F "decision=<URL>"

…where:

  • WD.tar (or whatever name) is a tar archive that per https://github.com/w3c/echidna/wiki/Preparing-your-document#tar-file contains a single file named Overview.html, which is the Bikeshed output of either the Core, JS API, or Web API spec.
  • <token> is the value of the corresponding repository secret named W3C_ECHIDNA_TOKEN_CORE, W3C_ECHIDNA_TOKEN_JSAPI, or W3C_ECHIDNA_TOKEN_CORE already configured for this repo.
  • <URL> is https://github.com/WebAssembly/meetings/blob/main/main/2017/WG-12-06.md
  • The body of the HTTP response to that is a text/plain single line giving the unique ID the server assigned to the request.
  • https://labs.w3.org/echidna/api/status?id=<ID> with the given ID responds with success/failure JSON data+details.

So, I assume what the refactoring you have in mind would need to amount to is: the CI config would be changed to construct the necessary tar files (WD.tar or whatever) from the same spec artifacts already being downloaded by the publish-spec job, and then the calls to the API endpoint get made with those tarred-from-downloaded-artifacts tar files.

FWIW, I'd actually be fine just having the WD version on the github.io page.

OK — I guess the first step for that would just be changing the existing .bs files to have Status: WD.

In fact, that's probably less confusing to users than seeing two differently looking documents on both sites and having to wonder how they relate to each other.

Yeah, that’s how it seems to me as well.

Btw, please never force-push to a branch under review

OK

Copy link
Collaborator

@Ms2ger Ms2ger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like a lot of duplicated code in the makefiles, but let's get this to work first and refactor later if needed

@rossberg
Copy link
Member

@sideshowbarker, is there anything missing for merging this?

@sideshowbarker
Copy link
Collaborator Author

@sideshowbarker, is there anything missing for merging this?

Nope, nothing missing — it’s ready to be merged

@rossberg rossberg merged commit 89f7ced into main Feb 28, 2024
5 checks passed
@Ms2ger Ms2ger deleted the sideshowbarker/enable-w3c-autopublishing branch February 28, 2024 16:56
@rossberg
Copy link
Member

rossberg commented Mar 4, 2024

@sideshowbarker, today CI failed on the publish-to-w3c target for an unrelated PR:

https://github.com/WebAssembly/spec/actions/runs/8133065435/job/22233472624?pr=1737

Do you have an idea what the problem could be?

@sideshowbarker
Copy link
Collaborator Author

https://github.com/WebAssembly/spec/actions/runs/8133065435/job/22233472624?pr=1737#step:7:3005 says:

Must provide W3C_ECHIDNA_TOKEN_CORE and DECISION_URL environment variables

…but at https://github.com/WebAssembly/spec/blob/main/.github/workflows/ci-spec.yml#L124 , the CI build sets W3C_ECHIDNA_TOKEN_CORE — and in the makefile here:

DECISION_URL = https://github.com/WebAssembly/meetings/blob/main/main/2017/WG-12-06.md

DECISION_URL is hardcoded.

And this was working on the PR branch previously, so it still should be.

But I’ll investigate it further, and see what could be causing it to not find what it needs.

@sideshowbarker
Copy link
Collaborator Author

OK, first clue: Looking at https://pipelinesghubeus4.actions.githubusercontent.com/k4KMVVeRi9EEw2vuQiOaD2iDjWSjf8j5KDSnrjfdb62mXBfQf1/_apis/pipelines/1/runs/1014/signedlogcontent/2?urlExpires=2024-03-04T12%3A23%3A05.0263402Z&urlSigningMethod=HMACV1&urlSignature=UGaJewwi8Aq1yCtlo9EA1LTL5QbS246pUMl61Bjn2RI%3D raw log for the #1730 PR branch before we merged, I see:

2024-02-19T06:32:52.0833801Z env:
2024-02-19T06:32:52.0834120Z   STATUS: --md-status=WD
2024-02-19T06:32:52.0834795Z   W3C_ECHIDNA_TOKEN_CORE: ***
2024-02-19T06:32:52.0835312Z   W3C_ECHIDNA_TOKEN_JSAPI: ***
2024-02-19T06:32:52.0835787Z   W3C_ECHIDNA_TOKEN_WEBAPI: ***
2024-02-19T06:32:52.0836301Z   YARN_ENABLE_IMMUTABLE_INSTALLS: false

…that is, W3C_ECHIDNA_TOKEN_CORE: *** — three asterisks for the (secret) value.

But looking at the https://productionresultssa7.blob.core.windows.net/actions-results/30af3ce5-e11e-4563-9de1-5e3353d34795/workflow-job-run-fc640d00-b954-5c7a-fd14-93d46735bd61/logs/job/job-logs.txt?rsct=text%2Fplain&se=2024-03-04T12%3A35%3A20Z&sig=ukrXt2zBkNU6erLQB9OnzbzDiyLKl9l1APnloNKODeo%3D&sp=r&spr=https&sr=b&st=2024-03-04T12%3A25%3A15Z&sv=2021-12-02 raw log for the #1737 PR branch, I instead see:

2024-03-04T06:56:22.6245453Z env:
2024-03-04T06:56:22.6245790Z   STATUS: --md-status=WD
2024-03-04T06:56:22.6246276Z   W3C_ECHIDNA_TOKEN_CORE: 
2024-03-04T06:56:22.6246642Z   W3C_ECHIDNA_TOKEN_JSAPI: 
2024-03-04T06:56:22.6247056Z   W3C_ECHIDNA_TOKEN_WEBAPI: 
2024-03-04T06:56:22.6247539Z   YARN_ENABLE_IMMUTABLE_INSTALLS: false

…that is, no asterisks.

So, apparently GH Actions isn’t retrieving the W3C_ECHIDNA_TOKEN_CORE value as expected. But looking at the Repository secrets section of https://github.com/WebAssembly/spec/settings/secrets/actions, I can see that the W3C_ECHIDNA_TOKEN_CORE secret is still set. So it’s unclear why the CI build’s not finding it.

@sideshowbarker
Copy link
Collaborator Author

sideshowbarker commented Mar 4, 2024

OK, looking at https://github.com/WebAssembly/spec/settings/secrets/actions, I see these statements:

Anyone with collaborator access to this repository can use these secrets and variables for actions. They are not passed to workflows that are triggered by a pull request from a fork.

So, because the #1737 PR is from a fork, I guess it’s expected behavior that GH Actions for that PR won’t have access to the repository secrets.

If so, I think there’s no way to expose repo secrets to GH Actions for PRs from forks — it seems to be prevented by design.

Given that, I guess it means that we could only auto-publish to W3C TR for PRs from branches in the repo itself — which requires the authors to have commit/push perms for this repo — and for merges to the main branch.

@rossberg
Copy link
Member

rossberg commented Mar 4, 2024

Thanks for looking into it!

Now that you mention it, in fact PRs should probably never cause a push to the w3c repo, only actual pushes to the main branch should. That probably means that we'll have to split the publish-to-w3c job into a separate workflow file without the on: pull_request trigger. Would that fix the problem as well?

@sideshowbarker
Copy link
Collaborator Author

in fact PRs should probably never cause a push to the w3c repo, only actual pushes to the main branch should.

OK

That probably means that we'll have to split the publish-to-w3c job into a separate workflow file without the on: pull_request trigger. Would that fix the problem as well?

Yeah, it seems like it would

@sideshowbarker
Copy link
Collaborator Author

split the publish-to-w3c job into a separate workflow file without the on: pull_request trigger.

OK, opened #1738 for that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants