Development: use wrangler locally (update NGINX/Dockerfile config) #10965

Merged 8 commits on Feb 13, 2024
2 changes: 1 addition & 1 deletion common
2 changes: 2 additions & 0 deletions dockerfiles/Dockerfile.wrangler
@@ -0,0 +1,2 @@
FROM node:18.15
RUN npm install -g wrangler@3.21.0
219 changes: 219 additions & 0 deletions dockerfiles/force-readthedocs-addons.js
@@ -0,0 +1,219 @@
/*

Script to inject the new Addons implementation on pages served by El Proxito.

This script runs on a Cloudflare Worker and modifies the HTML for two different purposes:

1. remove the old implementation of our flyout (``readthedocs-doc-embed.js`` and others)
2. inject the new addons implementation (``readthedocs-addons.js``) script

Currently, we do 1) only when users opt in to the new beta addons.
In the future, when our addons become stable, we will always remove the old implementation,
making all projects use the addons by default.

*/

// add "readthedocs-addons.js" inside the "<head>"
const addonsJs =
  '<script async type="text/javascript" src="/_/static/javascript/readthedocs-addons.js"></script>';
Comment on lines +17 to +18
Member Author

If we want to test an addon we are developing, we can just change this URL to `http://localhost:8000/readthedocs-addons.js` while `npm run dev` is running and refresh the page (there is no need to restart Docker or anything) 💯

Contributor

Note that there is an `inv docker.up -w` flag that enables webpack hosting. This would be a good pattern to follow eventually. Making this a manual operation means it won't be repeatable or easy to use.

The development workflow with Webpack dev server hosting these assets is really good and should probably be the default for the core team.

Member Author

Yes. I'm happy to create an issue to track this work. I don't think we have a pattern for "given an environment variable make a decision on an asset file". We will need to change the content of this file in particular, so, it may require some research.
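
One possible shape for the "given an environment variable, make a decision on an asset file" pattern discussed above, as a minimal sketch only. The variable name `ADDONS_DEV_URL` and the helper `buildAddonsScriptTag` are hypothetical and not part of this PR; how the value would reach the worker is left open.

```javascript
// Default asset path, as used by the worker in this PR.
const DEFAULT_ADDONS_URL = "/_/static/javascript/readthedocs-addons.js";

// Hypothetical helper: build the <script> tag from an optional override.
// `env.ADDONS_DEV_URL` is an assumed variable name, not an existing setting.
function buildAddonsScriptTag(env = {}) {
  const src = env.ADDONS_DEV_URL || DEFAULT_ADDONS_URL;
  return `<script async type="text/javascript" src="${src}"></script>`;
}
```

With no override the tag points at the proxied asset; with the override set to the Webpack dev server URL mentioned earlier in this thread, the same code path would serve the development build.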



// selectors we want to remove
// https://developers.cloudflare.com/workers/runtime-apis/html-rewriter/#selectors
const analyticsJs =
  'script[src="/_/static/javascript/readthedocs-analytics.js"]';
const docEmbedCss = 'link[href="/_/static/css/readthedocs-doc-embed.css"]';
const docEmbedJs =
  'script[src="/_/static/javascript/readthedocs-doc-embed.js"]';
const analyticsJsAssets =
  'script[src="https://assets.readthedocs.org/static/javascript/readthedocs-analytics.js"]';
const docEmbedCssAssets =
  'link[href="https://assets.readthedocs.org/static/css/readthedocs-doc-embed.css"]';
const docEmbedJsAssets =
  'script[src="https://assets.readthedocs.org/static/javascript/readthedocs-doc-embed.js"]';
const docEmbedJsAssetsCore =
  'script[src="https://assets.readthedocs.org/static/core/js/readthedocs-doc-embed.js"]';
const badgeOnlyCssAssets =
  'link[href="https://assets.readthedocs.org/static/css/badge_only.css"]';
const badgeOnlyCssAssetsProxied = 'link[href="/_/static/css/badge_only.css"]';
const readthedocsExternalVersionWarning = "[role=main] > div:first-child > div:first-child.admonition.warning";
const readthedocsFlyout = "div.rst-versions";

// "readthedocsDataParse" is the "<script>" that calls:
//
// READTHEDOCS_DATA = JSON.parse(document.getElementById('READTHEDOCS_DATA').innerHTML);
//
const readthedocsDataParse = "script[id=READTHEDOCS_DATA]:first-of-type";
const readthedocsData = "script[id=READTHEDOCS_DATA]";

// do this on a fetch
addEventListener("fetch", (event) => {
  const request = event.request;
  event.respondWith(handleRequest(request));
});

async function handleRequest(request) {
  // perform the original request
  let originalResponse = await fetch(request);

  // get the content type of the response to manipulate the content only if it's HTML
  const contentType = originalResponse.headers.get("content-type") || "";
  const injectHostingIntegrations =
    originalResponse.headers.get("x-rtd-hosting-integrations") || "false";
  const forceAddons =
    originalResponse.headers.get("x-rtd-force-addons") || "false";

  // Log some debugging data
  console.log(`ContentType: ${contentType}`);
  console.log(`X-RTD-Force-Addons: ${forceAddons}`);
  console.log(`X-RTD-Hosting-Integrations: ${injectHostingIntegrations}`);

  // get project/version slug from headers injected by El Proxito
  const projectSlug = originalResponse.headers.get("x-rtd-project") || "";
  const versionSlug = originalResponse.headers.get("x-rtd-version") || "";

  // check to decide whether or not to inject the new beta addons:
  //
  // - content type has to be "text/html"
  //
  // when all these conditions are met, we remove all the old JS/CSS files and inject the new beta flyout JS

  // check if the Content-Type is HTML, otherwise do nothing
  if (contentType.includes("text/html")) {
    // Remove old implementation of our flyout and inject the new addons if the following conditions are met:
    //
    // - header `X-RTD-Force-Addons` is present (user opted in to the new beta addons)
    // - header `X-RTD-Hosting-Integrations` is not present (added automatically when using `build.commands`)
    //
    if (forceAddons === "true" && injectHostingIntegrations === "false") {
      return (
        new HTMLRewriter()
          .on(analyticsJs, new removeElement())
          .on(docEmbedCss, new removeElement())
          .on(docEmbedJs, new removeElement())
          .on(analyticsJsAssets, new removeElement())
          .on(docEmbedCssAssets, new removeElement())
          .on(docEmbedJsAssets, new removeElement())
          .on(docEmbedJsAssetsCore, new removeElement())
          .on(badgeOnlyCssAssets, new removeElement())
          .on(badgeOnlyCssAssetsProxied, new removeElement())
          .on(readthedocsExternalVersionWarning, new removeElement())
          .on(readthedocsFlyout, new removeElement())
          // NOTE: I wasn't able to reliably remove the "<script>" that parses
          // the "READTHEDOCS_DATA" defined previously, so we are keeping it for now.
          //
          // .on(readthedocsDataParse, new removeElement())
          // .on(readthedocsData, new removeElement())
          .on("head", new addPreloads())
          .on("head", new addProjectVersionSlug(projectSlug, versionSlug))
          .transform(originalResponse)
      );
    }

    // Inject the new addons (without removing anything) if the following conditions are met:
    //
    // - header `X-RTD-Hosting-Integrations` is present (added automatically when using `build.commands`)
    // - header `X-RTD-Force-Addons` is not present (user has not opted in to the new beta addons)
    //
    if (forceAddons === "false" && injectHostingIntegrations === "true") {
      return new HTMLRewriter()
        .on("head", new addPreloads())
        .on("head", new addProjectVersionSlug(projectSlug, versionSlug))
        .transform(originalResponse);
    }
  }

  // Modify `_static/searchtools.js` to re-enable Sphinx's default search
  if (
    (contentType.includes("text/javascript") ||
      contentType.includes("application/javascript")) &&
    (injectHostingIntegrations === "true" || forceAddons === "true") &&
    originalResponse.url.endsWith("_static/searchtools.js")
  ) {
    console.log("Modifying _static/searchtools.js");
    return handleSearchToolsJSRequest(originalResponse);
  }

  // if none of the previous conditions are met,
  // we return the response without modifying it
  return originalResponse;
}
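
The branching in `handleRequest` can be easier to follow (and to unit test outside a Worker runtime) when restated as a pure function. The following is a sketch under that assumption; `decideAction` and its return labels are hypothetical names that do not exist in this PR.

```javascript
// Hypothetical pure function mirroring the branching in handleRequest.
// Header values are the strings "true" / "false", as in the worker.
function decideAction({ contentType, forceAddons, hostingIntegrations, url }) {
  if (contentType.includes("text/html")) {
    if (forceAddons === "true" && hostingIntegrations === "false") {
      // remove the old flyout assets and inject the addons
      return "replace-old-and-inject";
    }
    if (forceAddons === "false" && hostingIntegrations === "true") {
      // only inject the addons
      return "inject-only";
    }
  }

  const isJavaScript =
    contentType.includes("text/javascript") ||
    contentType.includes("application/javascript");
  if (
    isJavaScript &&
    (hostingIntegrations === "true" || forceAddons === "true") &&
    url.endsWith("_static/searchtools.js")
  ) {
    return "patch-searchtools";
  }

  // everything else is returned unmodified
  return "passthrough";
}
```

Written this way, one property of the original logic becomes visible: an HTML response carrying both headers set to `"true"` matches neither HTML branch and is passed through unmodified.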

class removeElement {
  element(element) {
    console.log("Removing: " + element.tagName);
    console.log("Attribute href: " + element.getAttribute("href"));
    console.log("Attribute src: " + element.getAttribute("src"));
    console.log("Attribute id: " + element.getAttribute("id"));
    console.log("Attribute class: " + element.getAttribute("class"));
    element.remove();
  }
}

class addPreloads {
  element(element) {
    console.log("addPreloads");
    element.append(addonsJs, { html: true });
  }
}

class addProjectVersionSlug {
  constructor(projectSlug, versionSlug) {
    this.projectSlug = projectSlug;
    this.versionSlug = versionSlug;
  }

  element(element) {
    console.log(
      `addProjectVersionSlug. projectSlug=${this.projectSlug} versionSlug=${this.versionSlug}`,
    );
    if (this.projectSlug && this.versionSlug) {
      const metaProject = `<meta name="readthedocs-project-slug" content="${this.projectSlug}" />`;
      const metaVersion = `<meta name="readthedocs-version-slug" content="${this.versionSlug}" />`;

      element.append(metaProject, { html: true });
      element.append(metaVersion, { html: true });
    }
  }
}
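
The slugs above are interpolated into HTML as-is. Slugs are normally restricted to safe characters, so this is likely fine in practice; the sketch below only shows what a defensive variant could look like if that assumption ever changed. Both `escapeAttribute` and `buildSlugMetaTags` are hypothetical helpers, not part of this PR.

```javascript
// Hypothetical helper: escape characters that are special inside a
// double-quoted HTML attribute value.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: build the two <meta> tags, or nothing when either
// slug is missing (same guard as addProjectVersionSlug).
function buildSlugMetaTags(projectSlug, versionSlug) {
  if (!projectSlug || !versionSlug) {
    return [];
  }
  return [
    `<meta name="readthedocs-project-slug" content="${escapeAttribute(projectSlug)}" />`,
    `<meta name="readthedocs-version-slug" content="${escapeAttribute(versionSlug)}" />`,
  ];
}
```

The returned strings could then be passed to `element.append(tag, { html: true })` exactly as the class above does.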

/*

Script to revert the old removal of the Sphinx search initialization.

Enabling addons breaks the default Sphinx search in old versions that cannot be rebuilt.
This is because we solved the problem in the `readthedocs-sphinx-ext` extension,
but since those versions can't be rebuilt, the fix does not apply there.

To solve the problem in these old versions, we are using a CF worker to apply that fix on the fly
at serving time.

The fix basically replaces a Read the Docs comment in file `_static/searchtools.js`,
introduced by `readthedocs-sphinx-ext` to _disable the initialization of Sphinx search_,
with the real JavaScript to initialize the search, as Sphinx does by default.
(in other words, it _reverts_ the manipulation done by `readthedocs-sphinx-ext`)

*/

const textToReplace = `/* Search initialization removed for Read the Docs */`;
const textReplacement = `
/* Search initialization manipulated by Read the Docs using Cloudflare Workers */
/* See https://github.com/readthedocs/addons/issues/219 for more information */

function initializeSearch() {
Search.init();
}

if (document.readyState !== "loading") {
initializeSearch();
}
else {
document.addEventListener("DOMContentLoaded", initializeSearch);
}
`;

async function handleSearchToolsJSRequest(originalResponse) {
  const content = await originalResponse.text();
  const modifiedResponse = new Response(
    content.replace(textToReplace, textReplacement),
  );
  return modifiedResponse;
}
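
Note that a `Response` built from a string with no init object gets default headers, so the upstream `Content-Type` is not carried over. Whether that matters for local development is a judgment call. The sketch below shows one way to keep the upstream status and headers; `rewriteSearchTools` is a hypothetical name, and the `marker` and `replacement` values are shortened stand-ins for `textToReplace` and `textReplacement`.

```javascript
// Shortened stand-ins for textToReplace / textReplacement (assumptions).
const marker = "/* Search initialization removed for Read the Docs */";
const replacement = "Search.init();";

// Hypothetical variant of handleSearchToolsJSRequest that copies the
// status and headers of the upstream response onto the rewritten one.
async function rewriteSearchTools(originalResponse) {
  const content = await originalResponse.text();
  const headers = new Headers(originalResponse.headers);
  // The body length changes, so a stale Content-Length must not be forwarded.
  headers.delete("content-length");
  return new Response(content.replace(marker, replacement), {
    status: originalResponse.status,
    statusText: originalResponse.statusText,
    headers,
  });
}
```

This relies only on the standard Fetch API `Response` and `Headers` constructors, which are available both in the Workers runtime and in recent Node.js versions.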
131 changes: 131 additions & 0 deletions dockerfiles/nginx/README.rst
@@ -0,0 +1,131 @@
How NGINX proxy works
=====================

Read the Docs uses 3 different NGINX configurations:

web
  This configuration is in charge of serving the dashboard application
  on the ``$NGINX_WEB_SERVER_NAME`` domain.
  It listens on port ``80`` and proxies requests to the ``web`` container on port ``8000``,
  where the Django application is running.

  It also proxies asset files under ``/static/`` to the ``storage`` container
  on port ``9000``, which runs MinIO (an S3 emulator).

proxito
  Its main goal is to serve documentation pages and handle 404s.
  It proxies all requests to the ``proxito`` container on port ``8000``,
  where the "El Proxito" Django application is running.
  This application returns a small response with the special ``X-Accel-Redirect`` HTTP header
  pointing to a MinIO (S3 emulator) path, which NGINX then uses to proxy the request to MinIO.

  In addition, the response from El Proxito contains several HTTP headers
  that NGINX adds to the MinIO response, so that they end up in the final
  response delivered to the user.

  It also configures a 404 fallback that hits an internal URL on the
  Django application to handle them correctly
  (redirects and custom 404 pages, among others).

  Finally, there are two special URLs configured to proxy the JavaScript files
  required for Read the Docs Addons and serve them directly from a GitHub tag.

  Note that this server is not exposed *outside* Docker's internal network
  and is accessed only via Wrangler. Keep reading to understand how it's connected.

wrangler
  Node.js implementation of a Cloudflare Worker that sits in front of "El Proxito".
  It listens on the ``$NGINX_PROXITO_SERVER_NAME`` domain and executes the worker
  ``force-readthedocs-addons.js``.

  This worker hits the ``proxito`` NGINX server listening on the ``nginx`` container
  on port ``8080`` to fetch the "original response" and manipulates it to
  inject the extra HTML tags required for Read the Docs Addons (``meta`` and ``script``).



ASCII-art explanation
---------------------

.. I used: https://asciiflow.com/


Documentation page on ``$NGINX_PROXITO_SERVER_NAME``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. text::

                ┌────────────────  User
                │                    ▲
      (documentation pages)          │
                │                    │
                │                    │
                ▼ 80                 │
        ┌────────────────┐           │
        │                │           │
        │                │           │
        │                │           │
        │    wrangler    │ ──────────┘
        │                │
        │                │
        │                │
        └──────┬─────────┘               ┌──────────────┐        ┌────────────────┐
               │   ▲                     │              │        │                │
               │   │                     │              │    9000│                │
               │   └──────────────────── │              ├───────►│                │
               │                         │    NGINX     │        │   MinIO (S3)   │
               └───────────────────────► │              │◄───────┤                │
                                    8080 │              │        │                │
                                         │              │        │                │
                                         └─────┬────────┘        └────────────────┘
                                               │  ▲
                                               │  │
                                               │  │
                                               │  │
                                          8000 ▼  │
                                         ┌────────┴─────┐
                                         │              │
                                         │              │
                                         │  El Proxito  │
                                         │              │
                                         │              │
                                         └──────────────┘


Documentation page on ``$NGINX_WEB_SERVER_NAME``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


.. text::

         User

     (dashboard)
          ▼ 80
    ┌──────────────┐        ┌────────────────┐
    │              │        │                │
    │              │    9000│                │
    │              ├───────►│                │
    │    NGINX     │        │   MinIO (S3)   │
    │              │◄───────┤                │
    │              │        │                │
    │              │        │                │
    └─────┬────────┘        └────────────────┘
          │  ▲
          │  │
          │  │
          │  │
     8000 ▼  │
    ┌────────┴─────┐
    │              │
    │              │
    │     web      │
    │              │
    │              │
    └──────────────┘