Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary null values from metadata #3626

Closed
wants to merge 1 commit into from

Conversation

tomgasson
Copy link
Contributor

Hey folks 👋

We at Atlassian have some bundle size tooling that was flagging a larger query contributing about 200KB to our page load. That tooling isn't compression aware, so the actual size is more like 9KB when gzipped so it's not too big of a concern.

But it got me looking at what contributes to that size, and what we can do to reduce it. I noticed that the majority of metadata is null values. Whilst these compress really well, they still contribute a lot of unnecessary extra bytes over the wire. Removing these nulls reduces our gzipped metadata size by about 1/3.

The code in relay-runtime uses loose equality, either as field.X != null, or field.X ?? default, or if (field.X) {, which means that we can safely remove these fields over the wire without runtime changes.

As you can see by the stats on this PR, this removes a lot of LOC. Even though each individual impact is small, the number of queries / fragments involved mean it will probably have a decent production impact. For you at Meta, with persisted queries, it should look even better than our 1/3 reduction.

This PR addresses the following fields:

  • "alias": null
  • "args": null
  • "concreteType": null
  • "storageKey": null
  • "abstractKey": null
  • "metadata": null
  • "plural": false. Most fields are non-plural, so we can use non-plural as a default

Rust compiler

In the rust compiler, this was quite easy to achieve. Instead of building objects as object(items: Vec<ObjectEntry>), I've replaced these with object_filter_none(items: Vec<Option<ObjectEntry>>) which then filters out None values. Replacing Primitive::Null values with a None just means inverting where the test happens.

JS compiler

In the JS compiler, I've currently maintained the field order so that the snapshot diffs are clear. But I've also currently scattered a lot of $FlowFixMes there. Just want to check the approach first.

The JS compiler uses the same types as the runtime. Those types are all currently read-only, exact objects. This makes it a little difficult to "build" the fields / nodes

const field = {
    alias: null // works with read-only
}
if (needs_alias){
    field.alias = alias // needs to be writable now
}

There's a few options here:

  • Move the read-only from field variance (+alias: ?string) to the utility $ReadOnly<>.
  • Move all the $ReadOnly annotations off the shared types and into the runtime (so the runtime can't write fields, but the shared types become writable for relay-compiler to use)
  • Use $Shape<> to build the objects in relay-compiler and then assert their output types after building them (don't let non-exact or writable types escape the functions in ReaderCodeGenerator, NormalizationCodeGenerator, but treat them as in-exact / writable within).
  • Use object spreads (as some fields like handleArgs already do) and maybe stableSort the snapshots so the fields output consitently

Questions

This is a large PR right now. I'll break this down into clearer steps in order to ship it, but I wanted feedback before I go ahead with that. Questions:

  • Are you happy to accept a contribution to remove these nulls from the metadata sent to the client?
  • Which fields does this make sense for? I.e. plural might be stretching it a bit far.
  • Would you like me to feature flag this?
  • Any suggestions on how you'd like the Normalization* and Reader* types to look if I change them to be more amenable to allowing field writing in the JS compiler.

@josephsavona
Copy link
Contributor

josephsavona commented Nov 8, 2021

Thanks for the PR! We have also considered this as a potential optimization. The reason we haven't shipped this yet though is a hunch that this will not play well with JavaScript engines, which depend quite a bit on hidden classes for performance. The current output ensures that each node kind has a stable set of fields and field ordering, which means that each "instance" of a given node kind will (should) always get the same hidden class. This in turn allows the engine to more effectively monomorphize (specialize) code consuming those objects, ie Relay Runtime). As I mentioned though this is definitely a hunch - we don't currently have an easy way to test the performance impact of changes to the AST format at a scale representative of our production workloads. We're currently in the middle of building out a benchmark suite to help answer questions like this though, and we will definitely look at the perf impact of this change.

In terms of next steps: we're supportive of adding an off-by-default option to the compiler that enables omitting null fields. We are not actively using the JavaScript compiler, so our recommendation would be to make this change only in the Rust compiler, which we are hoping to make a supported option in OSS relatively soon. Thoughts?

@josephsavona
Copy link
Contributor

josephsavona commented Nov 8, 2021

In terms of next steps on implementation:

  • Revert changes to the JS compiler (or move them to a later PR, lets start with Rust)
  • Make this a compiler feature flag that defaults to off. Note that this will allow you to revert tests, since by default none of them should be getting the new behavior.
  • Add a new test that enables the feature, with a small selection of fixtures.

This should dramatically reduce the size of the PR :-)

@alunyov
Copy link
Contributor

alunyov commented Nov 8, 2021

RE: JS Changes. I think we can just ignore JS compiler at this point. The release of Rust-based version is coming soon: #3180

@tomgasson tomgasson force-pushed the minify branch 3 times, most recently from 463ae15 to e2681c4 Compare November 8, 2021 23:56
@tomgasson
Copy link
Contributor Author

Thanks for the feedback.

I've updated this PR to only make this change to the rust compiler. Feature flagged under enable_reduced_metadata.

I agree this might be a bytesize vs runtime tradeoff - interested to see the result.

Nice to hear the new compiler is getting close to release. Some small contribution meta:

  • It's not clear what rust version I should be on (so my Cargo.lock changes are probably not correct here)
  • It's not clear if I should fmt files as I touch them
  • realpath doesn't exist on macos. I had to brew install coreutils which might want to be documented, but could alternatively change scripts/compile-tests.sh to not rely on it

@alunyov
Copy link
Contributor

alunyov commented Nov 9, 2021

Thanks for updates to the PR @tomgasson and for contributors feedback!

I think we'll need to figure out some of these steps (as we're don't really know some of them): correct rust version, fmt, running tests, etc. I've a task for that to update the contributors guide to reflect some of these steps: #3630

@facebook-github-bot
Copy link
Contributor

@alunyov has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

) -> String {
Printer::without_dedupe(js_module_format).print_operation(schema, operation)
Printer::without_dedupe(js_module_format).print_operation(schema, operation, enable_reduced_metadata)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add this flag to the Printer itself, so we do not pass it around all methods?

@@ -436,7 +437,7 @@ fn generate_split_operation(
&mut content,
"node",
"NormalizationSplitOperation",
&printer.print_operation(schema, normalization_operation),
&printer.print_operation(schema, normalization_operation, &project_config.feature_flags.enable_reduced_metadata),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can create an instance printer with the enable_reduced_metadata option, so we don't have to pass it around?

@alunyov
Copy link
Contributor

alunyov commented Dec 17, 2021

Hey @tomgasson sorry for delay with this. So, I could not land this internally, some tests were failing, and once I start looking into these we have a big refactoring with the intern package.

I think we can still add this PR, can you try rebasing it?

@tomgasson tomgasson force-pushed the minify branch 2 times, most recently from 287cfd9 to 9de3226 Compare February 1, 2022 01:57
@tomgasson
Copy link
Contributor Author

tomgasson commented Feb 1, 2022

Sorry for the delay on coming back to this. Had to complete our move to the new compiler first.

I've taken a slightly different approach now that a few things have changed like the config (and therefore flags) being easier to access.

  • Add a feature flag skip_printing_nulls
  • Added a Primitive::NullOrNothing type to indicate that the value should be removed rather than output null
  • Filter out NullOrNothing values from arrays and objects when printing and the feature is enabled (otherwise continues to output null as before)
  • Added a snapshot test of this feature
  • Updated flow types to allow for this change (all of the form field: ?Type => field?: ?Type)

@@ -176,9 +176,7 @@ return {
"params": {
"cacheID": "c13dedba1d0c8b17a192b735d141de99",
"id": null,
"metadata": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, why this has changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scripts/compile-tests.sh was broken on main for me.

I added "enable_provided_variables": {"kind": "enabled"} to fix, and running it generated this change.

(maybe I was doing something wrong?)

Copy link
Contributor

@alunyov alunyov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the follow up.

Can we figure out what happened with test artifacts? Is it something we need to fix separately? I don't think changes to metadata in generated artifacts are expected here.

The fragment & query metadata contains many null values. This change
adds a feature flag (`skip_printing_nulls`) to allow the rust
compiler to avoid outputting those fields. The runtime will still
operate on them, due to it's pervasive use of loose equality for null /
undefined checks.
@tomgasson
Copy link
Contributor Author

Fixed nits, and removed the commit which allowed scripts/compile-tests.sh to work on master (I needed it to manually check this worked, but it's not related to this change)

@facebook-github-bot
Copy link
Contributor

@alunyov has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@josephsavona
Copy link
Contributor

@tomgasson Thank you for the contribution! I'm really excited that we can provide this as an option.

tomgasson added a commit to tomgasson/relay that referenced this pull request Feb 3, 2022
In `skip_printing_nulls` (facebook#3626) we change some `null` values to be undefined by skipping them in the output.

Generally, the js code uses loose equality, either as
 * `field.X != null`
 * `field.X ?? default`
 * `if (field.X) {`

which allows us to do this safely.

The comparison this commit fixes was a rare exception to that showing up when using `@required`, mocking data and `skip_printing_nulls` is enabled.
facebook-github-bot pushed a commit that referenced this pull request Feb 3, 2022
Summary:
In `skip_printing_nulls` (#3626) we change some `null` values to be undefined by skipping them in the output.

Generally, the js code uses loose equality, either as
 * `field.X != null`
 * `field.X ?? default`
 * `if (field.X) {`

which allows us to do this safely.

The comparison this commit fixes was a rare exception to that showing up when using `required`, mocking data and `skip_printing_nulls` is enabled.

Pull Request resolved: #3784

Reviewed By: captbaritone

Differential Revision: D33965847

Pulled By: alunyov

fbshipit-source-id: 98f414e4f90738fcb68c940d6ce7208734c49297
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants