Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid unnecessary updates #2869

Closed
MangelMaxime opened this issue Apr 28, 2022 · 28 comments · Fixed by #2941 or #2945
Closed

Avoid unnecessary updates #2869

MangelMaxime opened this issue Apr 28, 2022 · 28 comments · Fixed by #2941 or #2945

Comments

@MangelMaxime
Copy link
Member

MangelMaxime commented Apr 28, 2022

UPDATE: This is a good first issue for contributors as it doesn't require specific knowledge about Fable code. See the steps necessary to fix in this comment below.

This is a behaviour I detected after reporting #2868

Description

Let say that in my project I have the following files:

  • CssModules.fs
  • App.fs
  • Login.fs
  • Navbar.fs

and CssModules.fs is referenced in all the other files.

When CssModules.fs change Fable is recompiling all the files that depends on it and rewrite them on the disk even if nothing changed in the output of each of these files.

This cause a lot unnecessary "hot update" to be triggered by the bundlers. I think Fable should only write a file on the disk if the output of the compilation for that file changed since the last compilation.

Related information

  • Fable version: 3.7.5
  • Operating system: Windows
@alfonsogarciacaro
Copy link
Member

Makes sense. Funny enough, I had similar issues when working with hot update for Fable.Lit but tried to fix it in a more complicated way using the triggeredByDependency flag.

However, we need to be a bit careful with this to avoid performance hits. The Fable printer is designed to write directly into the file stream so I assume we need to:

  • Replace the file stream with a memory stream
  • Write a function that uses a buffer to compare the memory stream with the old file stream without having to fully load it in memory. I was looking for ways to do this and found this nice reply from Tomas Petricek, although unfortunately there's no code 😅 https://stackoverflow.com/a/2978737
  • This is only needed for watch compilations when there's already an existing file. In other cases we should write directly to the file as we do it now.

@ncave
Copy link
Collaborator

ncave commented Apr 29, 2022

@alfonsogarciacaro

Replace the file stream with a memory stream

+1

Write a function that uses a buffer to compare the memory stream with the old file stream without having to fully load it in memory.

IMO, as pointed in the link:

  1. handle zero-length memory streams.
  2. else, check if file exists.
  3. else, compare memory stream length and file length (before file loading).
  4. else (optional) compare memory stream hashcode with a file signature (previously stored when generating the file).
  5. else, compare memory and file streams buffer by buffer.
  6. If not the same, update the file (and optionally the file signature).

This is only needed for watch compilations when there's already an existing file. In other cases we should write directly to the file as we do it now.

IMO it would be easier to always use memory stream to simplify the file handling logic, instead of forking it.
We can just skip some of the steps above if there is no need to.

@alfonsogarciacaro
Copy link
Member

This makes sense @ncave. As you say, if we can do something optimized enough, it's probably better to replace the current method instead of having two ways of writing the file. Comparing the bytes length should resolve equality in 99% of cases. Will the file size correspond exactly with the byte length for the memory stream (not sure if there's any extra byte saved for the encoding depending on the system)? I was also thinking that it was better for parallelization to write asynchronously and directly to the file stream, but maybe it's more efficient to use a memory stream and flush the whole contents to file at once.

I'm currently focused on preparing a prerelease of snake_island, so it'd be great if someone else could give it a go at this 😸

@alfonsogarciacaro
Copy link
Member

In any case @MangelMaxime, do you think it would be possible to use the triggeredByDependency flag as in here. There are situations when the code will still change (e.g. when changing the signature of a member which updates the overload suffix) but you need to tell the HMR plugin to skip the hot update.

@MangelMaxime
Copy link
Member Author

@alfonsogarciacaro

In my case, this is not possible to use the triggeredByDependency flag because we are not writting any specific code to support HMR.

Indeed, in the project, that I am working on we decided to use Fable because we like to use F# to write the code but we made sure that the generated JavaScript would be the "same" as the one that we would write in JavaScript.

This has a huge benefit which is any JavaScript tools that exist, we can use it in our project without any hacks or bad surprise.

So for example, in the case of HMR with Vite and React-refresh, we do nothing to make it work. We just write our code with the same convention as the JavaScript one:

  • Hooks should start with use prefix
  • React component should start with a capital letter
  • Nothing else than ReactComponent can be exposed from a module/file which has ReactComponent in it
  • etc.

Personally, I feel like triggeredByDependency is a workaround / solution to a problem that is caused by Fable itself. So it would be cleaner / more elegant to have the root of the problem fixed instead of a workaround.

In theory, when writing JavaScript manually there is only one way to have mutiple save occuring at the same time which is if use your IDE feature "Save all", but I think in general people use Ctrl+S / Save (current file).

Personnally, before digging into this issue I was thinking that if I edit FileA.fs, Fable was only going to write FileA.fs.js.

Even, if the gain of performance is not huge this should still increase the performance or at least the feedback loop because writing to the disk is "slow", but also because then HMR would just need to update one file and not detecting and handling all the changes in browser etc.

@alfonsogarciacaro
Copy link
Member

As I'm still busy with Fable 4, it'd be great if some contributor could handle this. It's a good first issue because it doesn't require specific knowledge about Fable, only how to use streams in .NET. Basically we need to do two things:

  1. Replace the File stream with a MemoryStream here:
    let stream = new IO.StreamWriter(targetPath)
  2. Then, after running the printer, in this point we need to what @ncave listed above (I've edited the list to skip the signature for now for simplicity and because it seems that file length doesn't always correspond with text length):
    a. if memory stream is zero-length, skip.
    b. if file doesn't exist, write memory stream to disk
    c. if file exists, load it with stream (not ReadAllText) and compare memory and file streams buffer by buffer (maybe Span can help here)
    d. as soon as a difference is detected, update the file. If streams are the same, skip.

@hensou
Copy link
Contributor

hensou commented Jun 14, 2022

@alfonsogarciacaro, if nobody is working on this I would like to do it as my first issue.

In fact, I was already trying out yesterday and here is what I have so far:

  1. I've replaced the StreamWriter for MemoryStream
    a. At the _.Write(str) method, since MemoryStream write as bytes, I get the bytes of str with UTF8.GetBytes(str) and pass it on to WriteAsync (is that the expected way to do it?)

  2. Now for this part, I have encountered the following challenge:
    a. The stream (MemoryStream) is not visible at the compileFile method, should I exposed it as a member? I'm not sure this is a good solution.
    b. Even if we expose it as a member, after BabelPrinter.run executes, it effectively disposes the MemoryStream making it impossible to count the bytes and compare it at this point.

So that's what I have at the moment, could you give me some guidance on how to properly progress on the second step?

Also, feel free to tell me if you feel this might be to advanced for me at the moment.

@alfonsogarciacaro
Copy link
Member

Sorry for the late reply and thanks a lot for your help @hensou! If you open a PR it may be easier to review your current work as we can have a look at the code.

At the _.Write(str) method, since MemoryStream write as bytes, I get the bytes of str with UTF8.GetBytes(str) and pass it on to WriteAsync (is that the expected way to do it?)

Yes, that sounds about right👍

The stream (MemoryStream) is not visible at the compileFile method, should I exposed it as a member? I'm not sure this is a good solution. Even if we expose it as a member, after BabelPrinter.run executes, it effectively disposes the MemoryStream making it impossible to count the bytes and compare it at this point.

You can expose the MemoryStream from FileWriter or just a method like CompareWithFile or similar. And you're right, the stream shouldn't be disposed too early. If we remove the nested async the use statement will dispose it at the end of the compileFile method.

@hensou
Copy link
Contributor

hensou commented Jun 22, 2022

Thanks @alfonsogarciacaro, I'll work on that and submit a PR for review.

@inosik
Copy link
Contributor

inosik commented Jun 22, 2022

I've replaced the StreamWriter for MemoryStream

You could keep the StreamWriter but initialize it with a MemoryStream, instead of letting it open the FileStream internally. The StreamWriter would then take care of converting strings to bytes.

@hensou
Copy link
Contributor

hensou commented Jun 22, 2022

@inosik indeed, I think that would be a better thing to do. I'll go for that.

hensou added a commit to hensou/Fable that referenced this issue Jun 28, 2022
- Keep output in memory
- Only write to file if there is a change or file is new

Fix fable-compiler#2869
@hensou
Copy link
Contributor

hensou commented Jun 28, 2022

@alfonsogarciacaro, I think with PR #2941 I've managed to fix this issue.
Could you, or any other maintainer, review it?

alfonsogarciacaro pushed a commit that referenced this issue Jun 29, 2022
- Keep output in memory
- Only write to file if there is a change or file is new

Fix #2869
@alfonsogarciacaro
Copy link
Member

Fantastic work @hensou, thanks a lot for your help and welcome to the contributor's team! 👏 🏅 👏

@MangelMaxime This has been released in 3.7.15, please give it a try when you have a moment, and let us know if it's working for you.

@kerams
Copy link
Contributor

kerams commented Jun 29, 2022

Many thanks, confirming this greatly reduces the amount of updates (and screen flickering because of theme refreshes) editing a common files produces.

However, I still do get an extra update or two. Turns out it's because those files really are different, but the only change is:

vivaldi_c1M8fFnU7Y

@alfonsogarciacaro, I'm assuming these labels are non-deterministic because of parallel processing?

@alfonsogarciacaro
Copy link
Member

Thanks a lot for checking @kerams! Good to know it's (mostly) working for you. Interesting about activePatternResult, didn't realize that. I think this comes from the F# compiler because when Fable adds a numeric suffix to deduplicate identifiers it puts an underscore _ in between (e.g. matchValue_1) and Fable also resets the count for each file, so in principle it should be deterministic even with parallel processing.

It looks like the F# compiler has a counter for name deduplication that is not reset for recompilations, thus giving higher values each time. It's strange because other auto-generated identifiers like matchValue or patternInput are not de-duplicated.

@alfonsogarciacaro
Copy link
Member

Yes, it seems to be this: https://github.com/dotnet/fsharp/blob/d6aaefba78df55514178a5f49c44b44595a8f3ef/src/Compiler/Checking/PatternMatchCompilation.fs#L1279

I can think of a couple of solutions:

  1. Ask the F# team to provide a way to reset the counter for unique names and hope this gives us deterministic values (FCS type checking is not parallelized)
  2. On Fable side, detect when a compiler generated identifier ends with digits (I think we'll have a similar issue with automatically generated generic names if we want to output type annotations) and replace the digits with a deterministic number.

@hensou
Copy link
Contributor

hensou commented Jun 29, 2022

Fantastic work @hensou, thanks a lot for your help and welcome to the contributor's team! 👏 🏅 👏

@MangelMaxime This has been released in 3.7.15, please give it a try when you have a moment, and let us know if it's working for you.

Thanks @alfonsogarciacaro and everyone, I'm looking forward to contributing more 😄

@MangelMaxime
Copy link
Member Author

MangelMaxime commented Jun 29, 2022

@alfonsogarciacaro

For me Fable 3.7.15, still updates the files that depends on it.

Could it be related to the fact that CssModules.fs is a file that is used as a binding?

It contains stuff like that

//------------------------------------------------------------------------------
//        This code was generated by fable-css-modules.
//        Changes to this file will be lost when the code is regenerated.
//------------------------------------------------------------------------------

[<RequireQualifiedAccess>]
module internal CssModules

open Fable.Core

module Components =

    type Consent =

        /// <summary>
        /// Binding for <c>page-content</c> class
        /// </summary>
        [<Emit("$0[\"page-content\"]")>]
        abstract pageContent : string

        /// <summary>
        /// Binding for <c>label</c> class
        /// </summary>
        [<Emit("$0[\"label\"]")>]
        abstract label : string

Meaning that each file contains a line like:

let private classes : CssModules.Components.Consent = import "default" "./Consent.module.scss"

In order to access the binding definition.

I will look to make a reproduction repo

@MangelMaxime MangelMaxime reopened this Jun 29, 2022
@MangelMaxime
Copy link
Member Author

I tried to update a "normal" file and in that case only that files seems to be rewritten.

I don't remember if this was the case before or not.

@alfonsogarciacaro
Copy link
Member

@MangelMaxime Do you mean CssModules.fs is a dependent file and it's being updated (the last modified timestamp changes) even if there are no changes? Can you check if there's been an actual change like the name of a generated variable as in the case reported by @kerams?

There are several rules for how Fable builds the watch dependency tree. One way to test it is:

  1. File A declares a union type or a type with (potentially) overloaded instance/static members
  2. File B does pattern matching on the union type and/or call the overloaded instance/static member declared in File A
  3. Fable considers File B a watch dependency of File A because if the union tag (e.g. the order of union cases changes) or the signature of the overloaded method changes in File A, Fable needs to update File B too.
  4. Make a change in File A, Fable <=3.7.14 should update File B, Fable 3.7.15 should not.

@MangelMaxime
Copy link
Member Author

I had a hard time making a reproduction because when scaffolding a project from scratch I was not able to reproduce.

So I trimmed a lot of stuff from a production application and it seems to be reproducible even outside of my computer (tested on Gitpod).

If you look at the following picture, you can see that Consent.fs timestamp is updated and that Vite triggered some HMR call for this file.

image

The generated file Consent.fs.js doesn't seems to have any mangling or unique counter generated in it:

import Consent$002Emodule from "./Consent.module.css";
import { createElement } from "react";
import { createObj } from "../fable_modules/fable-library.3.7.15/Util.js";
import { Interop_reactApi } from "../fable_modules/Feliz.1.64.0/./Interop.fs.js";
import { ofArray } from "../fable_modules/fable-library.3.7.15/List.js";

import React from "react";

const classes = Consent$002Emodule;

export function Consent() {
    let elems;
    return createElement("div", createObj(ofArray([["className", classes["page-content"]], (elems = [createElement("div", {
        children: ["Consent component"],
    })], ["children", Interop_reactApi.Children.toArray(Array.from(elems))])])));
}

//# sourceMappingURL=Consent.fs.js.map

The generated Consent.fs.js.map also seems to be the same between generation. I explained in the README how to reproduce the bug but here is a video too trying to show it.

fable_repro_bug.mp4

You can find a reproduction repository here: https://github.com/MangelMaxime/fable-2869-reproduction

@hensou
Copy link
Contributor

hensou commented Jun 30, 2022

Thanks @MangelMaxime for the reproduction repo.

I can further investigate this later today, but I think this line (248) might be the issue.

Fable/src/Fable.Cli/Main.fs

Lines 241 to 251 in e54f2b1

use writer = new FileWriter(com.CurrentFile, outPath, cliArgs, pathResolver)
do! BabelPrinter.run writer babel
let! written = writer.WriteToFileIfChangedAsync()
if written && cliArgs.SourceMaps then
let mapPath = outPath + ".map"
do! IO.File.AppendAllLinesAsync(outPath, [$"//# sourceMappingURL={IO.Path.GetFileName(mapPath)}"]) |> Async.AwaitTask
use fs = IO.File.Open(mapPath, IO.FileMode.Create)
do! writer.SourceMap.SerializeAsync(fs) |> Async.AwaitTask

If we always append the sourceMap info after we already saved to the file, then when we compare memory stream with the file, they will always be different since the memory stream does not have // sourceMapping....

Does it make sense?

@MangelMaxime
Copy link
Member Author

@hensou I think you are onto something indeed.

Because, when scaffolding the reproduction from a template of mine I don't have the source maps enable.

@hensou
Copy link
Contributor

hensou commented Jun 30, 2022

@MangelMaxime yep, we can isolate the problem even more as if we remove the dependency from CssModules.fs, we can see that Consent.fs.js always write to disk.

I've made some changes to write the //# sourceMappingURL=... line to the memory stream as well and it stopped writing to disk as expected. I'll submit it for review.

@alfonsogarciacaro
Copy link
Member

I've pushed 3.7.16, this includes @hensou fix to prevent updating files with source maps when not necessary and also a fix to make compiler-generated variable names deterministic (basically by removing the numeric suffix added by FCS and default to Fable's deduplicate mechanism).

@MangelMaxime @kerams Could you please give it a try when you have a moment and confirm if this fixed your issues? Thanks a lot!

@kerams
Copy link
Contributor

kerams commented Jul 5, 2022

Awesome, the numbers are removed and so are the extra updates, thanks.

Just curious, is it possible this could cause regressions in the future?

@alfonsogarciacaro
Copy link
Member

Awesome, thanks for confirming @kerams! Well, you never know, but I assume this won't be a problem, because FCS was inconsistent with this numeric suffixes (activePatternResult has it but not other compiler generated variables like matchValue) and Fable has its own mechanism for name deduplication which guarantees that every variable within the same method has a unique name (including shadowed values) and it has been working well so far.

@MangelMaxime
Copy link
Member Author

Thank you all, I confirm that I can no longer reproduce the problem with my last reproduction code. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants