Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wasm] Add Timezone data generator as build task #39913

Merged
merged 10 commits into from
Aug 15, 2020
Merged

Conversation

tqiu8
Copy link
Contributor

@tqiu8 tqiu8 commented Jul 24, 2020

No description provided.

src/mono/wasm/wasm.targets Outdated Show resolved Hide resolved
src/mono/wasm/wasm.targets Outdated Show resolved Hide resolved
src/mono/wasm/wasm.targets Outdated Show resolved Hide resolved

private (byte[] json_bytes, MemoryStream stream) readTimeZone (string folder) {
// https://en.wikipedia.org/wiki/Tz_database#Area
var areas = new[] { "Africa", "America", "Antarctica", "Arctic", "Asia", "Atlantic", "Australia", "Europe", "Indian", "Pacific", "zone1970.tab"};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these areas be accepted as an input by the task so that in the future is controllable from outside of it via a MSBuild item?

Copy link
Member

@lewing lewing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start. Would it make sense to make two tasks? One that only made a bundle and that handled the timezone specific parts and to use as input to the bundle? I do think it would be useful expose the bundling logic for a general set of targets.

Comment on lines 32 to 35
string[] systemtz = { "America/Los_Angeles", "Australia/Sydney", "Europe/London", "Pacific/Tongatapu",
"America/Sao_Paulo", "Australia/Perth", "Africa/Nairobi", "Europe/Berlin",
"Europe/Moscow", "Africa/Tripoli", "America/Argentina/Catamarca", "Europe/Lisbon",
"America/St_Johns"};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... what's the reason for including only these zones from this file? If it's for testing data, should we instead just have a static bundle file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are just the timezones that will be cached for SystemTimeZones. The original file was 19kb so we filtered it down to those timezones.

private void DownloadTimeZoneData (string input_folder, string output_folder) {
using (var client = new WebClient())
{
client.DownloadFile("https://data.iana.org/time-zones/tzdata-latest.tar.gz", $"{input_folder}/tzdata.tar.gz");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file you downloaded includes a Makefile. I haven't investigated it, so this might be a dead end, but if it has an install target, does that make things easier? That is, does it just put all the zones into a final file, so you don't have to manually do a list of the files you want, or the areas to extract?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot depend on latest data automatically. It's be better to move the data download outside of this task and set the version explicitly there

@tqiu8
Copy link
Contributor Author

tqiu8 commented Jul 27, 2020

This is a good start. Would it make sense to make two tasks? One that only made a bundle and that handled the timezone specific parts and to use as input to the bundle? I do think it would be useful expose the bundling logic for a general set of targets.

Yes, I can separate this task into a TimeZone specific one that handles the downloading, compiling, and reordering of data; and another task that handles general input and bundles into a data file.

Copy link
Contributor

@marek-safar marek-safar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

Copy link
Member

@lewing lewing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! The bundler split looks good. For the versioning I think we can make things a little simpler still. The timezone data exists at https://data.iana.org/time-zones/tzdb<version>/ already unpacked so I think it makes sense to make the TimeZoneLoader task take a version property like 2020a then instead of downloading the zip you can download the individual files from https://data.iana.org/time-zones/tzdb2020a/<path> and build the bundle from that.

using Microsoft.Build.Framework;
using Microsoft.Build.Utilities;

public class TimeZoneLoader : Task
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename this to something like DownloadTimeZoneData. That is generally how msbuild tasks are named.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the other build tasks are named liked WasmAppBuilder or PInvokeTableGenerator

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mobile.tasks do seem to be named like that, but not others like in installer.tasks. IMHO, the task names should read like actions, that get executed in a target. Not a blocker though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a new task that we are creating, we can follow the msbuild conventions. The idea here is that a msbuild Target executes a bunch of Tasks, which are actions/code, so, you would name the Tasks like a method.
Let's change the names for the new tasks to match that.

Comment on lines 17 to 18
[Required]
public string? InputDirectory { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this parameter needed at all? IIUC, this just seems to be the directory where files are temporarily downloaded. Do they get used after this task is done? If not, then I would suggest removing this, and instead creating a temporary directory, and downloading to that. And removing it at the end of the task.

@tqiu8 tqiu8 requested a review from lewing August 12, 2020 17:19
Copy link
Member

@lewing lewing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks almost ready. The remaining issues are mostly around naming (I do think OutputFileName is clearer).

Copy link
Member

@radical radical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Good job, and thanks for being patient with the review feedback!

Copy link
Contributor

@marek-safar marek-safar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few more comments, otherwise LGTM

@@ -35,10 +35,28 @@
OutputPath="$(WasmPInvokeTablePath)" />
</Target>

<UsingTask TaskName="DownloadTimeZoneData" AssemblyFile="$([MSBuild]::NormalizePath('$(ArtifactsBinDir)', 'CreateWasmBundle', 'Debug', '$(NetCoreAppCurrent)', 'publish', 'CreateWasmBundle.dll'))"/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it Debug shouldn't this be $(Configuration)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm I changed this to $(Configuration) and it throws:

/home/taqiu/runtime/src/mono/wasm/wasm.targets(40,5): error MSB4062: The "DownloadTimeZoneData" task could not 
be loaded from the assembly 
/home/taqiu/runtime/artifacts/bin/CreateWasmBundle/Release/net5.0/publish/CreateWasmBundle.dll. Could not load file 
or assembly '/home/taqiu/runtime/artifacts/bin/CreateWasmBundle/Release/net5.0/publish/CreateWasmBundle.dll'. The 
system cannot find the file specified. [/home/taqiu/runtime/src/libraries/src.proj]                            
/home/taqiu/runtime/src/mono/wasm/wasm.targets(40,5): error MSB4062:  Confirm that the <UsingTask> declaration is 
correct, that the assembly and all its dependencies are available, and that the task contains a public class that implements 
Microsoft.Build.Framework.ITask. [/home/taqiu/runtime/src/libraries/src.proj]  

I was following the <UsingTask TaskName="PInvokeTableGenerator" AssemblyFile="$([MSBuild]::NormalizePath('$(ArtifactsBinDir)', 'WasmAppBuilder', 'Debug', '$(NetCoreAppCurrent)', 'publish', 'WasmAppBuilder.dll'))"/>

Version="2020a" />
</Target>

<UsingTask TaskName="CreateWasmBundle" AssemblyFile="$([MSBuild]::NormalizePath('$(ArtifactsBinDir)', 'CreateWasmBundle', 'Debug', '$(NetCoreAppCurrent)', 'publish', 'CreateWasmBundle.dll'))"/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

return false;
}

data = EnumerateData();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
data = EnumerateData();
(byte[] json_bytes, MemoryStream stream) data = EnumerateData();

{
foreach (var file in files)
{
client.DownloadFile($"https://data.iana.org/time-zones/tzdb-{Version}/{file}", $"{Path.Combine(InputDirectory!, file)}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be helpful to print what it's doing in case the download takes long time

File.Copy(Path.Combine(InputDirectory!,"zone1970.tab"), Path.Combine(OutputDirectory!,"zone1970.tab"));
}

private void FilterTimeZoneData()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? You process additional files and then they are deleted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is just to remove extraneous files so that CreateWasmBundle doesn't end up including them in the data archive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method removes unnecessary files so that the subsequent CreateWasmBundle does not include them in the data archive.

@tqiu8 tqiu8 merged commit d2601e3 into dotnet:master Aug 15, 2020
List<string> files = new List<string>() {"africa", "antarctica", "asia", "australasia", "etcetera", "europe", "northamerica", "southamerica", "zone1970.tab"};
using (var client = new WebClient())
{
Console.WriteLine("Downloading TimeZone data files");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tqiu8 Use Log.LogMessageFromText ("Downloading TimeZone data files", MessageImportance.Low); instead of CWL. This way, the message would end up in the logs correctly. Could you fix this as a follow up?

@karelz karelz added this to the 5.0.0 milestone Aug 18, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants