Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve importRecursive performance with a regexp #755

Closed
wants to merge 1 commit into from

Conversation

julienduchesne
Copy link
Member

@julienduchesne julienduchesne commented Sep 13, 2022

Instead of parsing the whole AST trying to find imports, find them with a regexp. This is significantly faster for very nested projects
I added a benchmark in the tests. Here's the result:

$ benchstat bench1.txt bench2.txt
name                 old time/op  new time/op  delta
TransitiveImports-8  0.00ns ±47%  0.00ns ±30%  -25.85%  (p=0.000 n=92+90)

In real-life cases, I have gotten 10x faster run time (2.5s to 200ms) on some of our heavier projects (which are the ideal cases for caching, where this function is used)

Instead of parsing the whole structure trying to find imports, find them with a regexp. This is significantly faster for very nested projects
I added a benchmark in the tests. Here's the result:
```
$ benchstat bench1.txt bench2.txt
name                 old time/op  new time/op  delta
TransitiveImports-8  0.00ns ±47%  0.00ns ±30%  -25.85%  (p=0.000 n=92+90)
```

In real-life cases, I have gotten 10x faster run time (2.5s to 200ms) on some of our heavier projects (which are the ideal cases for caching)
@julienduchesne julienduchesne requested review from Duologic, sh0rez and a team September 13, 2022 14:10
@julienduchesne julienduchesne marked this pull request as ready for review September 13, 2022 14:11
@sh0rez
Copy link
Member

sh0rez commented Sep 13, 2022

without having looked into too much detail here, wouldn't this be vulnerable to false positives?

bad: 'import "foobarbaz"'

@julienduchesne
Copy link
Member Author

julienduchesne commented Sep 13, 2022

without having looked into too much detail here, wouldn't this be vulnerable to false positives?

bad: 'import "foobarbaz"'

definitely, but there two uses for this function:

  • TransitiveImports: This function will fail if any of the imports are not found. You could have a commented out import that does resolve though, this could indeed be annoying
  • getSnippetHash: Used for environment caching. For this one, the only important thing is to get a consistent hash that changes consistently when any of the imports changes. In that case, false positives aren't a big deal but speed is

I'm wondering if I should keep both strategies. Using the AST for TransitiveImports and a regexp for calculating the hash of an env. What do you think?

@julienduchesne julienduchesne marked this pull request as draft September 13, 2022 23:34
@julienduchesne
Copy link
Member Author

Converting to a draft. Will work on a different approach

@julienduchesne
Copy link
Member Author

Implemented in #759

@julienduchesne julienduchesne deleted the julienduchesne/improve-import-recursive branch September 19, 2022 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants