Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mypyc] Precompute set literals for "in" ops against / iteration over set literals #14409

Merged
merged 17 commits into from
Jan 10, 2023

Conversation

ichard26
Copy link
Collaborator

@ichard26 ichard26 commented Jan 7, 2023

Towards mypyc/mypyc#726. (There's a Python compatibility bug that needs to be fixed before the issue can be closed.)

For example, the set literals here are now precomputed as frozensets at module initialization.

x in {1, 2.0, "3"}
x not in {1, 2.0, "3"}

for _ in {1, 2.0, "3"}:
    ...

Set literal items supported:

  • Anything supported by irbuild.constant_fold.constant_fold_expr()
    • String and integer literals
    • Final references to int/str values
    • Certain int and str unary/binary ops that evaluate to a constant value
  • None, True, and False
  • Float, byte, and complex literals
  • Tuple literals with only items listed above

Results (using gcc-9 on 64-bit Ubuntu)

Master @ 98cc165

running in_set
..........
interpreted: 0.495790s (avg of 5 iterations; stdev 6.8%)
compiled:    0.810029s (avg of 5 iterations; stdev 1.5%)

compiled is 0.612x faster

running set_literal_iteration
.........................................................................................
interpreted: 0.020255s (avg of 45 iterations; stdev 2.5%)
compiled:    0.016336s (avg of 45 iterations; stdev 1.8%)

compiled is 1.240x faster

This PR

running in_set
..........
interpreted: 0.502020s (avg of 5 iterations; stdev 1.1%)
compiled:    0.390281s (avg of 5 iterations; stdev 6.2%)

compiled is 1.286x faster

running set_literal_iteration
..............................................................................................
interpreted: 0.019917s (avg of 47 iterations; stdev 2.2%)
compiled:    0.007134s (avg of 47 iterations; stdev 2.6%)

compiled is 2.792x faster

Benchmarks can be found here: mypyc/mypyc-benchmarks#32

ichard26 and others added 11 commits August 23, 2022 18:31
constant_fold_expr() replaces my custom logic for string/integer
literals and final references. It even supports folding constant
expressions which means even {1 + 2} could be precomputed!

This means final references to bool, float, bytes*, and complex values
are no longer supported, but that's okay for now. In a later patch, we
can add proper support for (some of) these types to
constant_fold_expr().

*Also bytes are just in general really annoying to work with (because
the true bytes value isn't stored anywhere, you have to convert the
string representation to bytes instead... ugh) so I don't really care
about final refs to bytes values/literals right now. Byte literals IN
the sets are still supported.
@ichard26 ichard26 changed the title Precompute set literals for "in" ops against / iteration over set literals [mypyc] Precompute set literals for "in" ops against / iteration over set literals Jan 7, 2023
@ichard26
Copy link
Collaborator Author

ichard26 commented Jan 7, 2023

FYI, I wrote the PR description to be used as the commit message when this is squash merged into main. Please copy and paste it and save yourself some work! :)

@ichard26 ichard26 marked this pull request as draft January 7, 2023 20:35
@ichard26
Copy link
Collaborator Author

ichard26 commented Jan 7, 2023

I'll push a functional work around to the mypyc bug that's breaking the build sometime later, but for those wondering "do people really loop over set literals often?" here's a GitHub search query for you (there's even an example in Black's source code 🙂).

I got tired of trying to compile mypy locally (15 minutes and it's still
not done), let's see what CI has to say.
@ichard26 ichard26 marked this pull request as ready for review January 7, 2023 23:49
Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, just a few minor comments. I like that this fixes some important cases where mypyc can generate code slower than CPython.

mypyc/lib-rt/misc_ops.c Outdated Show resolved Hide resolved
for i in {None, False, 1, 2.0, "3", b"4", 5j, (6,), CONST}:
s.add(i)

[file driver.py]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: Instead of having the assertions in driver.py, could you add them into the main [case ...] section inside test_* functions?

I'd like to move towards writing tests this way, since it makes larger test cases easier to read, and there's less risk that contributors forget that code in driver.py doesn't get compiled.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, PTAL 9cdfc98.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants