Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Starlark: optimize StarlarkInt.Big comparison to StarlarkInt.Int{32,64} #12638

Closed
wants to merge 1 commit into from

Conversation

stepancheg
Copy link
Contributor

Perform comparison without conversion of smaller integers to
BigInteger. StarlarkInt.compareTo does not allocate now.

For this benchmark:

def test():
    x = 17 << 77
    for i in range(10):
        print(i)
        for j in range(10000000):
            x > 1

test()
A: n=27 mean=4.262 std=0.203 se=0.039 min=4.036 med=4.193
B: n=27 mean=4.113 std=0.172 se=0.033 min=3.859 med=4.049
B/A: 0.965 0.941..0.990 (95% conf)

Speed up is about 7% when comparing to an integer outside of
BigInteger cached range (-16..16).

Finally, StarlarkInt.Big to StarlarkInt.Big comparison performance
seems to stay the same (within 95% confidence interval after 100
test iterations).

@@ -446,7 +464,13 @@ public static int compare(StarlarkInt x, StarlarkInt y) {
/* fall through */
}

return x.toBigInteger().compareTo(y.toBigInteger());
int xo = x.orderOfMagnitude();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete LL457-459? (GitHub won't let me comment on that line.)

The invariant at this point (L467) is that at least one of the operands is Big. Can we do the comparison efficiently without adding the new virtual method, something like this:

public static int compare(StarlarkInt x, StarlarkInt y) {
  long xl, yl;
  boolean xbig, ybig;
  try {
    xl = x.toLongFast();
  } catch (Overflow) {
    xbig = true;
  }
  try {
    yl = y.toLongFast();
  } catch (Overflow) {
    ybig = true;
  }

  // If only one operand is big, its magnitude is greater than the other operand,
  // which can be ignored (treated as zero).
  if (xbig) {
    return ybig ?  ((Big) x).compareTo(y) : x.signum();
  } else {
    return ybig ?  -y.signum(), Long.compare(x, y);
  }
}

@stepancheg
Copy link
Contributor Author

Updated PR without orderOfMagnitude virtual method.

Removed special case x instanceof Int32 && y instanceof Int32, possible slowdown is very little (within half percent) for test 1 < 2.

// If neither argument is big integer, we compare longs.
// If only one argument is big integer, it is bigger than other if positive
// and smaller otherwise.
if (xbig && ybig) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this version requires 4 branches to reach the common case (Long.compare), whereas the version I proposed used a control tree of depth two. Probably not a big deal, especially given the mystery and unpredictability of measurements using the JVM, but something to bear in mind when optimizing (especially in a language like C++ or Go or assembly).

Perform comparison without conversion of smaller integers to
`BigInteger`.  `StarlarkInt.compareTo` does not allocate now.

For this benchmark:

```
def test():
    x = 17 << 77
    for i in range(10):
        print(i)
        for j in range(10000000):
            x > 1

test()
```

```
A: n=27 mean=4.262 std=0.203 se=0.039 min=4.036 med=4.193
B: n=27 mean=4.113 std=0.172 se=0.033 min=3.859 med=4.049
B/A: 0.965 0.941..0.990 (95% conf)
```

Speed up is about 7% when comparing to an integer outside of
`BigInteger` cached range (-16..16).

Finally, `StarlarkInt.Big` to `StarlarkInt.Big` comparison performance
seems to stay the same (within 95% confidence interval after 100
test iterations).
@stepancheg
Copy link
Contributor Author

stepancheg commented Dec 6, 2020

Go full branchless on the typical case of long-long comparison.

1 < 2 test seems to perform slightly better than previous version (and even master), but so little, that it is impossible to say for sure without running test for several hours in different host configurations.

Let's say, performance is the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants