Int64.parseInt doesn't check for overflow, nor do its docs explain what happens #39

trinarytree · 2018-05-18T01:25:33Z

Int64.parseInt('12343498230482340234820349823049823')

yields

-6445017949831143329

which isn't what I'd expect. I would expect this to throw.

Compare to Java:

Integer.parseInt('12343498230482340234820349823049823');

throws

java.lang.NumberFormatException: For input string: "12432904823048234092384023482309"

The class level docs for Int64 vaguely say,
"Arithmetic operations may overflow in order to maintain this range."
but I don't see how that should include parsing.
Overflow in ordinary arithmetic can be used for beneficial purposes, such as implementing an Int128 out of Int64, but what possible benefit is there to allowing overflow in parsing?

Some options:
(a) Lazy: Update the docs to at least say what happens since (imo) this isn't expected.
(b) More ambitious: Add an optional named param to parseInt like throwOnOverflow (default false).
It shouldn't be too expensive to detect; after all, Java does it. Maybe just test the carry at each step.
(c) Very ambitious: Forget about the optional param, just throw on overflow since it's hard to imagine a legit use case for allowing overflow. This is ambitious in the sense that it changes existing semantics.

The text was updated successfully, but these errors were encountered:

trinarytree · 2018-05-18T18:42:56Z

I believe this snippet added to _parseRadix would cheaply detect overflow:

if ((d2 & ~_MASK2 != 0) || (!negative && (d2 & _SIGN_BIT_MASK != 0))) {
  throw new FormatException('Overflow: $s');
}

Without such a thing, clients cannot as cheaply detect overflow. I only see 2 basic ways to do so:
(1) Canonicalize the input string to strip leading 0s and convert -0 to 0 and then
compare that canonicalized string to the toString'ed parsed Int64. e.g.

var i = Int64.parseInt(s);
if (i.toString() != canonicalizeIntString(s)) {
  throw FormatException('Overflow: $s');
}

This is a bit brittle because the canonicalizer could easily miss a few cases.
E.g. does +12 count as 12? What if Int64 adds new formats that it accepts, like accepting commas
or octal? This is also not very efficient since i.toString() involves many divisions et al complex operations.

(2) Reimplement most of _parseRadix, but add that snippet of code I proposed above.
This can't actually produce an Int64 because that would require access to private members, so you'll have to parse the string twice:

throwOnInt64ParsingOverflow(s);
return Int64.parseInt(s);

One could make this a little more efficient by converting the 3 vars d0, d1, d2 into bytes (or 32-bit ints) and calling fromBytes (or fromInts). But this is tantamount to forking the implementation. Clients who did this wouldn't benefit from updates to Int64.parseInt.

Would anyone be opposed to my adding the proposed snippet either using option (b) or (c)?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Int64.parseInt doesn't check for overflow, nor do its docs explain what happens #39

Int64.parseInt doesn't check for overflow, nor do its docs explain what happens #39

trinarytree commented May 18, 2018

trinarytree commented May 18, 2018

Int64.parseInt doesn't check for overflow, nor do its docs explain what happens #39

Int64.parseInt doesn't check for overflow, nor do its docs explain what happens #39

Comments

trinarytree commented May 18, 2018

trinarytree commented May 18, 2018