Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support Postgres syntax names for all types #14493

Closed
jordanlewis opened this issue Mar 30, 2017 · 10 comments
Closed

sql: support Postgres syntax names for all types #14493

jordanlewis opened this issue Mar 30, 2017 · 10 comments
Assignees
Milestone

Comments

@jordanlewis
Copy link
Member

We should consider supporting the Postgres syntax for all of the types whose Postgres OID we support via pgwire and claim to support via pg_catalog.pg_type.

For instance, our integer type is named INT, but we support the int2, int4, and int8 types via their OID in pgwire and in pg_type. We should consider supporting using the type names INT2, INT4, and INT8 as syntax in SQL statements. They would just be aliases for our INT type, the same way their OIDs are, since we don't support storing integers smaller than int8.

@jordanlewis jordanlewis added this to the Later milestone Mar 30, 2017
@bdarnell
Copy link
Contributor

int4 is already tracked in #12481

@jordanlewis
Copy link
Member Author

There are a few questions here. I have opinions on some.

First, do we care to implement right-sized backing encodings for smaller datatypes, or is backing these datatypes by full-sized ones sufficient? I think that for now the latter is okay.

Second, if we choose to not have different backing encodings for these types, do we care to preserve the user's intention when creating a table, or is it okay to just make them aliases at the parser level? In other words, if a user creates a table with an int4 column, do we need to remember that fact or is it okay to let the column appear for all intents and purposes as an ordinary int column? I think for the purposes of compatibility we do in fact need to preserve the intended types. See #16491 for an example of that - if Hibernate creates a table with int4 and we return an int8 for example, it may end up using the wrong Java type to represent values in that column -which will break user assumptions and programs.

Third, if we are going to preserve the user-intended types (as I think we have to), do we need to have separate type tags (ColumnType.Kind) for these types, or should we introduce some kind of aliasing mechanism at the ColumnType proto level? I don't know the answer to this - either would work okay. The downside to having separate type tags is that it's ugly (in my opinion) to have a lot of repeated, similar values there. The upside is that it might make the code cleaner and provide an easier migration path if we do decide to eventually store these values right-sized.

Fourth, are we going to truncate inputs to smaller datatypes? I think we probably should.

@justinj
Copy link
Contributor

justinj commented Jun 14, 2017

I agree on 1-3, if I understand what you mean for 4 it's also an option to just error out if a value is out of range. It's what Postgres does:

# create table b (a int4);
# insert into b values (100000000000);
ERROR:  integer out of range

I suspect if someone cares to explicitly use int4 (whether "someone" is a user or an ORM) they probably want to be alerted if the values they're inserting don't match that specification.

@jordanlewis
Copy link
Member Author

Oh, yeah - that makes a lot more sense. Forget truncation - let's just error! 👍

@m-schneider
Copy link
Contributor

I agree on 1-3 too and error is a great idea!

If we choose to go this way vs making them aliases, what would be the logic for mathematical operations on them? If int4*int4 can't fit into an int4, do we error or return the value?

@justinj
Copy link
Contributor

justinj commented Jun 14, 2017

Good question, FWIW, Postgres errors:

# select 1000000000::int4 * 10::int4;
ERROR:  integer out of range

But I think either sounds fine to me? Since we'd still error if we try to actually tried to insert it into a column.

@jordanlewis
Copy link
Member Author

Good question. There's another decision we have to make about this project that I didn't mention about: are we going to represent all of these types differently at the Datum level as well?

Currently, we only have one kind of integer datum: DInt. I was assuming that we wouldn't add any new datum types for this project - this would mean that the in-memory representation of integers (or floats, same goes there) of any kind would not change. Validations would be performed at the edges of the system - upon user input, at the point of marshalling into bytes before we store, and upon output to the user as well.

I think that adding new datum types for all of these will be a pretty big pain - I advise against it. The reason that it'll be painful has to do with the way we've implemented comparison and binary operators - check out eval.go for that code. We'd either need to add implementations for all of the different cross-type comparisons and operators we'd need, which is unideal, or totally change how that implementation works.

Instead, I think that ignoring potential smaller-type overflows during math operations is the way to go. It will produce different behavior from Postgres in certain cases, but I think it's an acceptable tradeoff. For example, the following errors in Postgres, but wouldn't error in our implementation:

test=# select (9000::int2 * 9000::int2 / 9000::int2)::int2;
ERROR:  22003: smallint out of range
LOCATION:  int2mul, int.c:845

@m-schneider m-schneider self-assigned this Jun 14, 2017
@m-schneider
Copy link
Contributor

I agree that adding new datums or wrapping DInt into something like OidWrapper might not be worth it for math operations. I was however planning on enforcing the types on Insert and Update because we do have the column descriptors at that stage.

@knz what are your thoughts?

@knz
Copy link
Contributor

knz commented Jun 23, 2017

I think I am seeing wisdom in TiDB's approach, where all the datums are implemented via the same struct. In that struct there is a SQL type field and then next to this multiple fields of each Go type, with only one that carry the value per SQL type.
The advantages are numerous:

  • just one struct, it makes it easy(ier) to add new types later
  • we can support precision/width for any type where it makes sense without too much code duplication
  • much better allocation (and reuse) story

How would this make you feel? (I suggest you have a brief look at the TiDB source code to see what I mean)

@knz
Copy link
Contributor

knz commented Jun 23, 2017

(incidentally, this would be walking a path towards a common endpoint with the IR work David was undertaking)

m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 26, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 27, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 27, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 28, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 28, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 28, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 28, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 29, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 29, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 30, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
m-schneider pushed a commit to m-schneider/cockroach that referenced this issue Jun 30, 2017
with postgres support of numerical types.

Before this change we didn't support any of the types mentioned above.

After this change we'll support those types as alias and we will check
for overflow on INSERT and UPDATE. We currently won't support type checking
on mathematical operations, inline with how decimals are currently implemented.
Support for those operations would require a change to Datums, which is outside
of the scope for this change.

Closes cockroachdb#12481
Closes cockroachdb#14493
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants