Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support postgres types transparently in queries #6817

Open
osawyerr opened this issue Jul 1, 2023 · 5 comments
Open

Support postgres types transparently in queries #6817

osawyerr opened this issue Jul 1, 2023 · 5 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@osawyerr
Copy link

osawyerr commented Jul 1, 2023

Is your feature request related to a problem or challenge?

DataFusion doesn't support native postgres types. I just realised that postgres queries that use postgres types, like the ones below, fail when executed against DataFusion. For example, the query below fails because the int8 isn’t recognised.

select sum(col1)::int8 from foo_table;

Describe the solution you'd like

Would be useful if type alisases were automatically supported without the user having to re-write postgres queries - e.g. int8 -> bigint and int4 -> int, etc.

Describe alternatives you've considered

Other databases support type aliases, I don't know how easy it is to implement that in DataFusion - https://duckdb.org/docs/sql/data_types/overview

Additional context

No response

@osawyerr osawyerr added the enhancement New feature or request label Jul 1, 2023
@osawyerr osawyerr changed the title Support for postgres types in queries Support for postgres types transparently in queries Jul 1, 2023
@osawyerr osawyerr changed the title Support for postgres types transparently in queries Support postgres types transparently in queries Jul 1, 2023
@alamb
Copy link
Contributor

alamb commented Jul 16, 2023

I think this would be very straightforward as long as the postgres type had a clear mapping to an Arrow type. For example int8 maps seemingly nicely to DataType::Int8

The mapping of SQL type to ArrowType is done here:

https://github.com/apache/arrow-datafusion/blob/49583bd5010282ca126e75100dce958aa346e5ee/datafusion/sql/src/planner.rs#L308-L400

cDataFusion CLI v27.0.0
❯ create table foo(x int8);
This feature is not implemented: Unsupported SQL type Custom(ObjectName([Ident { value: "int8", quote_style: None }]), [])

So we would just have to encode a listing of custom type names to arrow

@alamb alamb added the good first issue Good for newcomers label Jul 16, 2023
@alamb
Copy link
Contributor

alamb commented Jul 16, 2023

I think this is a good first issue as it is a substantial potential positive impact and relatively low complexity (add type aliases, update docs, write some tests) and would be a good way for someone to get started with DataFusion

@Kikkon
Copy link
Contributor

Kikkon commented Jul 17, 2023

If the issue is not urgent I'm happy to take it, I think it's a good opportunity to learn about DataFusion 🤣

@comphead
Copy link
Contributor

I think this would be very straightforward as long as the postgres type had a clear mapping to an Arrow type. For example int8 maps seemingly nicely to DataType::Int8

@alamb I'm not sure this is correct, as Int8 should map to BigInt, but in the example above it is SmallInt
https://www.postgresql.org/docs/current/datatype-numeric.html

@alamb
Copy link
Contributor

alamb commented Jul 17, 2023

Thank you @comphead -- you are right, INt8 seeme to mean 8 bytes in postgres (not 8 bits) so the correct mapping is probably more like DataType::Int64

Thank you for https://www.postgresql.org/docs/current/datatype-numeric.html which is the definitive reference

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants