Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a feature based object store provider #258

Merged
merged 1 commit into from
Sep 27, 2022

Conversation

yahoNanJing
Copy link
Contributor

@yahoNanJing yahoNanJing commented Sep 21, 2022

Which issue does this PR close?

Closes #257, #6.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@yahoNanJing
Copy link
Contributor Author

Hi @andygrove, @tustvold, @matthewmturner, could you help review this PR?

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really have enough context on ballista to meaningfully review this, although its use of ObjectStoreProvider looks plausible.

apache/datafusion#3540 might also be related, a common way to register object stores based on environment variables seems sensible to me


/// An object store detector based on which features are enable for different kinds of object stores
pub struct FeatureBasedObjectStoreProvider;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just a placeholder, it doesn't appear to be doing anything?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a follow-on PR #260 to implement this for HDFS but it is waiting on another PR to be merged

Copy link
Contributor Author

@yahoNanJing yahoNanJing Sep 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, based on this PR, we can easily add feature-based object stores like #260 does. And we have one verified example without manual object store registration in https://github.com/datafusion-contrib/datafusion-objectstore-hdfs/blob/master/hdfs-examples/src/bin/ballista-sql.rs

@andygrove
Copy link
Member

cc @avantgardnerio who has been working on supporting custom object stores in DataFusion/Ballista

@andygrove
Copy link
Member

The approach here LGTM but I have not worked with the object store yet

@yahoNanJing
Copy link
Contributor Author

Let's merge it first. Will update the ObjectStoreProvider interface after upgrading datafusion to the latest version.

@yahoNanJing yahoNanJing merged commit bb2ea4f into apache:master Sep 27, 2022
@avantgardnerio
Copy link
Contributor

avantgardnerio commented Sep 27, 2022

It's unclear from skimming this: what is the intended usage? Should a statement like:

create external table customer stored as CSV DELIMITER '|' 
    location 's3://minioadmin:minioadmin@127.0.0.1:9000/tpch';

Work now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a feature based object store provider
5 participants