feat(datasets): Added the Experimental ExternalTableDataset for Databricks #885
Signed-off-by: Minura Punchihewa <minurapunchihewa17@gmail.com>
Hey @noklam, @ankatiyar,
@MinuraPunchihewa Tests are very welcome, although we don't have a Databricks cluster to run them on, unless there is a way to run them locally.
@noklam I just added some tests to cover the logic specific to
Is this an expected fail?
@noklam It has been my experience that when overwriting data in an external table of a format other than Delta, the location has to be provided. Since I am using an external location fixture here, which requires an environment variable, I should have specified a different format, though. I just made a commit with that change. Can you explain how these tests are run? There should be a Spark environment available for them to run at all, right?
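The constraint described in this comment can be illustrated with a small check. This is only a sketch: the function name check_overwrite_config, its parameters, and the example path are hypothetical and are not taken from the PR. It merely encodes the reported experience that overwriting a non-Delta external table needs an explicit location.

```python
from typing import Optional


def check_overwrite_config(
    file_format: str, write_mode: str, location: Optional[str]
) -> None:
    """Raise if an overwrite of a non-Delta external table lacks a location.

    Hypothetical helper for illustration only; not part of the PR.
    """
    is_delta = file_format.lower() == "delta"
    if write_mode == "overwrite" and not is_delta and not location:
        raise ValueError(
            f"Overwriting an external table of format '{file_format}' "
            "requires an explicit location."
        )


# Delta overwrite without a location passes the check.
check_overwrite_config("delta", "overwrite", None)
# A non-Delta overwrite passes once a (hypothetical) location is supplied.
check_overwrite_config("parquet", "overwrite", "s3://example-bucket/tables/demo")
```

A check like this would fail fast at configuration time instead of surfacing as a Spark error during the write.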
Since this is an experimental dataset, the tests will go here: https://github.com/kedro-org/kedro-plugins/tree/main/kedro-datasets/kedro_datasets_experimental/tests This means the tests are not going to be run on every commit in CI. If you look at the other Spark tests, we run them in local mode. With Databricks, it's unclear whether it is possible to run them in local mode, especially with the platform's Unity Catalog (which is different from the open-source version).
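For reference, running Spark in local mode needs no cluster. Below is a minimal sketch, assuming pyspark is installed. The function names, the app name, and the environment variable DATABRICKS_TESTS_ENABLED are hypothetical and do not come from the repository, and the sketch says nothing about whether Unity Catalog features work locally.

```python
import os
from typing import Mapping


def databricks_tests_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the (hypothetical) opt-in variable is set to "1"."""
    return env.get("DATABRICKS_TESTS_ENABLED") == "1"


def make_local_spark():
    """Create a SparkSession that runs in-process, without a cluster."""
    # Imported lazily so this module can be loaded without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder.master("local[1]")  # one local worker thread
        .appName("local-spark-tests")  # hypothetical app name
        .getOrCreate()
    )
```

Tests that need a real Databricks workspace could consult a flag like databricks_tests_enabled() and skip themselves when it is off, while the generic Spark logic is exercised against the local session.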
Got it. I just moved the tests; I had to copy the
Description
This PR adds the ExternalTableDataset to support interactions with external tables in Databricks (Unity Catalog).
Fixes #817
Development notes
The ExternalTableDataset has been implemented by extending the BaseTableDataset that was added here.
These changes have been tested,
Checklist
RELEASE.md file