Add new table format presets. Add support for custom format #23

Merged
merged 4 commits into from
Dec 16, 2021

Conversation

Ciebiada
Member

@Ciebiada Ciebiada commented Dec 14, 2021

Checklist

  • Bumped version tag in lib/egis/version.rb
  • Added changes summary to CHANGELOG

@Ciebiada Ciebiada self-assigned this Dec 14, 2021
@@ -3,6 +3,11 @@

## 1.0
Collaborator

That could use an update as well :)

@@ -3,6 +3,11 @@

## 1.0

### 2.0.0

- **[breaking change]** Make `:orc` table format to have `orc.column.index.access` disabled by default. `:orc_legacy` preset replaces the previous one
Collaborator

Suggested change
- **[breaking change]** Make `:orc` table format to have `orc.column.index.access` disabled by default. `:orc_legacy` preset replaces the previous one
- **[breaking change]** Make `:orc` table format use column names instead of indexes by default. `:orc_legacy` preset preserves existing behavior.
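For readers following the thread, here is a minimal sketch of the difference between the two presets, assuming it comes down to the `orc.column.index.access` SerDe property named in the original changelog line. The `orc_format_clause` helper and the `OrcSerde` row format line are illustrative assumptions, not code from this PR; the input and output format class names are the ones shown in the diff.

```ruby
# Hypothetical helper, not the Egis implementation: builds an ORC format
# clause with the `orc.column.index.access` SerDe property set explicitly.
def orc_format_clause(index_access:)
  <<~SQL.chomp
    ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
    WITH SERDEPROPERTIES ('orc.column.index.access' = '#{index_access}')
    STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
  SQL
end

# Assumed mapping: `:orc` resolves columns by name, `:orc_legacy` by index.
by_name  = orc_format_clause(index_access: false)
by_index = orc_format_clause(index_access: true)
```

With index access disabled, columns are matched by name, so reordering columns in the table definition should not silently change which data each column reads.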

Contributor

@mkrawc mkrawc left a comment

@Ciebiada I like the direction 👍

STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
SQL
when :orc_legacy
Contributor

`legacy` has a negative connotation. How about naming it explicitly, e.g. `orc_with_index_access`?

How about we also support the JSON format? We use it in the Socialguide A+ query.

when :json
  "ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'"


def serde_properties(format)
  return '' unless format.key?(:serde_properties)
  return "ROW FORMAT #{format}" if format.is_a?(String)
Contributor

As discussed, the String option should also allow defining the STORED AS clause. I'd remove the "ROW FORMAT" prefix from here and let the user include it or not, for example when they just want to use "STORED AS PARQUET".
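A minimal sketch of the behavior suggested here, assuming a Symbol selects a preset and a String is passed through verbatim. The `format_clause` method and the `FORMAT_PRESETS` table are hypothetical names for illustration and are not taken from the merged code.

```ruby
# Hypothetical presets for illustration; the real list lives in the PR diff.
FORMAT_PRESETS = {
  parquet: 'STORED AS PARQUET',
  json: "ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'"
}.freeze

# Returns the format clause for a CREATE TABLE statement.
# A Symbol selects a preset; a String is returned unchanged, so the caller
# decides whether it contains ROW FORMAT, STORED AS, or both.
def format_clause(format)
  case format
  when String then format
  when Symbol
    FORMAT_PRESETS.fetch(format) do
      raise ArgumentError, "Unknown table format: #{format.inspect}"
    end
  else
    raise ArgumentError, "Unsupported format type: #{format.class}"
  end
end
```

Because nothing is prepended to the String, both `format_clause('STORED AS PARQUET')` and a full `ROW FORMAT SERDE ... STORED AS ...` clause work without special cases.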

@Ciebiada Ciebiada requested a review from mkrawc December 16, 2021 11:03
@Ciebiada Ciebiada changed the title Add support for serde input and output format Add new table format presets. Add support for custom format Dec 16, 2021
@Ciebiada Ciebiada merged commit 2ebce6c into master Dec 16, 2021
@Ciebiada Ciebiada deleted the add-support-for-input-output-format branch December 16, 2021 12:53
@Ciebiada
Member Author

@mkrawc I've merged this. However, if you have any further comments, please let me know and I will get back to it.

3 participants