[HUDI-7830] Add predicate filter pruning for snapshot queries in hudi related sources #11396

Merged (1 commit) on Sep 19, 2024

@vinishjail97 vinishjail97 commented Jun 5, 2024

Change Logs

Add a new method, getNextCheckpointWithPredicates, to the abstract class SnapshotLoadQuerySplitter. It returns the next checkpoint together with file-pruning predicates (partition filters etc.) used to optimise the snapshot query. These predicates help identify the relevant Parquet files during the buildScan stage of the Spark data source.
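
As a rough illustration of the shape of this change, here is a minimal, dependency-free sketch. Everything in it is a simplified assumption rather than the code in this PR: the `SplitterSketch` wrapper, the `CheckpointWithPredicates` holder and its fields, the plain `String` parameters, and the default delegation to `getNextCheckpoint` are all hypothetical stand-ins chosen so the example compiles without Hudi or Spark.

```java
import java.util.Optional;

// Hypothetical, simplified stand-in for illustration only; not the Hudi source.
class SplitterSketch {

    /** Assumed holder pairing a checkpoint with a pruning predicate. */
    static final class CheckpointWithPredicates {
        final String checkpoint;
        final String predicateFilter; // e.g. a SQL-style partition filter; may be empty

        CheckpointWithPredicates(String checkpoint, String predicateFilter) {
            this.checkpoint = checkpoint;
            this.predicateFilter = predicateFilter;
        }
    }

    abstract static class SnapshotLoadQuerySplitter {

        /** Pre-existing style of method: returns only the next checkpoint. */
        abstract Optional<String> getNextCheckpoint(String beginCheckpoint);

        /**
         * New-style method: returns the checkpoint plus a predicate.
         * Assumption: the default attaches an empty predicate, so subclasses
         * that only implement getNextCheckpoint keep their old behaviour.
         */
        Optional<CheckpointWithPredicates> getNextCheckpointWithPredicates(String beginCheckpoint) {
            return getNextCheckpoint(beginCheckpoint)
                    .map(c -> new CheckpointWithPredicates(c, ""));
        }
    }
}
```

A subclass that knows which partitions changed could override `getNextCheckpointWithPredicates` to return a non-empty filter, while existing subclasses would fall through to the default.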

Impact

No impact on the existing API; this adds new functionality for retrieving the next checkpoint with predicate filters. Null or empty predicates are handled to ensure backward compatibility.
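
The null/empty handling can be pictured with a small toy example. This is only a sketch under stated assumptions: the `PredicatePruningSketch` class, the `filesToScan` helper, and the use of a partition-path prefix in place of a real Spark filter expression are all hypothetical and are not taken from the PR.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical toy model: a "predicate" is just a partition-path prefix.
class PredicatePruningSketch {

    /**
     * Returns the files a snapshot scan would read.
     * A null or empty prefix means "no pruning", so callers that supply
     * no predicate see every file, as they did before.
     */
    static List<String> filesToScan(List<String> files, String partitionPrefix) {
        if (partitionPrefix == null || partitionPrefix.isEmpty()) {
            return files; // backward-compatible path: nothing is filtered out
        }
        return files.stream()
                .filter(f -> f.startsWith(partitionPrefix))
                .collect(Collectors.toList());
    }
}
```

With files under `dt=2024-06-04/` and `dt=2024-06-05/`, passing the prefix `dt=2024-06-05/` keeps only the second partition's files, while passing `null` or `""` keeps both.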

Risk level (write none, low, medium or high below)

Medium

Documentation Update

None.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@github-actions github-actions bot added the size:M PR with lines of changes in (100, 300] label Jun 5, 2024
@yihua yihua merged commit 7a242fe into apache:master Sep 19, 2024
43 checks passed