[SPARK-33529][SQL] Handle '__HIVE_DEFAULT_PARTITION__' while resolving V2 partition specs #30482
object PartitioningUtils {
private[sql] object PartitioningUtils {
Addressed @cloud-fan's comment #30454 (comment)
.asTableCatalog
.loadTable(Identifier.of(Array("ns1", "ns2"), "tbl"))
.asPartitionable
val expectedPartition = InternalRow.fromSeq(Seq[Any](null))
'__HIVE_DEFAULT_PARTITION__' should be handled as null
Test build #131641 has finished for PR 30482 at commit
.asPartitionable
val expectedPartition = InternalRow.fromSeq(Seq[Any](null))
assert(!partTable.partitionExists(expectedPartition))
val partSpec = "PARTITION (part0 = '__HIVE_DEFAULT_PARTITION__')"
I'm not sure about it. It's more of a Hive-specific thing, and we should let the v2 implementation decide how to handle null partition values. This should be an internal detail and shouldn't be exposed to end users.
OK. How can users specify a null partition value?
Does part_col = null work?
For example, if we have a string partition column, how could we distinguish null from "null"?
The parser should recognize different literals, e.g. part_col = null and part_col = "null".
does part_col = null work?
I have checked that. null is recognized as a string "null".
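For illustration only, the distinction requested in this exchange (an unquoted null versus the quoted string "null") could be modelled as below. This is a minimal, Spark-free sketch and not Spark's parser: `PartitionLiteralSketch`, `PartValue`, `NullValue`, `StringValue`, and `parsePartValue` are hypothetical names introduced for the example.

```scala
// Hypothetical sketch: none of these names come from Spark.
object PartitionLiteralSketch {
  // A partition value is either an explicit NULL or a string.
  sealed trait PartValue
  case object NullValue extends PartValue
  final case class StringValue(value: String) extends PartValue

  // Classify the raw token that follows "part_col =" in a partition spec.
  def parsePartValue(token: String): PartValue = {
    val t = token.trim
    val quoted = t.length >= 2 &&
      ((t.head == '"' && t.last == '"') || (t.head == '\'' && t.last == '\''))
    if (quoted) StringValue(t.substring(1, t.length - 1)) // quoted text is always a string
    else if (t.equalsIgnoreCase("null")) NullValue         // unquoted null keyword
    else StringValue(t)                                    // any other unquoted token
  }
}
```

With this model, `parsePartValue("null")` yields `NullValue`, while `parsePartValue("\"null\"")` yields `StringValue("null")`, so the two spellings stay distinguishable.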
It's more of a Hive-specific thing, and we should let the v2 implementation decide ...
It is already a Spark-specific thing too. Implementations don't see '__HIVE_DEFAULT_PARTITION__' at all because it is replaced by null in the analysis phase.
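The replacement described in this comment can be sketched without Spark as follows. This is a minimal illustration, not the code from this PR: `DefaultPartitionSketch`, `DefaultPartitionName`, and `normalizeSpec` are hypothetical names, and `Option` is used here only as a stand-in for a SQL NULL.

```scala
// Hypothetical, Spark-free sketch of replacing the marker string with null
// before a partition spec reaches a table implementation.
object DefaultPartitionSketch {
  // The marker string discussed in this thread.
  val DefaultPartitionName = "__HIVE_DEFAULT_PARTITION__"

  // Map each raw partition value to None (standing in for NULL) when it
  // equals the marker, and to Some(raw) otherwise.
  def normalizeSpec(spec: Map[String, String]): Map[String, Option[String]] =
    spec.map { case (col, raw) =>
      col -> (if (raw == DefaultPartitionName) None else Some(raw))
    }
}
```

Under these assumptions, a spec such as `Map("part0" -> "__HIVE_DEFAULT_PARTITION__")` normalizes to `Map("part0" -> None)`, so downstream code only ever sees a null-like value and never the marker string itself.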
Test build #131647 has finished for PR 30482 at commit
jenkins, retest this, please
Test build #131694 has started for PR 30482 at commit
jenkins, retest this, please
Test build #131833 has finished for PR 30482 at commit
@cloud-fan Should I close this?
Yea let's close it.
What changes were proposed in this pull request?
sql.util.PartitioningUtils - the method castPartitionValues().
castPartitionValues()
from DSv2 resolver of partition specs - ResolvePartitionSpec.
Why are the changes needed?
To have the same behavior as DSv1, which interprets __HIVE_DEFAULT_PARTITION__ as NULL.
Does this PR introduce any user-facing change?
Yes
How was this patch tested?
Added a new test to AlterTablePartitionV2SQLSuite.