Support comparing ORC data #1545
Signed-off-by: Allen Xu <allxu@nvidia.com>
LGTM, nits
  path: String => spark.read.csv(path)
case "parquet" =>
  path: String => spark.read.parquet(path)
case "orc" =>
  path: String => spark.read.orc(path)
nit: this block L523-L530 can be a one-liner:
val readPathAction = (path: String) => spark.read.format(inputFormat).load(path)
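A minimal sketch of why the one-liner should behave the same as the per-format match, assuming a local SparkSession and a scratch temp directory (the object name, session settings, and sample data are hypothetical, not from the PR). `spark.read.orc(path)` and `spark.read.parquet(path)` are shorthands for `spark.read.format(...).load(path)`, so both routes should return the same schema and rows:

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadByFormatSketch {
  // Generic reader: the format name alone selects the data source.
  def readPathAction(spark: SparkSession, inputFormat: String): String => DataFrame =
    (path: String) => spark.read.format(inputFormat).load(path)

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, not the benchmark's real configuration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("read-by-format-sketch")
      .getOrCreate()
    import spark.implicits._

    val dir = Files.createTempDirectory("fmt-sketch").toString
    val df = Seq((1, "a"), (2, "b")).toDF("id", "name")

    for (fmt <- Seq("parquet", "orc")) {
      val path = s"$dir/$fmt"
      df.write.format(fmt).save(path)

      // Per-format shortcut, as in the original match expression.
      val viaShortcut = if (fmt == "orc") spark.read.orc(path) else spark.read.parquet(path)
      // Generic route, as in the suggested one-liner.
      val viaFormat = readPathAction(spark, fmt)(path)

      assert(viaShortcut.schema == viaFormat.schema)
      assert(viaShortcut.except(viaFormat).isEmpty)
      assert(viaFormat.except(viaShortcut).isEmpty)
    }
    spark.stop()
  }
}
```

One difference worth keeping in mind: with the generic route, an unrecognised `inputFormat` is only rejected by Spark when `load` is called, rather than by the match expression.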
updated.
@@ -53,6 +53,8 @@ object CompareResults {
(spark.read.csv(conf.input1()), spark.read.csv(conf.input2()))
Similarly, lines https://github.com/NVIDIA/spark-rapids/pull/1545/files#diff-bc99ae208fb44d4abcf91b6e5b44515fbebeb90367c5f9eb1719e28226d22bf9R51-R68 can be:
val format = spark.read.format(conf.inputFormat())
BenchUtils.compareResults(
format.load(conf.input1()),
format.load(conf.input2()),
conf.inputFormat(),
conf.ignoreOrdering(),
conf.useIterator(),
conf.maxErrors(),
conf.epsilon())
}
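If the per-format match was also acting as a check on which formats are accepted (an assumption, since the full match is not visible in this thread), that check could be kept separately from the generic reader. A rough sketch, where `InputFormatGuard`, `supported`, and `validate` are hypothetical names and the set simply lists the three formats mentioned in this PR:

```scala
import java.util.Locale

object InputFormatGuard {
  // Hypothetical whitelist: the formats that appear in this PR's diff.
  val supported: Set[String] = Set("csv", "parquet", "orc")

  // Returns the normalized format name, or throws IllegalArgumentException
  // before any read is attempted.
  def validate(inputFormat: String): String = {
    val normalized = inputFormat.toLowerCase(Locale.ROOT)
    require(supported.contains(normalized), s"Unsupported input format: $inputFormat")
    normalized
  }
}
```

With something like this, `spark.read.format(InputFormatGuard.validate(conf.inputFormat()))` would fail fast on a typo instead of deferring the error to Spark's data source lookup.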
updated.
integration_tests/src/main/scala/com/nvidia/spark/rapids/tests/common/BenchUtils.scala
integration_tests/src/main/scala/com/nvidia/spark/rapids/tests/common/CompareResults.scala
Signed-off-by: Allen Xu <allxu@nvidia.com>
build
build
LGTM
* Support comparing ORC data
  Signed-off-by: Allen Xu <allxu@nvidia.com>
* clean code
* Add 2021 copyright
  Signed-off-by: Allen Xu <allxu@nvidia.com>
* fix bug
Co-authored-by: Allen Xu <allxu@nvidia.com>
…IDIA#1545) Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>
Signed-off-by: Allen Xu <allxu@nvidia.com>
This is to resolve #1544