Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add array_has_duplicates Presto function #3320

Closed

Conversation

duanmeng
Copy link
Collaborator

@duanmeng duanmeng commented Nov 21, 2022

No description provided.

@netlify
Copy link

netlify bot commented Nov 21, 2022

Deploy Preview for meta-velox ready!

Name Link
🔨 Latest commit 73667f0
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/63897c4484f48500094ebcff
😎 Deploy Preview https://deploy-preview-3320--meta-velox.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 21, 2022
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duanmeng Any particular reason to implement this function as a vector function? It seems it can be implemented as a simple function, no?

@duanmeng
Copy link
Collaborator Author

Details
@mbasmanova I follow the guidance of developer guide, which mentioned "Simple functions process a single row and produce a single value as a result. Vector functions process a batch or rows and produce a vector of results. ". And I reference the implementation of ArrayDuplicates and ArrayContains.

@mbasmanova
Copy link
Contributor

@duanmeng I suggest to implement this function as a simple function. (ArrayDuplicates and ArrayContains may have been implemented before simple functions supported complex types. It would be nice to convert these to simple functions as well.)

@duanmeng
Copy link
Collaborator Author

@duanmeng I suggest to implement this function as a simple function. (ArrayDuplicates and ArrayContains may have been implemented before simple functions supported complex types. It would be nice to convert these to simple functions as well.)

@mbasmanova Got it, I will change my implementation.

@duanmeng
Copy link
Collaborator Author

@mbasmanova I've updated the implementation using simple functions, could you please help to review?

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duanmeng Thank you for iterating on this PR. Overall looks good.

Please, update the PR title to use the template from https://github.com/facebookincubator/velox/blob/main/CONTRIBUTING.md#prestos-sql-functions and add documentation.

Please, also run Fuzzer:

velox_expression_fuzzer_test --logtostderr=1 --velox_fuzzer_enable_complex_types --only array_has_duplicates --duration_sec 60

velox_expression_fuzzer_test --logtostderr=1 --velox_fuzzer_enable_complex_types --enable_variadic_signatures --duration_sec 600

velox/functions/prestosql/tests/ArrayHasDuplicatesTest.cpp Outdated Show resolved Hide resolved
velox/functions/prestosql/tests/ArrayHasDuplicatesTest.cpp Outdated Show resolved Hide resolved
velox/functions/prestosql/tests/ArrayHasDuplicatesTest.cpp Outdated Show resolved Hide resolved
velox/functions/prestosql/tests/ArrayHasDuplicatesTest.cpp Outdated Show resolved Hide resolved

auto array = makeNullableArrayVector<StringView>({
{},
{S("")},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use ""_sv instead

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

velox/functions/prestosql/tests/ArrayHasDuplicatesTest.cpp Outdated Show resolved Hide resolved
velox/functions/prestosql/tests/ArrayHasDuplicatesTest.cpp Outdated Show resolved Hide resolved
velox/functions/prestosql/tests/ArrayHasDuplicatesTest.cpp Outdated Show resolved Hide resolved
velox/functions/prestosql/tests/CMakeLists.txt Outdated Show resolved Hide resolved
@duanmeng duanmeng changed the title add prestosql function array_has_duplicates Add array_has_duplicates Presto function Nov 22, 2022
@duanmeng duanmeng force-pushed the array_has_duplicates branch 2 times, most recently from 2c0867a to cfeaf07 Compare November 22, 2022 14:16
@duanmeng
Copy link
Collaborator Author

duanmeng commented Nov 22, 2022

@mbasmanova I've resolved the comments and updated the PR title, and doc, and ran the two fuzzer tests following your guidance.

The partial log of the two tests

./_build/release/velox/expression/tests/velox_expression_fuzzer_test --logtostderr=1 --velox_fuzzer_enable_complex_types --only array_has_duplicates --duration_sec 60
./_build/release/velox/expression/tests/velox_expression_fuzzer_test --logtostderr=1 --velox_fuzzer_enable_complex_types --enable_variadic_signatures --duration_sec 600

Succeed

@duanmeng
Copy link
Collaborator Author

@mbasmanova Could you please help to take a look?

@duanmeng
Copy link
Collaborator Author

duanmeng commented Dec 2, 2022

@xiaoxmeng @Yuhta Could you help to review this pr, I've resolved all the comments from @mbasmanova, she seems to be on vacation.

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duanmeng Looks good to me. Thank you for the contribution.

@facebook-github-bot
Copy link
Contributor

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@duanmeng
Copy link
Collaborator Author

duanmeng commented Dec 2, 2022

@mbasmanova Thanks for your help on my first add function pr in velox:). Can I do something about the Facebook internal builds & tests?

});
auto expected = makeFlatVector<bool>({false, true, false, true});
testExpr(expected, "array_has_duplicates(C0)", {array});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution, @duanmeng. I have closed my duplicate PR. Feel free to borrow more unit test cases, such as constant vectors, nonContiguousRows, here.

https://github.com/facebookincubator/velox/pull/3397/files#diff-198e97a86767f3f7b75862d0350895b93496dc376dbf8a1fe3bc384cec368076

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution, @duanmeng. I have closed my duplicate PR. Feel free to borrow more unit test cases, such as constant vectors, nonContiguousRows, here.

https://github.com/facebookincubator/velox/pull/3397/files#diff-198e97a86767f3f7b75862d0350895b93496dc376dbf8a1fe3bc384cec368076

@darrenfu Thanks for your suggestion, I used to add the constant and non Contiguous Rows test case referring to the test case in Array Duplicates Test and removed them according to the comments "This test is redundant as this logic is processed by the framework".

@mbasmanova
Copy link
Contributor

@duanmeng Thank you for the contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants