-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Malformed array vector result from zip() in simplified path #3542
Comments
The bug is in the fast path of the velox/velox/functions/prestosql/Zip.cpp Lines 113 to 153 in a5bd212
This fast path creates a row vector of the size equals to the sum of input array sizes, and then create a result array vector on top of this row vector (as the elements vector) using the original offsets() buffer of the first input array vector. There are at least two problems: 1. the sum of input array sizes doesn't necessarily equal to the element vector sizes of the input arrays, and 2. the first input array's offsets() buffer doesn't necessarily align with the second input array's elements. |
Summary: The fuzzer discovered an issue with the Zip UDF facebookincubator#3542 I hadn't accounted for the fact that just because all of the arrays are flat and the same size, doesn't mean they all have the same offsets into their elements Vectors. E.g. with 3 arrays of size 3, the offsets could be 0, 3, 6 or 6, 3, 0 or 0, 4, 7 (if there are unused elements) When this is the case the fast path added to the Zip UDF produces incorrect results. The fix is just to validate the offsets are the same in addition to the sizes. Differential Revision: D42183237 fbshipit-source-id: 14b312ac3d8541ae531f440b123f13eb81d92f08
…cubator#3564) Summary: Pull Request resolved: facebookincubator#3564 The fuzzer discovered an issue with the Zip UDF facebookincubator#3542 I hadn't accounted for the fact that just because all of the arrays are flat and the same size, doesn't mean they all have the same offsets into their elements Vectors. E.g. with 3 arrays of size 3, the offsets could be 0, 3, 6 or 6, 3, 0 or 0, 4, 7 (if there are unused elements) When this is the case the fast path added to the Zip UDF produces incorrect results. The fix is just to validate the offsets are the same in addition to the sizes. Reviewed By: kagamiori Differential Revision: D42183237 fbshipit-source-id: 3093ab55d8c848e411772fd3bfdca32b38ce6671
Summary: Pull Request resolved: #3564 The fuzzer discovered an issue with the Zip UDF #3542 I hadn't accounted for the fact that just because all of the arrays are flat and the same size, doesn't mean they all have the same offsets into their elements Vectors. E.g. with 3 arrays of size 3, the offsets could be 0, 3, 6 or 6, 3, 0 or 0, 4, 7 (if there are unused elements) When this is the case the fast path added to the Zip UDF produces incorrect results. The fix is just to validate the offsets are the same in addition to the sizes. Reviewed By: kagamiori Differential Revision: D42183237 fbshipit-source-id: 5aa720d9fb10eaa777f186dcc08d056e639a8ee6
@kagamiori Is this fixed now? |
Yes. Closing this issue since it's fixed in #3564. |
Description
This is an error found by fuzzer when comparing the results from the common and simplified path. Results are array vectors and the result from the simplified path has rawOffsets_[i] exceeding the length of element vector.
Error Reproduction
Run the following command with the attached files:
velox/expression/tests/velox_expression_runner_test --fuzzer_repro_path /path/to/velox_expressionVerifier_8R09KJ --mode verify
velox_expressionVerifier_8R09KJ.zip
Relevant logs
The text was updated successfully, but these errors were encountered: