
[FEA] Add Host Memory Retry Columnar To Row Conversion #8886

Open · 2 tasks done
revans2 opened this issue Jul 31, 2023 · 0 comments
Assignees
Labels
reliability: Features to improve reliability or bugs that severely impact the reliability of the plugin
task: Work required that improves the product but is not user facing

Comments

revans2 (Collaborator) commented Jul 31, 2023

Is your feature request related to a problem? Please describe.
GpuColumnarToRowExec is a little complicated because we have a number of different optimizations around it. Ultimately we need a good way to limit the amount of host memory that it can use.

#9862 should give us hard limits, but we also want to be able to retry an allocation when it fails, just as we do with the GPU retry framework. This issue covers adding that retry where it is needed.
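As a rough illustration of the kind of retry loop intended here (this is only a sketch; the allocation call and the `releaseHostMemory` hook are stand-ins, not the plugin's actual retry framework API):

```scala
import ai.rapids.cudf.HostMemoryBuffer

// Sketch only: attempt a host allocation and, on failure, ask some hook to
// free host memory (spill, block lower-priority work, etc.) before retrying.
// Assumes an allocation failure surfaces as OutOfMemoryError.
def allocateHostWithRetry(
    sizeBytes: Long,
    releaseHostMemory: () => Unit,
    maxAttempts: Int = 3): HostMemoryBuffer = {
  var attempt = 0
  var result: HostMemoryBuffer = null
  while (result == null) {
    try {
      result = HostMemoryBuffer.allocate(sizeBytes)
    } catch {
      case oom: OutOfMemoryError =>
        attempt += 1
        if (attempt >= maxAttempts) throw oom
        releaseHostMemory() // hypothetical hook standing in for the real framework
    }
  }
  result
}
```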

The accelerated transpose case (AcceleratedColumnarToRowIterator) converts the data to one or more HostColumnVector instances that are lists of bytes, which hold a row format similar to UnsafeRow in Spark.

The non-accelerated case (ColumnarToRowIterator) will just copy the data to the host and then walk that data one row at a time.

The primary goal here is to limit host memory usage without deadlocking. The easiest way to do this would be to make the HostColumnVectors spillable. We would then acquire the wrapped column vectors each time we needed to read some data and release them when we were done. This would work, but it is far from ideal, especially if we have a lot of columns and spilling is happening regularly.
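A very rough sketch of that acquire/release pattern (the `SpillableHostColumns` trait and its method names are hypothetical, not existing plugin types):

```scala
import ai.rapids.cudf.HostColumnVector

// Hypothetical wrapper: the vectors can be spilled whenever they are not
// acquired, and are re-materialized on the next acquire().
trait SpillableHostColumns extends AutoCloseable {
  def acquire(): Array[HostColumnVector]
  def release(): Unit
}

// Read some rows while the data is pinned on the host, then let it become
// spillable again. If spilling happens often, this re-materialization cost is
// exactly the downside described above.
def withHostColumns[T](spillable: SpillableHostColumns)(body: Array[HostColumnVector] => T): T = {
  val cols = spillable.acquire()
  try {
    body(cols)
  } finally {
    spillable.release()
  }
}
```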

It would be better if we could chunk the data on demand into smaller chunks.

For the accelerated case I think we could adjust the limits it currently has in place so that we pass a target size down to the kernels. They would then return chunks of rows that are roughly that size. When we copy them back to the host there would be more heap overhead to keep track of more objects, but it would also reduce the maximum amount of non-spillable memory we hold at any point in time, as well as the amount of memory that might need to be read back in each time.
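A back-of-the-envelope sketch of how a target size could translate into row chunks (the helper name and sizing heuristic are illustrative only; the real kernels would do the sizing themselves):

```scala
// Sketch: estimate an average row size and derive (startRow, rowCount) ranges
// that are each roughly targetChunkBytes, so only about one chunk's worth of
// non-spillable host memory needs to be live at a time.
def planRowChunks(
    numRows: Long,
    totalDataSizeBytes: Long,
    targetChunkBytes: Long): Seq[(Long, Long)] = {
  val avgRowBytes = math.max(1L, totalDataSizeBytes / math.max(1L, numRows))
  val rowsPerChunk = math.max(1L, targetChunkBytes / avgRowBytes)
  (0L until numRows by rowsPerChunk).map { start =>
    (start, math.min(rowsPerChunk, numRows - start))
  }
}
```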

For the non-accelerated case I think we would need a contiguous-split-like API or something similar. I am not thrilled that it would require extra computation, but perhaps we could do it dynamically, like we do for spill with bounce buffers and the chunked pack API there. This might need to be a follow-on piece of work, as it is likely a lot more complicated.
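For illustration, one way this could look with cudf's existing contiguousSplit, copying one piece to the host at a time (the helper and the choice of split points are assumptions, and the extra GPU-side copy is the added computation mentioned above):

```scala
import ai.rapids.cudf.{HostColumnVector, Table}

// Sketch only: split the device table into contiguous row chunks, then copy
// and process one chunk's columns on the host at a time, so only roughly one
// chunk of host memory is in use at any moment.
def copyToHostInChunks(table: Table, splitRows: Array[Int])(
    processHostChunk: Array[HostColumnVector] => Unit): Unit = {
  val pieces = table.contiguousSplit(splitRows: _*)
  try {
    pieces.foreach { piece =>
      val t = piece.getTable
      // Copy just this chunk's columns to the host, walk its rows, then free
      // the host copies before touching the next chunk.
      val hostCols = (0 until t.getNumberOfColumns).map(i => t.getColumn(i).copyToHost()).toArray
      try {
        processHostChunk(hostCols)
      } finally {
        hostCols.foreach(_.close())
      }
    }
  } finally {
    pieces.foreach(_.close())
  }
}
```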

Tasks

  1. test
  2. test
revans2 added the "? - Needs Triage", "task", and "reliability" labels on Jul 31, 2023
mattahrens removed the "? - Needs Triage" label on Aug 8, 2023
revans2 changed the title from "[FEA] Add Host Memory Limits for Columnar To Row Conversion" to "[FEA] Add Host Memory Retry Columnar To Row Conversion" on Nov 27, 2023