Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add retry support to RowToColumnarIterator #9066

Merged
merged 3 commits into from
Aug 21, 2023

Conversation

firestarman
Copy link
Collaborator

close #8350

This PR adds in the retry support to RowToColumnarIterator.

This PR adds a new API named tryBuild in the GpuColumnarBatchBuilder class to support the retry mechanism.
Builders can build the columns only once. So this change holds the host columns built from the builders and reuses the host columns to build the columnar batch as many times as it needs.

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
@firestarman
Copy link
Collaborator Author

build

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
@firestarman
Copy link
Collaborator Author

build

try {
b.close();
} catch (Throwable e) {
/* ignore the exception */
Copy link
Collaborator

@abellina abellina Aug 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this, and the next catch seem a bit odd. What exception were you seeing?

Should this instead use something like safeClose() (we only have that in scala right now) and produce a single exception after trying to close all, and returning a single exception object that has the individual exceptions supressed?

Copy link
Collaborator Author

@firestarman firestarman Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to make sure it will always try to close the HostColumns no matter if any exception happens when closing the builders. Something likes

      try {
        for (ai.rapids.cudf.HostColumnVector.ColumnBuilder b: builders) {
          if (b != null) {
            b.close();
          }
        }
      } finally {
        if (hostColumns != null) {
          for (ai.rapids.cudf.HostColumnVector hcv: hostColumns) {
            if (hcv != null) {
              hcv.close();
            }
          }
          hostColumns = null;
        }
      }

Is this OK ? @abellina @revans2, or do you want me to do as the following. Java requires explicit declaration for non runtime exceptions, so we only need to catch the runtime exceptions here.

      RuntimeException toThrow = null;
      for (ai.rapids.cudf.HostColumnVector.ColumnBuilder b: builders) {
        if (b != null) {
          try {
            b.close();
          } catch (RuntimeException e) {
            if (toThrow == null) {
              toThrow = e;
            } else {
              toThrow.addSuppressed(e);
            }
          }
        }
      }
      if (hostColumns != null) {
        for (ai.rapids.cudf.HostColumnVector hcv: hostColumns) {
          if (hcv != null) {
            try {
              hcv.close();
            } catch (RuntimeException e) {
              if (toThrow == null) {
                toThrow = e;
              } else {
                toThrow.addSuppressed(e);
              }
            }
          }
        }
        hostColumns = null;
      }
      if (toThrow != null) {
        throw toThrow;
      }
    }

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Picked the first one. If you prefer to the second option, I will update it accordingly.

Copy link
Collaborator

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @abellina I am not a fan of throwing away an exception, especially a generic Throwable.

@sameerz sameerz added the reliability Features to improve reliability or bugs that severly impact the reliability of the plugin label Aug 17, 2023
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
@firestarman
Copy link
Collaborator Author

build

@firestarman firestarman merged commit 5cafe66 into NVIDIA:branch-23.10 Aug 21, 2023
26 of 27 checks passed
@firestarman firestarman deleted the retry-row2col branch August 21, 2023 01:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
reliability Features to improve reliability or bugs that severly impact the reliability of the plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Add retry support to RowToColumnarIterator
4 participants