Pin to aws-sdk-cpp<1.11 (rapidsai#14173)
Pin conda packages to `aws-sdk-cpp<1.11`. The recent upgrade to version `1.11.*` has caused several issues during cleanup (details on the changes are described in [this link](https://github.com/aws/aws-sdk-cpp#version-111-is-now-available)), causing Distributed and Dask-CUDA processes to segfault. The stack for one of those crashes looks like the following:

```
(gdb) bt
#0  0x00007f5125359a0c in Aws::Utils::Logging::s_aws_logger_redirect_get_log_level(aws_logger*, unsigned int) () from /opt/conda/envs/dask/lib/python3.9/site-packages/pyarrow/../../.././libaws-cpp-sdk-core.so
#1  0x00007f5124968f83 in aws_event_loop_thread () from /opt/conda/envs/dask/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-io.so.1.0.0
#2  0x00007f5124ad9359 in thread_fn () from /opt/conda/envs/dask/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1
#3  0x00007f519958f6db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4  0x00007f5198b1361f in clone () from /lib/x86_64-linux-gnu/libc.so.6
```

Such segfaults now manifest frequently in CI, and in some cases are reproducible with a hit rate of ~30%. Given the approaching release, the safest option is probably to pin to an older version of the package until we pinpoint the exact cause of the issue and a patched build is released upstream.

`aws-sdk-cpp` is statically linked into the `pyarrow` pip package, which prevents us from using the same pinning technique there. cuDF is currently pinned to `pyarrow=12.0.1`, which seems to be built against `aws-sdk-cpp=1.10.*`, as per [recent build logs](https://github.com/apache/arrow/actions/runs/6276453828/job/17046177335?pr=37792#step:6:1372).
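
To illustrate what the `<1.11` upper bound admits, here is a minimal sketch in Python. It is not part of this change and does not reproduce conda's actual version matcher; the helper names `parse_version` and `satisfies_upper_bound` are hypothetical, and the sketch assumes purely numeric dotted versions (no pre-release or build suffixes).

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '1.10.57' into (1, 10, 57).

    Assumption: every component is a plain integer.
    """
    return tuple(int(part) for part in version.split("."))


def satisfies_upper_bound(version: str, upper_bound: str = "1.11") -> bool:
    """Return True if `version` sorts strictly below `upper_bound`.

    Tuples compare element by element, and a longer tuple with an equal
    prefix sorts after the shorter one, so '1.11.0' is not below '1.11'.
    """
    return parse_version(version) < parse_version(upper_bound)


if __name__ == "__main__":
    # Illustrative version strings only, not taken from any build log.
    for candidate in ("1.10.57", "1.11.0"):
        print(candidate, satisfies_upper_bound(candidate))
```

Under these assumptions any `1.10.*` version passes the check, while every `1.11.*` version is rejected, which is the behaviour the pin relies on.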

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#14173
pentschev authored Sep 22, 2023
1 parent a6d014e commit 40bdd8a
Showing 5 changed files with 8 additions and 0 deletions.
1 change: 1 addition & 0 deletions conda/environments/all_cuda-118_arch-x86_64.yaml
@@ -9,6 +9,7 @@ channels:
- nvidia
dependencies:
- aiobotocore>=2.2.0
- aws-sdk-cpp<1.11
- benchmark==1.8.0
- boto3>=1.21.21
- botocore>=1.24.21
1 change: 1 addition & 0 deletions conda/environments/all_cuda-120_arch-x86_64.yaml
@@ -9,6 +9,7 @@ channels:
- nvidia
dependencies:
- aiobotocore>=2.2.0
- aws-sdk-cpp<1.11
- benchmark==1.8.0
- boto3>=1.21.21
- botocore>=1.24.21
3 changes: 3 additions & 0 deletions conda/recipes/libcudf/conda_build_config.yaml
@@ -22,6 +22,9 @@ gbench_version:
gtest_version:
- ">=1.13.0"

aws_sdk_cpp_version:
- "<1.11"

libarrow_version:
- "=12"

2 changes: 2 additions & 0 deletions conda/recipes/libcudf/meta.yaml
@@ -74,6 +74,7 @@ requirements:
- gtest {{ gtest_version }}
- gmock {{ gtest_version }}
- zlib {{ zlib_version }}
- aws-sdk-cpp {{ aws_sdk_cpp_version }}

outputs:
- name: libcudf
@@ -107,6 +108,7 @@ outputs:
- dlpack {{ dlpack_version }}
- gtest {{ gtest_version }}
- gmock {{ gtest_version }}
- aws-sdk-cpp {{ aws_sdk_cpp_version }}
test:
commands:
- test -f $PREFIX/lib/libcudf.so
1 change: 1 addition & 0 deletions dependencies.yaml
@@ -218,6 +218,7 @@ dependencies:
- libkvikio==23.10.*
- output_types: conda
packages:
- aws-sdk-cpp<1.11
- fmt>=9.1.0,<10
- &gbench benchmark==1.8.0
- &gtest gtest>=1.13.0