Merge branch-23.12 into branch-24.02 #14414
Update the nvCOMP version used for cuIO compression/decompression to 3.0.4. Authors: - Vukasin Milovanovic (https://github.com/vuule) - Bradley Dice (https://github.com/bdice) Approvers: - Bradley Dice (https://github.com/bdice) - Ray Douglass (https://github.com/raydouglass) URL: rapidsai#13815
All of these wrappers have now been upstreamed into Cython as of Cython 3.0.3. Contributes to rapidsai#14023 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Bradley Dice (https://github.com/bdice) - Jake Awe (https://github.com/AyodeAwe) URL: rapidsai#14382
Creates a normalizing offsets iterator that returns an int64 value given either int32 or int64 column data. Depends on rapidsai#14206 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Divye Gala (https://github.com/divyegala) - Yunsong Wang (https://github.com/PointKernel) URL: rapidsai#14234
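The idea of a normalizing offsets iterator can be sketched in Python with the standard library. This is only an illustration of the concept, not the libcudf implementation: the class name `NormalizedOffsets` and the example buffers are hypothetical, and Python integers stand in for int64 values.

```python
import struct


class NormalizedOffsets:
    """Hypothetical read-only view over a 32-bit or 64-bit offsets buffer.

    Every element is returned as a plain Python int, so callers do not
    need to know which width the underlying buffer uses.
    """

    def __init__(self, data: bytes, width: int):
        if width not in (4, 8):
            raise ValueError("offsets must be 4 or 8 bytes wide")
        self._data = data
        self._width = width
        # "=i" is a 4-byte signed int, "=q" an 8-byte signed int.
        self._fmt = "=i" if width == 4 else "=q"

    def __len__(self) -> int:
        return len(self._data) // self._width

    def __getitem__(self, index: int) -> int:
        if not 0 <= index < len(self):
            raise IndexError(index)
        return struct.unpack_from(self._fmt, self._data, index * self._width)[0]


# The same consumer code works for both widths.
small = NormalizedOffsets(struct.pack("=3i", 0, 5, 9), width=4)
large = NormalizedOffsets(struct.pack("=3q", 0, 5, 2**40), width=8)
```

The width is chosen once at construction, so the element access path stays identical for both input types.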
…rapidsai#14364) * Update dependency lists * Update wheel building to stop needing manual installations * Update wheel dependency with alpha spec * Rename the package * Update update-version.sh * Update conda/recipes/dask-cudf/meta.yaml Co-authored-by: GALI PREM SAGAR <sagarprem75@gmail.com> * Make pip/conda dependencies consistent and fix recipe * dfg * Apply suggestions from code review --------- Co-authored-by: GALI PREM SAGAR <sagarprem75@gmail.com>
…sai#14399) Corrects failures seen in C++ CI where libnvbench.so can't be found Authors: - Robert Maynard (https://github.com/robertmaynard) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Bradley Dice (https://github.com/bdice) URL: rapidsai#14399
…14388) Closes rapidsai#14384. `x.startswith(y)` is not a good enough check for if `x` is a subdirectory of `y`. It causes `pandasai` to be reported as a sub-package of `pandas`. Authors: - Ashwin Srinath (https://github.com/shwina) Approvers: - https://github.com/brandon-b-miller URL: rapidsai#14388
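The prefix problem described in this commit message can be reproduced with a few lines of standard-library Python. This is a minimal sketch and not the code from the fix; the helper name `is_subpath` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def is_subpath(child: str, parent: str) -> bool:
    """Return True if `child` is `parent` itself or lies below it (hypothetical helper)."""
    child_parts = PurePosixPath(child).parts
    parent_parts = PurePosixPath(parent).parts
    # Compare whole path components rather than raw characters.
    return child_parts[: len(parent_parts)] == parent_parts


# Hypothetical install locations, for illustration only.
pandas_dir = "/site-packages/pandas"
pandasai_file = "/site-packages/pandasai/agent.py"
pandas_file = "/site-packages/pandas/core/frame.py"

# A raw string prefix check wrongly matches the sibling package.
naive = pandasai_file.startswith(pandas_dir)
# Component-wise comparison does not.
strict = is_subpath(pandasai_file, pandas_dir)
```

Comparing path components avoids the false positive because `pandasai` and `pandas` are different components even though one string is a prefix of the other.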
Refactor the currently outdated cudf_kafka build setup to use skbuild instead. Authors: - Jeremy Dyer (https://github.com/jdye64) - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Bradley Dice (https://github.com/bdice) - AJ Schmidt (https://github.com/ajschmidt8) URL: rapidsai#14292
Adds a new BytePairEncoder class to cuDF
```
>>> import cudf
>>> from cudf.core.byte_pair_encoding import BytePairEncoder
>>> mps = cudf.read_text('merges.txt', delimiter='\n', strip_delimiters=True)
>>> bpe = BytePairEncoder(mps)
>>> str_series = cudf.Series(['This is a sentence', 'thisisit'])
>>> bpe(str_series)
0    This is a sent ence
1             this is it
dtype: object
```
This class wraps the existing `nvtext::byte_pair_encoding` APIs to load the merge-pairs data and encode a column of strings. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#13891
…4393) Fixes a bug introduced in rapidsai#14336 when trying to simplify the token-counting logic as per this discussion rapidsai#14336 (comment) The simplification caused an error which was found when running the nvtext benchmarks. The appropriate gtest has been updated to cover this case now. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Karthikeyan (https://github.com/karthikeyann) URL: rapidsai#14393
This PR switches remaining usages of `dask` dependencies to use `rapids-dask-dependency` Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Bradley Dice (https://github.com/bdice) - Jake Awe (https://github.com/AyodeAwe) - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#14407
This PR contributes to rapidsai#13744.
- Added stream parameters to public APIs `cudf::io::read_csv` and `cudf::io::write_csv`
- Added stream gtests

Authors: - https://github.com/shrshi - Karthikeyan (https://github.com/karthikeyann) Approvers: - Karthikeyan (https://github.com/karthikeyann) - Vukasin Milovanovic (https://github.com/vuule) - Yunsong Wang (https://github.com/PointKernel) URL: rapidsai#14340
…ai#14411) Port NVIDIA/nvbench#148 to cudf so that nvbench benchmarks work now that we always use a static version of nvbench. Authors: - Robert Maynard (https://github.com/robertmaynard) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#14411
…apidsai#14390) Noticed this while trying to clean up `as_column` Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#14390
…idsai#14367) Issue rapidsai#14325 Use uint when reading/writing nano stats because nanoseconds have int32 encoding (different from both uint32 and sint32, _obviously_), which does not use zigzag. sint32 uses zigzag, and uint32 does not allow negative numbers, so we can use uint since we'll never have negative nanoseconds. Also disabled the nanoseconds because they should only be written after ORC-135; we don't write the version, so readers get confused if nanoseconds are there. Planning to re-enable once we start writing the version. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Nghia Truong (https://github.com/ttnghia) URL: rapidsai#14367
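The encoding distinction in this commit message can be illustrated with a small standard-library sketch. It assumes protobuf-style wire encoding, which is what the `int32`/`sint32`/`uint32` names suggest; the function names are hypothetical and this is not the cuDF ORC code.

```python
def encode_varint(value: int) -> bytes:
    """Base-128 varint of a non-negative integer, least significant group first."""
    if value < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more groups follow
        else:
            out.append(byte)
            return bytes(out)


def zigzag32(n: int) -> int:
    """Map a signed 32-bit integer to unsigned: 0, -1, 1, -2 become 0, 1, 2, 3."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def encode_uint32(n: int) -> bytes:
    return encode_varint(n)


def encode_int32(n: int) -> bytes:
    # Negative values are sign-extended to 64 bits before varint encoding.
    return encode_varint(n & 0xFFFFFFFFFFFFFFFF)


def encode_sint32(n: int) -> bytes:
    return encode_varint(zigzag32(n))


nanos = 999_999_999  # largest nanosecond count within one second
```

For non-negative values the `int32` and `uint32` encodings produce the same bytes, while `sint32` does not, which is consistent with the reasoning in the message that an unsigned reader/writer is safe for nanoseconds.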
Fixes: rapidsai#14398 This PR raises an error in `reindex` API when reindexing is performed on a non-unique index column. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Matthew Roeschke (https://github.com/mroeschke) - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#14400
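Why reindexing on a non-unique index is rejected can be seen in a pure-Python sketch. The function below is hypothetical and is not the cuDF implementation; the error type and message are assumptions chosen for illustration.

```python
def reindex(index, values, new_index, fill=None):
    """Hypothetical label-based reindex over parallel lists."""
    # With duplicate labels, a label no longer identifies a single row,
    # so the lookup below would silently keep only the last match.
    if len(set(index)) != len(index):
        raise ValueError("cannot reindex: index has duplicate labels")
    lookup = dict(zip(index, values))
    return [lookup.get(label, fill) for label in new_index]


result = reindex(["a", "b"], [1, 2], ["b", "c", "a"])

try:
    reindex(["a", "a"], [1, 2], ["a"])
    raised = False
except ValueError:
    raised = True
```

Raising early makes the ambiguity explicit instead of returning a result that depends on row order.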
c94a1f3 to 86fe759
86fe759 to 0e4851b
Marking with |
We need kvikio's PR to merge first so that we can get kvikio builds up for CI to run here. |
There's an error in this forward merger in the cudf-kafka CMakeLists.txt. I'm not sure if that's the only issue, so I'm going to try and do another version of the forward merge myself then diff against this PR to verify. |
@bdice wants to make the PR, so I'm just going to verify that I see the differences I expect when he's done. |
closing for #14422 |
For the record, I diffed the two PR branches and @bdice's has the change I was expecting to see:
namely the |
Whoops just realized that the |
Done: e4e6975 |
Description
This PR resolves conflicts in #14406