[RELEASE] cudf v22.12 #12200

Merged 252 commits on Dec 8, 2022
Changes from 225 commits
Commits
ba9c43c
Merge pull request #11757 from rapidsai/branch-22.10
GPUtester Sep 23, 2022
7376f1f
Merge pull request #11758 from rapidsai/branch-22.10
GPUtester Sep 23, 2022
59847c1
Merge pull request #11763 from rapidsai/branch-22.10
GPUtester Sep 24, 2022
41474af
Merge pull request #11767 from rapidsai/branch-22.10
GPUtester Sep 26, 2022
5fb657d
Merge pull request #11773 from rapidsai/branch-22.10
GPUtester Sep 26, 2022
2b94483
Merge pull request #11774 from rapidsai/branch-22.10
GPUtester Sep 26, 2022
7d40b30
Merge pull request #11775 from rapidsai/branch-22.10
GPUtester Sep 26, 2022
a1cbb02
Merge pull request #11776 from rapidsai/branch-22.10
GPUtester Sep 26, 2022
aa2ef0e
Merge pull request #11777 from rapidsai/branch-22.10
GPUtester Sep 26, 2022
cc6f237
Merge pull request #11781 from rapidsai/branch-22.10
GPUtester Sep 27, 2022
cc97584
Merge pull request #11782 from rapidsai/branch-22.10
GPUtester Sep 27, 2022
1d7af9e
Merge pull request #11784 from rapidsai/branch-22.10
GPUtester Sep 27, 2022
b8ab576
Merge pull request #11786 from rapidsai/branch-22.10
GPUtester Sep 27, 2022
54480f3
Merge branch-22.10 into branch 22.12
davidwendt Sep 28, 2022
f72c4ce
add change from 11771
davidwendt Sep 28, 2022
017d85f
Merge pull request #11801 from davidwendt/branch-22.12-merge-22.10
raydouglass Sep 28, 2022
479514e
Merge pull request #11805 from rapidsai/branch-22.10
GPUtester Sep 28, 2022
5cf7fdf
Merge pull request #11806 from rapidsai/branch-22.10
GPUtester Sep 28, 2022
97353fc
Merge pull request #11809 from rapidsai/branch-22.10
GPUtester Sep 28, 2022
69a031c
Merge pull request #11810 from rapidsai/branch-22.10
GPUtester Sep 28, 2022
90afe92
Merge pull request #11820 from rapidsai/branch-22.10
GPUtester Sep 29, 2022
ec4cdd8
Fix compile warning from CUDF_FUNC_RANGE in a member function (#11798)
davidwendt Sep 29, 2022
87d0387
Merge pull request #11821 from rapidsai/branch-22.10
GPUtester Sep 29, 2022
0ecbaa1
Merge pull request #11823 from rapidsai/branch-22.10
GPUtester Sep 29, 2022
59ce915
Merge pull request #11829 from rapidsai/branch-22.10
GPUtester Sep 29, 2022
c8c9027
Merge pull request #11830 from rapidsai/branch-22.10
GPUtester Sep 29, 2022
3c9f9cf
Merge pull request #11831 from rapidsai/branch-22.10
GPUtester Sep 29, 2022
cb81ebc
Merge pull request #11832 from rapidsai/branch-22.10
GPUtester Sep 29, 2022
71167d7
Merge pull request #11839 from rapidsai/branch-22.10
GPUtester Sep 30, 2022
8df6dbf
Merge pull request #11851 from rapidsai/branch-22.10
GPUtester Oct 3, 2022
5000e94
Merge pull request #11852 from rapidsai/branch-22.10
GPUtester Oct 3, 2022
0b28d34
Remove `cudf_io` namespace alias (#11827)
vuule Oct 3, 2022
ba0febe
Test/remove thrust vector usage (#11813)
vyasr Oct 4, 2022
5e42c2d
Use conda-forge's `pyorc` (#11855)
jakirkham Oct 4, 2022
7d173c9
Update cudf JNI version to 22.12.0-SNAPSHOT (#11764)
pxLi Oct 4, 2022
0fb4d76
Remove unused includes for table/row_operators (#11857)
Oct 4, 2022
0d38a78
Merge pull request #11866 from rapidsai/branch-22.10
GPUtester Oct 5, 2022
001aede
JNI Avoid NPE for reading host binary data (#11865)
revans2 Oct 5, 2022
6d18543
Unpin `dask` and `distributed` for development (#11859)
galipremsagar Oct 5, 2022
4525474
Parquet reader: bug fix for a num_rows/skip_rows corner case, w/optim…
nvdbaranec Oct 5, 2022
029b1db
Fix RangeIndex unary operators. (#11868)
vyasr Oct 6, 2022
e323f0a
Fix make_column_from_scalar for all-null strings column (#11807)
davidwendt Oct 6, 2022
1ef722d
Fix decimal benchmark input data generation (#11863)
karthikeyann Oct 6, 2022
e20eb94
part1: Simplify BaseIndex to an abstract class (#10389)
skirui-source Oct 6, 2022
eb0e4b6
Merge pull request #11876 from rapidsai/branch-22.10
GPUtester Oct 7, 2022
4c4acd5
Add BGZIP reader to python `read_text` (#11802)
upsj Oct 7, 2022
fc5b675
Merge pull request #11881 from rapidsai/branch-22.10
GPUtester Oct 8, 2022
4eb9c6c
Add BGZIP multibyte_split benchmark (#11723)
upsj Oct 10, 2022
586907b
Fix pre-commit copyright check (#11860)
galipremsagar Oct 10, 2022
5b51591
Remove "experimental" warning for struct columns in ORC reader and wr…
vuule Oct 10, 2022
26f3e76
ArrowIPCTableWriter writes an empty batch in the case of an empty tab…
firestarman Oct 11, 2022
566b3d1
Conform "bench_isin" to match generator column names (#11549)
Oct 11, 2022
9ba6142
Use public APIs in STREAM_COMPACTION_NVBENCH (#11892)
Oct 11, 2022
a921f5d
Error on `ListColumn` or any new unsupported column in `cudf.Index` (…
galipremsagar Oct 11, 2022
7032cc3
Add coverage for string UDF tests. (#11891)
vyasr Oct 11, 2022
387192c
Add ngroup (#11871)
shwina Oct 11, 2022
ccbd852
Change expect_strings_empty into expect_column_empty libcudf test uti…
davidwendt Oct 12, 2022
75a6973
Relax `codecov` threshold diff (#11899)
galipremsagar Oct 12, 2022
8b5ab23
Fix memcheck error in TypeInference.Timestamp gtest (#11905)
davidwendt Oct 12, 2022
3226859
Fix memcheck error in get_dremel_data (#11903)
davidwendt Oct 12, 2022
0ca68c7
Add thrust output iterator fix (1805) to thrust.patch (#11900)
davidwendt Oct 13, 2022
678946b
Fix segmented-sort to ignore indices outside the offsets (#11888)
davidwendt Oct 13, 2022
fb0922f
Fix an issue reading struct-of-list types in Parquet. (#11910)
nvdbaranec Oct 13, 2022
662f309
Fixes Unsupported column type error due to empty list columns in Nest…
karthikeyann Oct 13, 2022
c824fee
Add clear indication of non-GPU accelerated parameters in read_json d…
Oct 13, 2022
e91d7d9
Reduce memory usage in nested JSON parser - tree generation (#11864)
karthikeyann Oct 14, 2022
8a31e26
Fix local offset handling in bgzip reader (#11918)
upsj Oct 14, 2022
7598253
Add libcudf strings examples (#11849)
davidwendt Oct 14, 2022
c265c58
Fix cudf::stable_sorted_order for NaN and -NaN in FLOAT64 columns (#1…
davidwendt Oct 14, 2022
9f8b936
Handle `multibyte_split` byte_range out-of-bounds offsets on host (#1…
upsj Oct 15, 2022
edc058f
Add `nanosecond` & `microsecond` to `DatetimeProperties` (#11911)
galipremsagar Oct 17, 2022
afa16b4
Fix documentation referring to removed as_gpu_matrix method. (#11937)
bdice Oct 18, 2022
a926c52
Add `.str.find_multiple` API (#11928)
galipremsagar Oct 18, 2022
cea10ca
Pin mimesis version in setup.py. (#11906)
bdice Oct 18, 2022
1effe19
Removing int8 column option from parquet byte_array writing (#11539)
hyperbolic2346 Oct 18, 2022
5d57159
Initial draft of policies and guidelines for libcudf usage. (#11853)
vyasr Oct 18, 2022
425fb02
Update flake8 to 5.0.4 and use flake8-force to check Cython. (#11736)
bdice Oct 18, 2022
6ca2ceb
Adds retryCount to RmmEventHandler.onAllocFailure (#11940)
abellina Oct 18, 2022
08e4ec2
Refactor pad/zfill functions for reuse with strings udf (#11914)
davidwendt Oct 19, 2022
08ffecc
Fix some gtests incorrectly coded in namespace cudf::test (part I) (#…
davidwendt Oct 19, 2022
416d4d5
Enable backend dispatching for Dask-DataFrame creation (#11920)
rjzamora Oct 20, 2022
ff41841
Remove validation that requires introspection (#11938)
vyasr Oct 20, 2022
536ddd0
Tell jitify_preprocess where to search for libnvrtc (#11787)
robertmaynard Oct 20, 2022
98185fe
Fix writing of Parquet files with many fragments (#11869)
etseidl Oct 20, 2022
ee9ffd0
Default to equal NaNs in make_collect_set_aggregation. (#11621)
bdice Oct 20, 2022
5803015
Rename libcudf++ to libcudf. (#11953)
bdice Oct 20, 2022
b9ba9e3
Update Unit Testing in libcudf guidelines to code tests outside the c…
davidwendt Oct 21, 2022
dec8bde
Add tests ensuring that cudf's default stream is always used (#11875)
vyasr Oct 21, 2022
9c06330
Accept const refs instead of const unique_ptr refs in reduce and scan…
vyasr Oct 21, 2022
7940b5b
Fix maximum page size estimate in Parquet writer (#11962)
vuule Oct 21, 2022
f1ab5e9
add V2 page header support to parquet reader (#11778)
etseidl Oct 21, 2022
5c2150e
Default to equal NaNs in make_merge_sets_aggregation. (#11952)
bdice Oct 21, 2022
5a190b9
Switch over to rapids-cmake patches for thrust (#11921)
robertmaynard Oct 24, 2022
4c0f2fd
Fix lists and structs gtests coded in namespace cudf::test (#11956)
davidwendt Oct 24, 2022
c806b10
Use gather-based strings factory in cudf::strings::strip (#11954)
davidwendt Oct 24, 2022
1e93af8
Add gpu memory watermark apis to JNI (#11950)
abellina Oct 24, 2022
11918ae
Add dtype docs pages and docstrings for `cudf` specific dtypes (#11974)
galipremsagar Oct 24, 2022
2ee41d0
Replace most of preprocessor usage in nvcomp adapter with `constexpr`…
vuule Oct 25, 2022
dc5924c
Add pool memory resource to libcudf basic example (#11966)
davidwendt Oct 25, 2022
2d89f43
Add missing noexcepts to column_in_metadata methods (#11973)
vyasr Oct 25, 2022
285cb9e
Replace default_stream_value with get_default_stream in docs. (#11985)
vyasr Oct 25, 2022
a37f27b
Ensure better compiler cache results between cudf cal-ver branches (#…
robertmaynard Oct 25, 2022
ffd130a
Remove stale labeler (#11995)
raydouglass Oct 25, 2022
6a5c77b
Minor cleanup of root CMakeLists.txt for better organization (#11988)
robertmaynard Oct 25, 2022
5bfc9a4
Move protobuf compilation to CMake (#11986)
vyasr Oct 25, 2022
6b9c026
Use rapids-cmake for google benchmark. (#11997)
vyasr Oct 25, 2022
b7d0115
Switch to DISABLE_DEPRECATION_WARNINGS to match other RAPIDS projects…
robertmaynard Oct 25, 2022
b89c0e2
Add inplace arithmetic operators to `MaskedType` (#11987)
brandon-b-miller Oct 26, 2022
c146d21
Revert "Replace most of preprocessor usage in nvcomp adapter with `co…
vuule Oct 26, 2022
fac35b4
Fix some libcudf calls to cudf::detail::gather (#11963)
davidwendt Oct 26, 2022
72572a8
Determine if Arrow has S3 support at runtime in unit test. (#11560)
bdice Oct 26, 2022
07eb723
Feature/remove default streams (#11967)
vyasr Oct 26, 2022
646a7e3
Fix doxygen text for cudf::dictionary::encode (#11991)
davidwendt Oct 26, 2022
cd21ce7
Remove unnecessary code from dask-cudf _Frame (#12001)
rjzamora Oct 27, 2022
8d49db5
Ignore python docs build artifacts (#12000)
galipremsagar Oct 27, 2022
b4ca894
Add `strip_delimiters` option to `read_text` (#11946)
upsj Oct 27, 2022
43eb7a0
Refactor multibyte_split `output_builder` (#11945)
upsj Oct 27, 2022
bac2004
Add pivot_table and crosstab to docs. (#12014)
bdice Oct 27, 2022
1b1ca7c
Provide `data_chunk_source` wrapper for `datasource` (#11886)
upsj Oct 27, 2022
f17ea94
Fix bug where `df.loc` resulting in single row could give wrong index…
eriknw Oct 27, 2022
69fac8a
Remove unused `managed_allocator` (#12005)
vyasr Oct 27, 2022
1017045
Add DataFrame.pivot_table. (#12015)
bdice Oct 28, 2022
ee53458
New GHA to add issues/prs to project board (#12016)
jarmak-nv Oct 28, 2022
c915523
Add deprecation warning for set_allocator. (#11958)
vyasr Oct 28, 2022
aaf251d
Performance improvement in JSON Tree traversal (#11919)
karthikeyann Oct 28, 2022
7620fb1
Add method argument to DataFrame.quantile (#11957)
rjzamora Oct 28, 2022
0603167
Add cython-lint to pre-commit checks. (#12020)
bdice Oct 28, 2022
1c057bc
Use pragma once (#12019)
bdice Oct 31, 2022
f0b4c4f
Pass column names to `write_csv` instead of `table_metadata` pointer …
vuule Oct 31, 2022
a5aaa52
Remove default parameters for cudf::dictionary::detail functions (#12…
davidwendt Nov 1, 2022
991c86b
Remove default parameters for nvtext::detail functions (#12007)
davidwendt Nov 1, 2022
7af461c
Update cuda-python dependency to 11.7.1 (#12030)
galipremsagar Nov 1, 2022
d236779
Reduce/Remove reliance on `**kwargs` and `*args` in `IO` readers & wr…
galipremsagar Nov 1, 2022
41fca6e
Add `read_orc_metadata` to libcudf (#11815)
vuule Nov 1, 2022
2fe06bc
Leverage rapids_cython for more automated RPATH handling (#11996)
vyasr Nov 1, 2022
80c238c
Fix black exclusions. (#12036)
bdice Nov 1, 2022
f19bdbc
Remove smart quotes from all docstrings. (#12035)
bdice Nov 1, 2022
f3bf872
Merge branch 'branch-22.10' into branch-22.12-merge-22.10
vyasr Nov 1, 2022
1c2ad6a
Fix Parquet support for seconds and milliseconds duration types (#11854)
vuule Nov 1, 2022
c04dbef
Merge pull request #12045 from vyasr/branch-22.12-merge-22.10
msadang Nov 1, 2022
ac3f205
Port thrust's pinned_allocator to cudf, since Thrust 1.17 removes the…
robertmaynard Nov 1, 2022
03034af
Standardize newlines at ends of files. (#12042)
bdice Nov 1, 2022
a20bbfb
Trim trailing whitespace from all files. (#12041)
bdice Nov 2, 2022
5ace809
Add strings udf C++ classes and functions for phase II (#11912)
davidwendt Nov 2, 2022
d6a9e4a
Rollback of `DeviceBufferLike` (#12009)
madsbk Nov 2, 2022
a3d2276
Fixes bug in csv_reader_options construction in cython (#12021)
karthikeyann Nov 2, 2022
49fc3c7
Enable CEC for `strings_udf` (#11884)
brandon-b-miller Nov 2, 2022
856ac3f
Add full page indexes to Parquet writer benchmarks (#11955)
etseidl Nov 2, 2022
d949cd2
Make all `nvcc` warnings into errors (#8916)
trxcllnt Nov 2, 2022
eaa0706
Add developer docs for writing tests (#11199)
vyasr Nov 3, 2022
e402448
Trim quotes for non-string values in nested json parsing (#11898)
karthikeyann Nov 3, 2022
baa645d
Add strings `like` jni and native method (#12032)
cindyyuanjiang Nov 3, 2022
b156c25
Add `memory_usage` & `items` implementation for `Struct` column & dty…
galipremsagar Nov 3, 2022
2a58ff6
Force using old fmt in nvbench. (#12067)
vyasr Nov 4, 2022
1d6931a
Allow falling back to `shim_60.ptx` by default in `strings_udf` (#12056)
brandon-b-miller Nov 4, 2022
0278485
Remove default parameters for cudf::strings::detail functions (#12003)
davidwendt Nov 4, 2022
b1c2520
Remove overflow error during decimal binops (#12063)
galipremsagar Nov 4, 2022
e788f36
Fixes List offset bug in Nested JSON reader (#12060)
karthikeyann Nov 4, 2022
a3e9c1c
Mark nvcomp zstd compression stable (#12059)
jbrennan333 Nov 4, 2022
6e13139
Add debug-only onAllocated/onDeallocated to RmmEventHandler (#12054)
abellina Nov 4, 2022
9df2eba
Adding feature Truncate to DataFrame and Series (#11435)
VamsiTallam95 Nov 4, 2022
11b875b
Fix type casting in Series.__setitem__ (#11904)
wence- Nov 4, 2022
52dbb63
Fix link to c++ developer guide from `CONTRIBUTING.md` (#12084)
brandon-b-miller Nov 7, 2022
262631b
Fix ingest_raw_data performance issue in Nested JSON reader due to RV…
karthikeyann Nov 7, 2022
17b6b2e
Add checks for HLG layers in dask-cudf groupby tests (#10853)
charlesbluca Nov 7, 2022
f9a2512
Fix quantile gtests coded in namespace cudf::test (#12049)
davidwendt Nov 7, 2022
a72627a
Throw an error when libcudf is built without cuFile and `LIBCUDF_CUFI…
vuule Nov 7, 2022
ec46e7f
Move and update `dask` nightly install in CI (#12082)
galipremsagar Nov 7, 2022
2ced214
Use nosync policy in gather and scatter implementations. (#12038)
bdice Nov 7, 2022
b16b4ff
Remove macros that inspect the contents of exceptions (#12076)
vyasr Nov 8, 2022
35077f5
Enable returning string data from UDFs used through `apply` (#11933)
brandon-b-miller Nov 8, 2022
c900fed
Bifurcate Dependency Lists [skip-gpuci] (#11674)
bdice Nov 8, 2022
8ee5f51
Enable building against the libarrow contained in pyarrow (#12034)
vyasr Nov 8, 2022
7535f31
Remove CUDA 10 compatibility code. (#12088)
bdice Nov 8, 2022
628cd4f
Change cudf::detail::tdigest to cudf::tdigest::detail (#12050)
davidwendt Nov 9, 2022
74053f4
Add regex_program class for use with all regex APIs (#11927)
davidwendt Nov 9, 2022
a2c428c
Fix an error in IO with `GzipFile` type (#12085)
galipremsagar Nov 9, 2022
26d449c
Update Numba docs links. (#12107)
bdice Nov 9, 2022
fbac4b4
Add `truncate` API to python doc pages (#12109)
galipremsagar Nov 9, 2022
6f78e74
Expose engine argument in dask_cudf.read_json (#12101)
rjzamora Nov 9, 2022
4de279d
Fix reading of CSV files with blank second row (#12098)
vuule Nov 9, 2022
59bd5c3
Support `strip`, `lstrip`, and `rstrip` in `strings_udf` (#12091)
brandon-b-miller Nov 10, 2022
4497ed6
Workaround groupby aggregate thrust::copy_if overflow (#12079)
davidwendt Nov 10, 2022
8ca2bd9
First pass of `pd.read_orc` changes in tests (#12103)
galipremsagar Nov 10, 2022
b3429fb
Remove "Multi-GPU with Dask-cuDF" notebook. (#12095)
bdice Nov 10, 2022
b30664b
Fix conditional_full_join benchmark (#12121)
Nov 10, 2022
7f2a471
Fix regex working-memory-size refactor error (#12119)
davidwendt Nov 10, 2022
70c7b7a
Refactor Parquet reader (#12046)
ttnghia Nov 10, 2022
f87d2b4
Add symlinks to notebooks. (#12128)
bdice Nov 11, 2022
3894427
Add JNI for `substring` without 'end' parameter. (#12113)
firestarman Nov 11, 2022
d335aa3
Fix alignment of compressed blocks in ORC writer (#12077)
vuule Nov 11, 2022
8668752
Adds an EventHandler to Java MemoryBuffer to be invoked on close (#12…
abellina Nov 11, 2022
825f049
Fix singleton-range `__setitem__` edge case (#12075)
wence- Nov 14, 2022
5081fb1
Enable automatic column projection in groupby().agg (#12124)
rjzamora Nov 14, 2022
b20a6e6
Add support for `DataFrame.from_dict`\`to_dict` and `Series.to_dict` …
galipremsagar Nov 14, 2022
b2e5069
Create an `int8` column in `read_csv` when all elements are missing (…
vuule Nov 15, 2022
fd488cd
Cleanup common parsing code in JSON, CSV reader (#12022)
karthikeyann Nov 15, 2022
bae9e39
Fix/disable jitify lto (#12122)
robertmaynard Nov 15, 2022
186e129
Add in negative size checks for columns (#12118)
revans2 Nov 15, 2022
4b7f5a7
Safely allocate `udf_string` pointers in `strings_udf` (#12138)
brandon-b-miller Nov 15, 2022
98880d2
Update cp.clip call (#12148)
quasiben Nov 15, 2022
90f0a77
Accelerate libcudf segmented sort with CUB segmented sort (#11969)
davidwendt Nov 15, 2022
414140b
check number of rows on empty data
vuule Nov 16, 2022
c574ddf
Fix decimal binary operations (#12142)
galipremsagar Nov 16, 2022
a8c0f4b
Fix type promotion edge cases in numerical binops (#12074)
wence- Nov 16, 2022
38235de
pin dask
galipremsagar Nov 16, 2022
7adf229
Update build.sh
galipremsagar Nov 16, 2022
742093e
Support `+` in `strings_udf` (#12117)
brandon-b-miller Nov 16, 2022
6ad5752
Use rapidsai CODE_OF_CONDUCT.md (#12166)
bdice Nov 16, 2022
defad5e
byte_range support for JSON Lines format (#12017)
karthikeyann Nov 16, 2022
8d84f2d
Merge branch 'rapidsai:branch-22.12' into pin_dask
galipremsagar Nov 16, 2022
afb3c97
Support nested types as groupby keys in libcudf (#11792)
PointKernel Nov 16, 2022
95a348b
Spilling to host memory (#12106)
madsbk Nov 16, 2022
73d73a7
Refactor `purge_nonempty_nulls` (#12111)
ttnghia Nov 16, 2022
ae101cc
Don't rely on GNU find in headers_test.sh (#12164)
wence- Nov 16, 2022
ce97a54
Merge branch 'branch-22.12' of https://github.com/rapidsai/cudf into …
vuule Nov 17, 2022
6de2c4e
Fix issues when both `usecols` and `names` options are used in `read_…
vuule Nov 17, 2022
aa13b95
Support `upper` and `lower` in `strings_udf` (#12099)
brandon-b-miller Nov 17, 2022
2f2685f
Allow setting malloc heap size in string udfs (#12094)
brandon-b-miller Nov 17, 2022
db0d045
Ensure dlpack include is provided to cudf interop lib (#12139)
robertmaynard Nov 17, 2022
ec8888c
fix selection of original vs compressed blocks, padding
vuule Nov 18, 2022
e29ea84
style
vuule Nov 18, 2022
3fb09d1
Implement chunked Parquet reader (#11867)
ttnghia Nov 18, 2022
6d2a4f0
Add wheel builds (#12096)
vyasr Nov 18, 2022
cc4b4dd
Don't use CMake 3.25.0 as it has a show stopping FindCUDAToolkit bug …
robertmaynard Nov 18, 2022
30bc05c
Merge branch 'branch-22.12' of https://github.com/rapidsai/cudf into …
vuule Nov 18, 2022
cbd07a5
Merge branch-22.10 into branch-22.12
davidwendt Nov 18, 2022
3c94071
Merge pull request #12198 from davidwendt/branch-22.12-merge-22.10
ajschmidt8 Nov 18, 2022
a2f69e4
Reduce number of tests marked `spilling` (#12197)
madsbk Nov 18, 2022
782fba3
Implement JNI for chunked Parquet reader (#11961)
ttnghia Nov 18, 2022
c79c2d1
Merge branch 'branch-22.12' of https://github.com/rapidsai/cudf into …
vuule Nov 18, 2022
08c0c5a
comment
vuule Nov 18, 2022
9292b50
Merge branch 'branch-22.12' of https://github.com/rapidsai/cudf into …
vuule Nov 18, 2022
21ba312
Fix dask backend dispatch (#12203)
galipremsagar Nov 18, 2022
a8afc75
fix is_data_empty
vuule Nov 19, 2022
769dfbb
Merge pull request #12194 from vuule/bug-write_orc-compressission
jolorunyomi Nov 21, 2022
e670c10
remove assert; separate empty stripe and level
vuule Nov 21, 2022
cd6dff3
Workaround for CUB segmented-sort bug with boolean keys
davidwendt Nov 21, 2022
6756b02
Merge branch 'branch-22.12' of https://github.com/rapidsai/cudf into …
vuule Nov 22, 2022
f15080f
test
vuule Nov 22, 2022
49f983d
Merge pull request #12217 from davidwendt/bug-cub-segmented-sort
jolorunyomi Nov 22, 2022
ed35f67
Merge pull request #12160 from vuule/bug-read_orc-empty-map-column
jolorunyomi Nov 22, 2022
0c60819
Make dask pinning looser (#12231)
vyasr Nov 23, 2022
c83ff55
Fix include line for io/numpy.
vyasr Nov 28, 2022
eb27104
Merge pull request #12250 from vyasr/fix/io_numpy_link
AyodeAwe Nov 29, 2022
fc2ec42
merge
galipremsagar Dec 1, 2022
297911f
Pin to 2022.11.1
galipremsagar Dec 1, 2022
cbdefb8
Merge branch 'pin_dask' of https://github.com/galipremsagar/cudf into…
galipremsagar Dec 1, 2022
9cd9841
Merge pull request #12165 from galipremsagar/pin_dask
AyodeAwe Dec 2, 2022
f471bcc
update changelog
raydouglass Dec 8, 2022
4 changes: 2 additions & 2 deletions .github/labeler.yml
@@ -3,14 +3,14 @@
cuDF (Python):
- 'python/**'
- 'notebooks/**'

libcudf:
- 'cpp/**'

CMake:
- '**/CMakeLists.txt'
- '**/cmake/**'

cuDF (Java):
- 'java/**'

20 changes: 20 additions & 0 deletions .github/workflows/add_to_project.yml
@@ -0,0 +1,20 @@
+name: Add new issue/PR to project
+
+on:
+  issues:
+    types:
+      - opened
+
+  pull_request_target:
+    types:
+      - opened
+
+jobs:
+  add-to-project:
+    name: Add issue or PR to project
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/add-to-project@v0.3.0
+        with:
+          project-url: https://github.com/orgs/rapidsai/projects/51
+          github-token: ${{ secrets.ADD_TO_PROJECT_GITHUB_TOKEN }}
12 changes: 12 additions & 0 deletions .github/workflows/dependency-files.yml
@@ -0,0 +1,12 @@
+name: pr
+
+on:
+  pull_request:
+
+jobs:
+  checks:
+    secrets: inherit
+    uses: rapidsai/shared-action-workflows/.github/workflows/checks.yaml@main
+    with:
+      enable_check_size: false
+      enable_check_style: false
57 changes: 0 additions & 57 deletions .github/workflows/stale.yaml

This file was deleted.

77 changes: 77 additions & 0 deletions .github/workflows/wheels.yml
@@ -0,0 +1,77 @@
+name: cuDF wheels
+
+on:
+  workflow_call:
+    inputs:
+      versioneer-override:
+        type: string
+        default: ''
+      build-tag:
+        type: string
+        default: ''
+      branch:
+        required: true
+        type: string
+      date:
+        required: true
+        type: string
+      sha:
+        required: true
+        type: string
+      build-type:
+        type: string
+        default: nightly
+
+concurrency:
+  group: "cudf-${{ github.workflow }}-${{ github.ref }}"
+  cancel-in-progress: true
+
+jobs:
+  cudf-wheels:
+    uses: rapidsai/shared-action-workflows/.github/workflows/wheels-manylinux.yml@main
+    with:
+      repo: rapidsai/cudf
+
+      build-type: ${{ inputs.build-type }}
+      branch: ${{ inputs.branch }}
+      sha: ${{ inputs.sha }}
+      date: ${{ inputs.date }}
+
+      package-dir: python/cudf
+      package-name: cudf
+
+      python-package-versioneer-override: ${{ inputs.versioneer-override }}
+      python-package-build-tag: ${{ inputs.build-tag }}
+
+      skbuild-configure-options: "-DCUDF_BUILD_WHEELS=ON -DDETECT_CONDA_ENV=OFF"
+
+      test-extras: test
+
+      # Have to manually specify the cupy install location on arm.
+      # Have to also manually install tokenizers==0.10.2, which is the last tokenizers
+      # to have a binary aarch64 wheel available on PyPI
+      # Otherwise, the tokenizers sdist is used, which needs a Rust compiler
+      test-before-arm64: "pip install tokenizers==0.10.2 cupy-cuda11x -f https://pip.cupy.dev/aarch64"
+
+      test-unittest: "pytest -v -n 8 ./python/cudf/cudf/tests"
+    secrets: inherit
+  dask_cudf-wheel:
+    needs: cudf-wheels
+    uses: rapidsai/shared-action-workflows/.github/workflows/wheels-pure.yml@main
+    with:
+      repo: rapidsai/cudf
+
+      build-type: ${{ inputs.build-type }}
+      branch: ${{ inputs.branch }}
+      sha: ${{ inputs.sha }}
+      date: ${{ inputs.date }}
+
+      package-dir: python/dask_cudf
+      package-name: dask_cudf
+
+      python-package-versioneer-override: ${{ inputs.versioneer-override }}
+      python-package-build-tag: ${{ inputs.build-tag }}
+
+      test-extras: test
+      test-unittest: "pytest -v -n 8 ./python/dask_cudf/dask_cudf/tests"
+    secrets: inherit
6 changes: 5 additions & 1 deletion .gitignore
@@ -70,7 +70,6 @@ junit-cudf.xml
 test-results

 ## Patching
-*.diff
 *.orig
 *.rej

@@ -166,3 +165,8 @@ dask-worker-space/
 # Sphinx docs & build artifacts
 docs/cudf/source/api_docs/generated/*
 docs/cudf/source/api_docs/api/*
+docs/cudf/source/user_guide/example_output/*
+docs/cudf/source/user_guide/cudf.*Dtype.*.rst
+
+# cibuildwheel
+/wheelhouse
33 changes: 31 additions & 2 deletions .pre-commit-config.yaml
@@ -1,6 +1,19 @@
 # Copyright (c) 2019-2022, NVIDIA CORPORATION.

 repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.3.0
+    hooks:
+      - id: trailing-whitespace
+        exclude: |
+          (?x)^(
+            ^python/cudf/cudf/tests/data/subword_tokenizer_data/.*
+          )
+      - id: end-of-file-fixer
+        exclude: |
+          (?x)^(
+            ^python/cudf/cudf/tests/data/subword_tokenizer_data/.*
+          )
   - repo: https://github.com/PyCQA/isort
     rev: 5.10.1
     hooks:
@@ -18,12 +31,18 @@ repos:
         # Explicitly specify the pyproject.toml at the repo root, not per-project.
         args: ["--config", "pyproject.toml"]
   - repo: https://github.com/PyCQA/flake8
-    rev: 3.8.3
+    rev: 5.0.4
     hooks:
       - id: flake8
         args: ["--config=setup.cfg"]
-        files: python/.*\.(py|pyx|pxd)$
+        files: python/.*$
         types: [file]
+        types_or: [python, cython]
+        additional_dependencies: ["flake8-force"]
+  - repo: https://github.com/MarcoGorelli/cython-lint
+    rev: v0.1.10
+    hooks:
+      - id: cython-lint
   - repo: https://github.com/pre-commit/mirrors-mypy
     rev: 'v0.971'
     hooks:
@@ -46,6 +65,16 @@ repos:
       - id: clang-format
         types_or: [c, c++, cuda]
         args: ["-fallback-style=none", "-style=file", "-i"]
+  - repo: https://github.com/sirosen/texthooks
+    rev: 0.4.0
+    hooks:
+      - id: fix-smartquotes
+        exclude: |
+          (?x)^(
+            ^cpp/include/cudf_test/cxxopts.hpp|
+            ^python/cudf/cudf/tests/data/subword_tokenizer_data/.*|
+            ^python/cudf/cudf/tests/test_text.py
+          )
   - repo: local
     hooks:
       - id: no-deprecationwarning
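The `exclude` patterns added to `.pre-commit-config.yaml` use verbose mode (`(?x)`), where whitespace inside the pattern is ignored, so each block is a single alternation of anchored paths. A minimal sketch of which paths the `fix-smartquotes` hook would skip, assuming the whitespace-free form of the pattern is equivalent and approximating pre-commit's regex search with `grep -E`. The `is_excluded` helper and the sample paths are illustrative only and are not part of the repository tooling.

```bash
# Whitespace-free form of the fix-smartquotes exclude pattern from the diff.
pattern='^(^cpp/include/cudf_test/cxxopts.hpp|^python/cudf/cudf/tests/data/subword_tokenizer_data/.*|^python/cudf/cudf/tests/test_text.py)'

# Hypothetical helper: succeeds when the hook would skip the given path.
is_excluded() {
    printf '%s\n' "$1" | grep -Eq "${pattern}"
}

# Sample paths are illustrative only.
for path in \
    python/cudf/cudf/tests/test_text.py \
    python/cudf/cudf/tests/data/subword_tokenizer_data/vocab.txt \
    python/cudf/cudf/core/frame.py
do
    if is_excluded "${path}"; then
        echo "skip  ${path}"
    else
        echo "check ${path}"
    fi
done
```

Since the alternatives carry no trailing `$`, each one behaves as a prefix match: any path that merely starts with `python/cudf/cudf/tests/test_text.py` would be skipped as well.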
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,7 @@
+# cuDF 22.12.00 (Date TBD)
+
+Please see https://github.com/rapidsai/cudf/releases/tag/v22.12.00a for the latest changes to this development branch.
+
 # cuDF 22.10.00 (12 Oct 2022)

 ## 🚨 Breaking Changes
1 change: 0 additions & 1 deletion CODE_OF_CONDUCT.md

This file was deleted.

9 changes: 3 additions & 6 deletions CONTRIBUTING.md
@@ -99,13 +99,13 @@ cd $CUDF_HOME
 **Note:** Using a conda environment is the easiest way to satisfy the library's dependencies.
 Instructions for a minimal build environment without conda are included below.

-- Create the conda development environment `cudf_dev`:
+- Create the conda development environment:

 ```bash
 # create the conda environment (assuming in base `cudf` directory)
 # note: RAPIDS currently doesn't support `channel_priority: strict`;
 # use `channel_priority: flexible` instead
-conda env create --name cudf_dev --file conda/environments/cudf_dev_cuda11.5.yml
+conda env create --name cudf_dev --file conda/environments/all_cuda-115_arch-x86_64.yaml
 # activate the environment
 conda activate cudf_dev
 ```
@@ -114,9 +114,6 @@ conda activate cudf_dev
 development environment may also need to be updated if dependency versions or
 pinnings are changed.

-- For other CUDA versions, check the corresponding `cudf_dev_cuda*.yml` file in
-`conda/environments/`.
-
 #### Building without a conda environment

 - libcudf has the following minimal dependencies (in addition to those listed in the [General
@@ -382,7 +379,7 @@ You can skip these checks with `git commit --no-verify` or with the short versio

 ## Developer Guidelines

-The [C++ Developer Guide](cpp/docs/DEVELOPER_GUIDE.md) includes details on contributing to libcudf C++ code.
+The [C++ Developer Guide](cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md) includes details on contributing to libcudf C++ code.

 The [Python Developer Guide](https://docs.rapids.ai/api/cudf/stable/developer_guide/index.html) includes details on contributing to cuDF Python code.

2 changes: 1 addition & 1 deletion README.md
@@ -50,7 +50,7 @@ For additional examples, browse our complete [API documentation](https://docs.ra

 ## Quick Start

-Please see the [Demo Docker Repository](https://hub.docker.com/r/rapidsai/rapidsai/), choosing a tag based on the NVIDIA CUDA version youre running. This provides a ready to run Docker container with example notebooks and data, showcasing how you can utilize cuDF.
+Please see the [Demo Docker Repository](https://hub.docker.com/r/rapidsai/rapidsai/), choosing a tag based on the NVIDIA CUDA version you're running. This provides a ready to run Docker container with example notebooks and data, showcasing how you can utilize cuDF.

 ## Installation

6 changes: 3 additions & 3 deletions build.sh
@@ -64,7 +64,7 @@ BUILD_BENCHMARKS=OFF
 BUILD_ALL_GPU_ARCH=0
 BUILD_NVTX=ON
 BUILD_TESTS=OFF
-BUILD_DISABLE_DEPRECATION_WARNING=ON
+BUILD_DISABLE_DEPRECATION_WARNINGS=ON
 BUILD_PER_THREAD_DEFAULT_STREAM=OFF
 BUILD_REPORT_METRICS=OFF
 BUILD_REPORT_INCL_CACHE_STATS=OFF
@@ -216,7 +216,7 @@ if hasArg --opensource_nvcomp; then
 USE_PROPRIETARY_NVCOMP="OFF"
 fi
 if hasArg --show_depr_warn; then
-BUILD_DISABLE_DEPRECATION_WARNING=OFF
+BUILD_DISABLE_DEPRECATION_WARNINGS=OFF
 fi
 if hasArg --ptds; then
 BUILD_PER_THREAD_DEFAULT_STREAM=ON
@@ -285,7 +285,7 @@ if buildAll || hasArg libcudf; then
 -DCUDF_USE_PROPRIETARY_NVCOMP=${USE_PROPRIETARY_NVCOMP} \
 -DBUILD_TESTS=${BUILD_TESTS} \
 -DBUILD_BENCHMARKS=${BUILD_BENCHMARKS} \
-Ddisable-placeholder
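The rename from `DISABLE_DEPRECATION_WARNING` to `DISABLE_DEPRECATION_WARNINGS` has to stay in sync across three places in `build.sh`: the default value, the `--show_depr_warn` override, and the `-D` definition passed to CMake. A minimal sketch of that flow, assuming a simplified argument check; `has_arg` and `deprecation_flag` are hypothetical stand-ins, not the helpers that `build.sh` actually defines.

```bash
# Hypothetical helper: succeeds when flag "$1" appears among the remaining arguments.
has_arg() {
    local wanted="$1"
    shift
    local arg
    for arg in "$@"; do
        [ "${arg}" = "${wanted}" ] && return 0
    done
    return 1
}

# Hypothetical helper: prints the CMake definition a given command line would produce.
deprecation_flag() {
    local disable=ON                  # default: deprecation warnings are suppressed
    if has_arg --show_depr_warn "$@"; then
        disable=OFF                   # the user asked to see the warnings
    fi
    echo "-DDISABLE_DEPRECATION_WARNINGS=${disable}"
}

deprecation_flag libcudf
deprecation_flag libcudf --show_depr_warn
```

If only one of the three spellings were updated, the shell would silently expand the stale variable to an empty string, so a check like this is a cheap way to confirm the option still reaches CMake.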
6 changes: 3 additions & 3 deletions ci/benchmark/build.sh
@@ -37,7 +37,7 @@ export GBENCH_BENCHMARKS_DIR="$WORKSPACE/cpp/build/gbenchmarks/"
 export LIBCUDF_KERNEL_CACHE_PATH="$HOME/.jitify-cache"

 # Dask & Distributed option to install main(nightly) or `conda-forge` packages.
-export INSTALL_DASK_MAIN=0
+export INSTALL_DASK_MAIN=1

 # Dask version to install when `INSTALL_DASK_MAIN=0`
 export DASK_STABLE_VERSION="2022.9.2"
@@ -82,8 +82,8 @@ conda install "rmm=$MINOR_VERSION.*" "cudatoolkit=$CUDA_REL" \

 # Install the conda-forge or nightly version of dask and distributed
 if [[ "${INSTALL_DASK_MAIN}" == 1 ]]; then
-gpuci_logger "gpuci_mamba_retry update dask"
-gpuci_mamba_retry update dask
+gpuci_logger "gpuci_mamba_retry install -c dask/label/dev 'dask/label/dev::dask' 'dask/label/dev::distributed'"
+gpuci_mamba_retry install -c dask/label/dev "dask/label/dev::dask" "dask/label/dev::distributed"
 else
 gpuci_logger "gpuci_mamba_retry install conda-forge::dask=={$DASK_STABLE_VERSION} conda-forge::distributed=={$DASK_STABLE_VERSION} conda-forge::dask-core=={$DASK_STABLE_VERSION} --force-reinstall"
 gpuci_mamba_retry install conda-forge::dask=={$DASK_STABLE_VERSION} conda-forge::distributed=={$DASK_STABLE_VERSION} conda-forge::dask-core=={$DASK_STABLE_VERSION} --force-reinstall
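With `INSTALL_DASK_MAIN=1` the benchmark job now takes the nightly branch and installs `dask` and `distributed` from the `dask/label/dev` channel; the pinned `conda-forge` branch only runs when the switch is `0`. A minimal sketch of that selection, assuming a hypothetical `dask_install_args` helper that prints the package specs instead of calling `gpuci_mamba_retry`. The sketch writes the version as `${var}`; the last two lines show how the shell treats the `{$var}` spelling used in the script's `else` branch.

```bash
# Hypothetical helper: prints the install arguments for the chosen branch.
dask_install_args() {
    local install_main="$1"     # 1 = nightly from dask/label/dev, 0 = pinned stable
    local stable_version="$2"
    if [ "${install_main}" = "1" ]; then
        echo "-c dask/label/dev dask/label/dev::dask dask/label/dev::distributed"
    else
        echo "conda-forge::dask==${stable_version} conda-forge::distributed==${stable_version} conda-forge::dask-core==${stable_version} --force-reinstall"
    fi
}

dask_install_args 1 "2022.9.2"
dask_install_args 0 "2022.9.2"

# Brace placement matters: {$v} keeps the braces literally, ${v} does not.
v="2022.9.2"
echo "dask=={$v}"    # prints dask=={2022.9.2}
echo "dask==${v}"    # prints dask==2022.9.2
```

Printing the arguments first, as the script already does through `gpuci_logger`, makes it easy to see in the CI log exactly which spec the package manager received.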