[REVIEW] Optimizations for cudf.concat when axis=1 #9333

Merged
merged 64 commits into from
Oct 19, 2021

@galipremsagar (Contributor) commented Sep 29, 2021

Fixes: #9223, #9200, #9411

This PR:

  • Reduces memory pressure by avoiding index materialization for RangeIndex when axis=1.
  • Fixes the correctness of all axis=1 cases in cudf.concat, which enables stricter index type checks in the associated pytests.
  • Caches the distinct_count value of a Column in _distinct_count to improve performance.
  • Introduces Column._clear_cache, a single method that clears all cached values related to a Column.
  • Implements Index.union, Index.intersection and Index.has_duplicates.
  • Implements the is_numeric, is_boolean, is_integer, is_floating, is_object, is_categorical and is_interval APIs in Index.
  • Optimizes cudf.concat for axis=1 by using the changes above. Benchmarks:
------------------------------------------------------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs0]': 2 tests -------------------------------------------------------------------------------
Name (time in us)                                                 Min                   Max                  Mean              StdDev                Median                IQR            Outliers         OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs0] (THIS-PR)            209.9802 (1.0)      2,429.9941 (1.0)        222.9479 (1.0)       41.3467 (1.0)        224.5191 (1.0)      12.1914 (1.81)        12;32  4,485.3529 (1.0)        2985           1
test_concat_axis_1[False-inner-1-objs0] (branch-21.12)     1,807.7570 (8.61)     5,023.1239 (2.07)     1,868.9510 (8.38)     246.0487 (5.95)     1,830.1200 (8.15)      6.7296 (1.0)         20;74    535.0595 (0.12)        520           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs1]': 2 tests ----------------------------------------------------------------------
Name (time in ms)                                              Min                Max               Mean            StdDev             Median               IQR            Outliers      OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs1] (THIS-PR)          19.3856 (1.0)      25.1846 (1.0)      19.7466 (1.0)      0.9687 (13.33)    19.5381 (1.0)      0.2784 (6.09)          2;2  50.6416 (1.0)          50           1
test_concat_axis_1[False-inner-1-objs1] (branch-21.12)     30.7169 (1.58)     31.1239 (1.24)     30.7672 (1.56)     0.0727 (1.0)      30.7480 (1.57)     0.0457 (1.0)           2;1  32.5021 (0.64)         33           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs2]': 2 tests ----------------------------------------------------------------------
Name (time in ms)                                              Min                Max               Mean            StdDev             Median               IQR            Outliers      OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs2] (THIS-PR)          19.4794 (1.0)      20.0249 (1.0)      19.5933 (1.0)      0.1462 (1.0)      19.5117 (1.0)      0.1412 (1.07)         10;4  51.0378 (1.0)          51           1
test_concat_axis_1[False-inner-1-objs2] (branch-21.12)     30.8203 (1.58)     31.9644 (1.60)     30.9485 (1.58)     0.1959 (1.34)     30.9026 (1.58)     0.1319 (1.0)           1;1  32.3118 (0.63)         33           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs3]': 2 tests ----------------------------------------------------------------------
Name (time in ms)                                              Min                Max               Mean            StdDev             Median               IQR            Outliers       OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs3] (THIS-PR)           1.2168 (1.0)       3.3944 (1.0)       1.2505 (1.0)      0.0893 (1.0)       1.2349 (1.0)      0.0388 (1.0)         15;23  799.6555 (1.0)         707           1
test_concat_axis_1[False-inner-1-objs3] (branch-21.12)     44.4625 (36.54)    45.9180 (13.53)    45.1017 (36.07)    0.3472 (3.89)     45.1007 (36.52)    0.4618 (11.90)         7;0   22.1721 (0.03)         23           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs4]': 2 tests ------------------------------------------------------------------------
Name (time in ms)                                               Min                 Max                Mean            StdDev              Median               IQR            Outliers      OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs4] (branch-21.12)      95.7450 (1.0)       97.5205 (1.0)       96.5405 (1.0)      0.5931 (1.13)      96.5431 (1.0)      1.0256 (1.17)          4;0  10.3583 (1.0)          11           1
test_concat_axis_1[False-inner-1-objs4] (THIS-PR)          106.3069 (1.11)     107.8606 (1.11)     107.0745 (1.11)     0.5239 (1.0)      107.0633 (1.11)     0.8757 (1.0)           3;0   9.3393 (0.90)         10           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs5]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                               Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs5] (branch-21.12)     276.2022 (1.0)      278.3065 (1.0)      277.3080 (1.0)      0.9845 (1.0)      277.5305 (1.0)      1.8682 (1.03)          2;0  3.6061 (1.0)           5           1
test_concat_axis_1[False-inner-1-objs5] (THIS-PR)          304.1699 (1.10)     307.0704 (1.10)     305.4101 (1.10)     1.1629 (1.18)     305.2463 (1.10)     1.8148 (1.0)           2;0  3.2743 (0.91)          5           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs6]': 2 tests ------------------------------------------------------------------------------
Name (time in us)                                                 Min                   Max                  Mean             StdDev                Median                IQR            Outliers         OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs6] (THIS-PR)            554.7500 (1.0)        669.7820 (1.0)        566.2571 (1.0)      13.3221 (1.0)        561.7749 (1.0)       5.2570 (1.0)         85;94  1,765.9823 (1.0)         748           1
test_concat_axis_1[False-inner-1-objs6] (branch-21.12)     3,956.2921 (7.13)     4,395.6251 (6.56)     4,015.7610 (7.09)     66.9272 (5.02)     3,993.7040 (7.11)     76.8616 (14.62)        28;8    249.0188 (0.14)        241           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-inner-1-objs7]': 2 tests ----------------------------------------------------------------------
Name (time in ms)                                              Min                 Max               Mean            StdDev             Median               IQR            Outliers      OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-inner-1-objs7] (THIS-PR)          72.6492 (1.0)       74.1472 (1.0)      73.3672 (1.0)      0.4783 (1.0)      73.4728 (1.0)      0.7316 (1.0)           5;0  13.6301 (1.0)          14           1
test_concat_axis_1[False-inner-1-objs7] (branch-21.12)     98.6850 (1.36)     100.1399 (1.35)     99.5267 (1.36)     0.6551 (1.37)     99.9600 (1.36)     1.1940 (1.63)          4;0  10.0476 (0.74)         10           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
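
For context on the inner-join numbers above: the PR summary mentions avoiding index materialization for RangeIndex. A minimal sketch of that idea in plain Python follows, using the built-in `range` as a stand-in for a RangeIndex. The helpers `intersect_ranges` and `inner_join_index` are hypothetical names for illustration and are not cudf APIs; the sketch assumes step-1 ranges only and does not claim to mirror the actual cudf code path.

```python
from functools import reduce
from typing import Iterable


def intersect_ranges(a: range, b: range) -> range:
    """Intersect two step-1 ranges using only their bounds.

    Hypothetical helper: no element of either range is ever listed,
    so the cost is constant regardless of the range lengths.
    """
    if a.step != 1 or b.step != 1:
        raise ValueError("this sketch only handles step == 1")
    start = max(a.start, b.start)
    stop = min(a.stop, b.stop)
    # A non-overlapping pair collapses to an empty range.
    return range(start, max(start, stop))


def inner_join_index(indexes: Iterable[range]) -> range:
    """Row labels common to every input, as an inner join would keep."""
    return reduce(intersect_ranges, indexes)


# Usage: three large "indexes" are combined from their bounds alone.
common = inner_join_index(
    [range(0, 1_000_000), range(10, 2_000_000), range(0, 500)]
)
```

The result is itself a `range`, so downstream code can keep working with bounds instead of a fully allocated array of labels.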

-------------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs0]': 2 tests ---------------------------------------------------------------------------
Name (time in us)                                               Min                 Max                Mean            StdDev              Median                IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs0] (branch-21.12)     213.2710 (1.0)      275.2030 (1.0)      223.5803 (1.01)     7.3814 (1.15)     222.9400 (1.02)     12.9229 (5.86)       719;17        4.4727 (0.99)       2875           1
test_concat_axis_1[False-outer-1-objs0] (THIS-PR)          214.6652 (1.01)     290.9640 (1.06)     220.4459 (1.0)      6.4177 (1.0)      218.0159 (1.0)       2.2046 (1.0)       419;512        4.5363 (1.0)        2731           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs1]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                               Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs1] (THIS-PR)          140.9027 (1.0)      141.7782 (1.0)      141.4213 (1.0)      0.3324 (1.0)      141.4934 (1.0)      0.5372 (1.0)           4;0  7.0711 (1.0)           8           1
test_concat_axis_1[False-outer-1-objs1] (branch-21.12)     174.4978 (1.24)     175.9156 (1.24)     174.9014 (1.24)     0.5408 (1.63)     174.6700 (1.23)     0.5511 (1.03)          1;0  5.7175 (0.81)          6           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs2]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                               Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs2] (THIS-PR)          149.0907 (1.0)      151.3939 (1.0)      149.6573 (1.0)      0.8207 (5.56)     149.2920 (1.0)      0.6782 (3.85)          1;1  6.6819 (1.0)           7           1
test_concat_axis_1[False-outer-1-objs2] (branch-21.12)     183.9202 (1.23)     184.3218 (1.22)     184.0712 (1.23)     0.1477 (1.0)      184.0646 (1.23)     0.1760 (1.0)           2;0  5.4327 (0.81)          6           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs3]': 2 tests --------------------------------------------------------------------
Name (time in ms)                                             Min               Max              Mean            StdDev            Median               IQR            Outliers       OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs3] (THIS-PR)          1.1996 (1.0)      1.6017 (1.0)      1.2270 (1.00)     0.0297 (1.45)     1.2170 (1.0)      0.0374 (3.69)        29;13  815.0022 (1.00)        719           1
test_concat_axis_1[False-outer-1-objs3] (branch-21.12)     1.2096 (1.01)     1.6363 (1.02)     1.2259 (1.0)      0.0205 (1.0)      1.2199 (1.00)     0.0102 (1.0)        88;106  815.7473 (1.0)         762           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs4]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                               Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs4] (THIS-PR)          582.8973 (1.0)      586.0131 (1.0)      583.9782 (1.0)      1.2053 (1.0)      583.5081 (1.0)      1.2076 (1.0)           1;0  1.7124 (1.0)           5           1
test_concat_axis_1[False-outer-1-objs4] (branch-21.12)     785.9871 (1.35)     790.6360 (1.35)     787.4976 (1.35)     1.8293 (1.52)     786.8087 (1.35)     1.7791 (1.47)          1;0  1.2698 (0.74)          5           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs5]': 2 tests -------------------------------------------------------------------
Name (time in s)                                              Min               Max              Mean            StdDev            Median               IQR            Outliers     OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs5] (THIS-PR)          1.9260 (1.0)      1.9343 (1.0)      1.9299 (1.0)      0.0031 (1.0)      1.9299 (1.0)      0.0038 (1.0)           2;0  0.5182 (1.0)           5           1
test_concat_axis_1[False-outer-1-objs5] (branch-21.12)     2.1733 (1.13)     2.1830 (1.13)     2.1777 (1.13)     0.0039 (1.26)     2.1784 (1.13)     0.0058 (1.53)          2;0  0.4592 (0.89)          5           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs6]': 2 tests ---------------------------------------------------------------------------
Name (time in us)                                               Min                 Max                Mean             StdDev              Median                IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs6] (THIS-PR)          554.3760 (1.0)      632.9010 (1.02)     575.7529 (1.02)     16.1334 (2.40)     566.0525 (1.00)     31.2359 (7.54)        545;0        1.7369 (0.98)       1442           1
test_concat_axis_1[False-outer-1-objs6] (branch-21.12)     556.5900 (1.00)     622.5759 (1.0)      566.3433 (1.0)       6.7226 (1.0)      564.7328 (1.0)       4.1408 (1.0)        114;89        1.7657 (1.0)        1497           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[False-outer-1-objs7]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                               Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[False-outer-1-objs7] (THIS-PR)          596.3256 (1.0)      600.5619 (1.0)      597.9632 (1.0)      1.6437 (1.0)      597.7408 (1.0)      2.1454 (1.0)           1;0  1.6723 (1.0)           5           1
test_concat_axis_1[False-outer-1-objs7] (branch-21.12)     654.1722 (1.10)     666.8746 (1.11)     657.2377 (1.10)     5.4777 (3.33)     654.3422 (1.09)     4.8897 (2.28)          1;1  1.5215 (0.91)          5           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
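
The caching change listed in the PR summary (a cached `_distinct_count` plus a single `_clear_cache` method) can be illustrated with a small, self-contained sketch. `CachedColumn` is a hypothetical stand-in written in plain Python, not cudf's `Column`; only the attribute name `_distinct_count` and the method name `_clear_cache` are taken from the summary, and the `computations` counter exists purely to make the effect observable.

```python
from typing import Hashable, Iterable, List, Optional


class CachedColumn:
    """Hypothetical column that memoizes its distinct count."""

    def __init__(self, values: Iterable[Hashable]) -> None:
        self._values: List[Hashable] = list(values)
        self._distinct_count: Optional[int] = None  # cache slot
        self.computations = 0  # how often the count was really computed

    def distinct_count(self) -> int:
        # Compute once, then serve the cached value on later calls.
        if self._distinct_count is None:
            self.computations += 1
            self._distinct_count = len(set(self._values))
        return self._distinct_count

    def _clear_cache(self) -> None:
        # Single place that resets every cached value on the column.
        self._distinct_count = None

    def append(self, value: Hashable) -> None:
        # Any mutation must invalidate the cache to stay correct.
        self._values.append(value)
        self._clear_cache()


col = CachedColumn([1, 2, 2, 3])
first = col.distinct_count()
second = col.distinct_count()  # served from the cache
col.append(4)
third = col.distinct_count()   # recomputed after invalidation
```

Routing all invalidation through one method keeps mutating operations from having to know which individual cached values exist.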

------------------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs0]': 2 tests -------------------------------------------------------------------------------
Name (time in us)                                                Min                   Max                  Mean              StdDev                Median                 IQR            Outliers         OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs0] (THIS-PR)            222.4192 (1.0)        312.4340 (1.0)        233.9587 (1.0)       12.3266 (1.0)        226.9410 (1.0)       17.2716 (1.0)        150;17  4,274.2589 (1.0)         896           1
test_concat_axis_1[True-inner-1-objs0] (branch-21.12)     1,831.1338 (8.23)     5,528.5210 (17.70)    2,174.9929 (9.30)     411.6862 (33.40)    2,110.6380 (9.30)     890.1195 (51.54)        77;1    459.7716 (0.11)        293           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs1]': 2 tests ---------------------------------------------------------------------
Name (time in ms)                                             Min                Max               Mean            StdDev             Median               IQR            Outliers      OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs1] (THIS-PR)          19.3491 (1.0)      23.9291 (1.0)      20.4857 (1.0)      1.4031 (13.72)    19.5300 (1.0)      2.5649 (19.47)        14;0  48.8145 (1.0)          40           1
test_concat_axis_1[True-inner-1-objs1] (branch-21.12)     30.9140 (1.60)     31.3545 (1.31)     31.0313 (1.51)     0.1023 (1.0)      31.0049 (1.59)     0.1318 (1.0)           6;1  32.2255 (0.66)         30           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs2]': 2 tests ---------------------------------------------------------------------
Name (time in ms)                                             Min                Max               Mean            StdDev             Median               IQR            Outliers      OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs2] (THIS-PR)          19.3977 (1.0)      22.6105 (1.0)      19.6793 (1.0)      0.6127 (1.0)      19.5005 (1.0)      0.2517 (1.0)           3;3  50.8148 (1.0)          49           1
test_concat_axis_1[True-inner-1-objs2] (branch-21.12)     31.0002 (1.60)     37.2946 (1.65)     31.4314 (1.60)     1.1519 (1.88)     31.1185 (1.60)     0.2629 (1.04)          2;3  31.8153 (0.63)         32           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs3]': 2 tests ----------------------------------------------------------------------
Name (time in ms)                                             Min                Max               Mean            StdDev             Median               IQR            Outliers       OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs3] (THIS-PR)           1.2086 (1.0)       3.2895 (1.0)       1.2670 (1.0)      0.0809 (1.0)       1.2712 (1.0)      0.0247 (1.0)          6;42  789.2781 (1.0)         685           1
test_concat_axis_1[True-inner-1-objs3] (branch-21.12)     44.0268 (36.43)    45.0905 (13.71)    44.4070 (35.05)    0.2370 (2.93)     44.3967 (34.92)    0.2955 (11.95)         6;1   22.5190 (0.03)         24           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs4]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                              Min                 Max                Mean            StdDev              Median               IQR            Outliers      OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs4] (branch-21.12)      94.6051 (1.0)       96.7158 (1.0)       95.3382 (1.0)      0.5723 (1.59)      95.1666 (1.0)      0.5416 (1.0)           3;1  10.4890 (1.0)          11           1
test_concat_axis_1[True-inner-1-objs4] (THIS-PR)          104.9262 (1.11)     105.8423 (1.09)     105.3436 (1.10)     0.3590 (1.0)      105.2455 (1.11)     0.5744 (1.06)          2;0   9.4927 (0.91)          6           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs5]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                              Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs5] (branch-21.12)     273.0914 (1.0)      273.9324 (1.0)      273.4240 (1.0)      0.3789 (1.0)      273.1949 (1.0)      0.6226 (1.0)           1;0  3.6573 (1.0)           5           1
test_concat_axis_1[True-inner-1-objs5] (THIS-PR)          298.2814 (1.09)     300.4248 (1.10)     299.5427 (1.10)     0.8678 (2.29)     299.7728 (1.10)     1.3431 (2.16)          2;0  3.3384 (0.91)          5           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs6]': 2 tests -------------------------------------------------------------------------------
Name (time in us)                                                Min                   Max                  Mean              StdDev                Median                 IQR            Outliers         OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs6] (THIS-PR)            560.6860 (1.0)        664.2400 (1.0)        586.7618 (1.0)       17.0820 (1.0)        596.3098 (1.0)       31.9778 (1.0)         605;3  1,704.2692 (1.0)        1399           1
test_concat_axis_1[True-inner-1-objs6] (branch-21.12)     3,963.3820 (7.07)     7,186.5108 (10.82)    4,081.5076 (6.96)     322.1541 (18.86)    4,015.4392 (6.73)     120.8268 (3.78)          5;9    245.0075 (0.14)        229           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------ benchmark 'bench_concat.py::test_concat_axis_1[True-inner-1-objs7]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                              Min                 Max                Mean            StdDev              Median               IQR            Outliers      OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-inner-1-objs7] (THIS-PR)           72.7404 (1.0)       74.9232 (1.0)       73.5822 (1.0)      0.7077 (4.98)      73.7620 (1.0)      1.0375 (9.03)          5;0  13.5903 (1.0)          13           1
test_concat_axis_1[True-inner-1-objs7] (branch-21.12)     100.0205 (1.38)     100.4437 (1.34)     100.1622 (1.36)     0.1422 (1.0)      100.1149 (1.36)     0.1149 (1.0)           2;2   9.9838 (0.73)         10           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs0]': 2 tests ----------------------------------------------------------------------------
Name (time in us)                                              Min                   Max                Mean             StdDev              Median                IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs0] (THIS-PR)          227.8399 (1.0)      3,832.8250 (13.42)    250.7014 (1.05)     70.9327 (17.37)    255.2045 (1.07)     23.8101 (10.25)         5;8        3.9888 (0.96)       2684           1
test_concat_axis_1[True-outer-1-objs0] (branch-21.12)     235.2530 (1.03)       285.5239 (1.0)      239.8447 (1.0)       4.0831 (1.0)      238.6939 (1.0)       2.3230 (1.0)       243;256        4.1694 (1.0)        2670           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs1]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                              Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs1] (THIS-PR)          141.7261 (1.0)      145.9042 (1.0)      142.7198 (1.0)      1.7906 (18.95)    141.9498 (1.0)      1.3718 (10.05)         1;1  7.0067 (1.0)           5           1
test_concat_axis_1[True-outer-1-objs1] (branch-21.12)     175.0254 (1.23)     175.2591 (1.20)     175.1198 (1.23)     0.0945 (1.0)      175.0752 (1.23)     0.1364 (1.0)           1;0  5.7104 (0.81)          5           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs2]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                              Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs2] (THIS-PR)          149.5332 (1.0)      150.3494 (1.0)      149.9293 (1.0)      0.2652 (1.0)      149.8476 (1.0)      0.3105 (1.0)           2;0  6.6698 (1.0)           7           1
test_concat_axis_1[True-outer-1-objs2] (branch-21.12)     183.8074 (1.23)     184.6288 (1.23)     184.2467 (1.23)     0.3398 (1.28)     184.2170 (1.23)     0.5747 (1.85)          3;0  5.4275 (0.81)          6           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs3]': 2 tests --------------------------------------------------------------------
Name (time in ms)                                            Min               Max              Mean            StdDev            Median               IQR            Outliers       OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs3] (THIS-PR)          1.2082 (1.0)      1.9830 (1.44)     1.2325 (1.0)      0.0367 (2.09)     1.2202 (1.0)      0.0377 (1.70)         21;5  811.3756 (1.0)         696           1
test_concat_axis_1[True-outer-1-objs3] (branch-21.12)     1.2231 (1.01)     1.3767 (1.0)      1.2394 (1.01)     0.0176 (1.0)      1.2321 (1.01)     0.0221 (1.0)        160;12  806.8524 (0.99)        727           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs4]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                              Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs4] (THIS-PR)          574.2238 (1.0)      576.4085 (1.0)      575.5754 (1.0)      0.8308 (1.0)      575.7421 (1.0)      0.9577 (1.0)           2;0  1.7374 (1.0)           5           1
test_concat_axis_1[True-outer-1-objs4] (branch-21.12)     770.7027 (1.34)     772.6688 (1.34)     771.6322 (1.34)     0.9549 (1.15)     771.0687 (1.34)     1.6949 (1.77)          2;0  1.2960 (0.75)          5           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs5]': 2 tests -------------------------------------------------------------------
Name (time in s)                                             Min               Max              Mean            StdDev            Median               IQR            Outliers     OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs5] (THIS-PR)          1.9025 (1.0)      1.9095 (1.0)      1.9074 (1.0)      0.0028 (1.0)      1.9082 (1.0)      0.0023 (1.0)           1;1  0.5243 (1.0)           5           1
test_concat_axis_1[True-outer-1-objs5] (branch-21.12)     2.1330 (1.12)     2.1428 (1.12)     2.1374 (1.12)     0.0039 (1.42)     2.1375 (1.12)     0.0062 (2.75)          2;0  0.4679 (0.89)          5           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs6]': 2 tests --------------------------------------------------------------------------
Name (time in us)                                              Min                 Max                Mean             StdDev              Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs6] (branch-21.12)     558.6550 (1.0)      641.9669 (1.0)      570.2701 (1.0)      11.1347 (1.0)      566.8140 (1.0)      5.0180 (1.0)       141;153        1.7536 (1.0)        1498           1
test_concat_axis_1[True-outer-1-objs6] (THIS-PR)          563.2618 (1.01)     663.0530 (1.03)     594.9855 (1.04)     15.4747 (1.39)     600.2941 (1.06)     8.7381 (1.74)      399;373        1.6807 (0.96)       1394           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------- benchmark 'bench_concat.py::test_concat_axis_1[True-outer-1-objs7]': 2 tests -----------------------------------------------------------------------
Name (time in ms)                                              Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_concat_axis_1[True-outer-1-objs7] (THIS-PR)          597.2443 (1.0)      600.4502 (1.0)      598.6500 (1.0)      1.4581 (3.99)     598.3555 (1.0)      2.6978 (4.06)          1;0  1.6704 (1.0)           5           1
test_concat_axis_1[True-outer-1-objs7] (branch-21.12)     653.1495 (1.09)     653.9721 (1.09)     653.5529 (1.09)     0.3653 (1.0)      653.4739 (1.09)     0.6643 (1.0)           2;0  1.5301 (0.92)          5           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Associated benchmarks are being added here: vyasr/cudf_benchmarks#1
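
For readers of the tables above: the parenthesized figures appear to be each value divided by the best value in its column (the smallest for timings, the largest for OPS). The following is a minimal sketch that reproduces them for the `test_concat_axis_1[True-inner-1-objs6]` table. The helper name `relative_to_best` is hypothetical and is not part of pytest-benchmark or cuDF; the numbers are copied from that table.

```python
# Hypothetical helper: reproduces the parenthesized ratios shown in the
# pytest-benchmark tables, assuming each one is value / best-in-column.
def relative_to_best(values, higher_is_better=False):
    """Return each value divided by the column's best value, rounded to 2 places."""
    best = max(values.values()) if higher_is_better else min(values.values())
    return {name: round(value / best, 2) for name, value in values.items()}


# Figures copied from the test_concat_axis_1[True-inner-1-objs6] table (times in us).
min_us = {"THIS-PR": 560.6860, "branch-21.12": 3963.3820}
mean_us = {"THIS-PR": 586.7618, "branch-21.12": 4081.5076}
ops = {"THIS-PR": 1704.2692, "branch-21.12": 245.0075}

print(relative_to_best(min_us))                      # branch-21.12 -> 7.07
print(relative_to_best(mean_us))                     # branch-21.12 -> 6.96
print(relative_to_best(ops, higher_is_better=True))  # branch-21.12 -> 0.14
```

Read this way, a row marked `(1.0)` is the faster of the two builds for that column, and the other row's figure is the slowdown factor relative to it.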

github-actions bot added the "Python" (Affects Python cuDF API) label on Sep 29, 2021
codecov bot commented Sep 29, 2021

Codecov Report

Merging #9333 (45aa26a) into branch-21.12 (ab4bfaa) will decrease coverage by 0.21%.
The diff coverage is 0.00%.

❗ Current head 45aa26a differs from pull request most recent head 47ce5d1. Consider uploading reports for the commit 47ce5d1 to get more accurate results

@@               Coverage Diff                @@
##           branch-21.12    #9333      +/-   ##
================================================
- Coverage         10.79%   10.57%   -0.22%     
================================================
  Files               116      116              
  Lines             18869    19388     +519     
================================================
+ Hits               2036     2051      +15     
- Misses            16833    17337     +504     
Impacted Files                               Coverage Δ
python/cudf/cudf/__init__.py                 0.00% <0.00%> (ø)
python/cudf/cudf/_lib/__init__.py            0.00% <ø> (ø)
python/cudf/cudf/_lib/strings/__init__.py    0.00% <0.00%> (ø)
python/cudf/cudf/io/csv.py                   0.00% <0.00%> (ø)
python/cudf/cudf/io/orc.py                   0.00% <0.00%> (ø)
python/cudf/cudf/core/frame.py               0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py               0.00% <0.00%> (ø)
python/cudf/cudf/io/parquet.py               0.00% <0.00%> (ø)
python/cudf/cudf/core/series.py              0.00% <0.00%> (ø)
python/cudf/cudf/core/reshape.py             0.00% <0.00%> (ø)
... and 24 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 794863c...47ce5d1.

galipremsagar added the "3 - Ready for Review" (Ready for review by team) label and removed the "2 - In Progress" (Currently a work in progress) label on Oct 11, 2021
vyasr (Contributor) left a comment

Getting closer. I have a few suggestions and there's one discussion that we may need to continue a bit offline.

Review threads:
python/cudf/cudf/_lib/column.pyx (outdated, resolved)
python/cudf/cudf/_lib/column.pyx (outdated, resolved)
python/cudf/cudf/core/column/column.py (outdated, resolved)
python/cudf/cudf/core/index.py (outdated, resolved)
python/cudf/cudf/core/multiindex.py (resolved)
python/cudf/cudf/core/_base_index.py (outdated, resolved)
python/cudf/cudf/core/multiindex.py (outdated, resolved)
galipremsagar added the "5 - DO NOT MERGE" (Hold off on merging; see PR for details) label on Oct 12, 2021
vyasr (Contributor) left a comment

A few minor last changes, then I think this is ready to go (pending SWIPAT).

Review threads:
python/cudf/cudf/core/index.py (resolved)
python/cudf/cudf/core/index.py (outdated, resolved)
python/cudf/cudf/core/index.py (outdated, resolved)
python/cudf/cudf/core/index.py (outdated, resolved)
python/cudf/cudf/core/index.py (outdated, resolved)
galipremsagar removed the "5 - DO NOT MERGE" (Hold off on merging; see PR for details) label on Oct 18, 2021
vyasr (Contributor) left a comment

LGTM

quasiben (Member) commented

With @vyasr's review, I think we should be good. Merging in now.

quasiben (Member) commented

@gpucibot merge

@rapids-bot rapids-bot bot merged commit a19bd23 into rapidsai:branch-21.12 Oct 19, 2021
galipremsagar (Contributor, Author) commented

Thanks @vyasr for patiently reviewing a lengthy PR like this one.

Labels
3 - Ready for Review: Ready for review by team
bug: Something isn't working
non-breaking: Non-breaking change
Python: Affects Python cuDF API.

Development

Successfully merging this pull request may close these issues:
[FEA] Cache distinct_count in Column because of performance issues
[FEA] Index.union support in cudf

3 participants