spillable cache for GpuCartesianRDD #1784

Closed
wants to merge 39 commits into from

Commits on Feb 22, 2021

  1. spillable cache for GpuCartesianRDD

    Signed-off-by: sperlingxx <lovedreamf@gmail.com>
    sperlingxx committed Feb 22, 2021
    86fc104

Commits on Feb 23, 2021

  1. lazy cache

    sperlingxx committed Feb 23, 2021
    07b2d15

Commits on Mar 1, 2021

  1. Update cudf dependency to 0.18 (NVIDIA#1828)

    * Depend on the cuDF v0.18
    
    Change rapids branch-0.4 to depend on cuDF v0.18 release jars
    
    Prepare for the rapids v0.4.0 release
    
    Signed-off-by: Tim Liu <timl@nvidia.com>
    
    * cudf 0.17-SNAPSHOT to 0.17
    NvTimLiu authored Mar 1, 2021
    72b2e12
  2. 653c33a
  3. Update mortgage tests to support reading multiple dataset formats (NVIDIA#1808)
    
    * mortgage support multiple dataset formats
    
    change mortgage sample class to support dataset formats csv/orc/parquet
    
    Signed-off-by: Tim Liu <timl@nvidia.com>
    
    * Update
    
    1, copyright 2021
    2, throw an error if there are more than 5 arguments
    3, match-case optimize
    
    Signed-off-by: Tim Liu <timl@nvidia.com>
    
    * Update
    
    1, print some helpful info for the input arguments
    2, exit instead of throwing an exception when arguments are wrongly set
    
    * fix typo
    
    * Fix Nothing value in 'case _ =>'
    
    * update
    NvTimLiu authored Mar 1, 2021
    bb03535
  4. Remove benchmarks (NVIDIA#1826)

    Signed-off-by: Jason Lowe <jlowe@nvidia.com>
    jlowe authored Mar 1, 2021
    17657fe
  5. c40ec37
  6. Merge pull request NVIDIA#1835 from jlowe/fix-merge

    Fix merge conflict with branch-0.4
    jlowe authored Mar 1, 2021
    50fd165
  7. Spark 3.0.2 shim no longer a snapshot shim (NVIDIA#1831)

    * Spark 3.0.2 shim no longer a snapshot shim
    
    Signed-off-by: Jason Lowe <jlowe@nvidia.com>
    
    * Remove 3.0.2-SNAPSHOT support
    jlowe authored Mar 1, 2021
    6483543
  8. Merge pull request NVIDIA#1837 from NVIDIA/branch-0.4

    [auto-merge] branch-0.4 to branch-0.5 [skip ci] [bot]
    nvauto authored Mar 1, 2021
    c52e9a5
  9. Make databricks build.sh more convenient for dev (NVIDIA#1838)

    Signed-off-by: Thomas Graves <tgraves@nvidia.com>
    tgravescs authored Mar 1, 2021
    7e210c2

Commits on Mar 2, 2021

  1. Add a shim provider for Spark 3.2.0 development branch (NVIDIA#1704)

    Signed-off-by: Gera Shegalov <gera@apache.org>
    
    Add a shim provider for Spark 3.2.0 development branch. Closes NVIDIA#1490
    - fix overflows in aggregate buffer for GpuSum by wiring the explicit output column type
    - unit tests for the new shim
    - consolidate version profiles in the parent pom
    gerashegalov authored Mar 2, 2021
    51049a6
  2. Cleanup unused Jenkins files and scripts (NVIDIA#1829)

    * Cleanup unused Jenkins files and scripts
    
    NVIDIA#1568
    
    Move Databricks scripts to GitLab so we can use the common scripts for the nightly build job and integration tests job
    
    Remove unused Dockerfiles
    
    Signed-off-by: Tim Liu <timl@nvidia.com>
    
    * rm Dockerfile.integration.ubuntu16
    
    * Restore Databricks nightly scripts
    
    Signed-off-by: Tim Liu <timl@nvidia.com>
    NvTimLiu authored Mar 2, 2021
    28b00a7
  3. Spark 3.1.1 shim no longer a snapshot shim (NVIDIA#1832)

    * Spark 3.1.1 shim no longer a snapshot shim
    
    Signed-off-by: Jason Lowe <jlowe@nvidia.com>
    
    * Remove 3.1.0, 3.1.0-SNAPSHOT, and 3.1.1-SNAPSHOT support
    
    * Remove obsolete comment
    jlowe authored Mar 2, 2021
    e614ef4
  4. Update to note support for 3.0.2 (NVIDIA#1842)

    * Update to note support for 3.0.2
    
    Signed-off-by: Sameer Raheja <sraheja@nvidia.com>
    
    * Update FAQ to reflect 3.0.2 and 3.1.1 support
    
    Signed-off-by: Sameer Raheja <sraheja@nvidia.com>
    sameerz authored Mar 2, 2021
    fc9cecf
  5. Fix fails on the mortgage ETL test (NVIDIA#1845)

    In the dataset-format 'Map', the functions 'Run.csv()/Run.orc()/Run.parquet' are all executed one by one, which causes a dataset format error because the dataset format in the current test is 'parquet'
    
    Change 'Run.csv()/Run.orc()/Run.parquet' into lambda expressions to avoid running the 'Run.xxx()' functions in the dataFrameFormatMap
    
    Signed-off-by: Tim Liu <timl@nvidia.com>
    NvTimLiu authored Mar 2, 2021
    85bfacb
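
    The eager-versus-lazy behaviour described in the commit message above (NVIDIA#1845) can be illustrated with a minimal, self-contained Scala sketch. It is not the code from the mortgage test: `LazyFormatMap`, `runCsv`, `runOrc`, `runParquet`, `eagerLookup`, and `lazyLookup` are hypothetical stand-ins, and the readers only record that they were called.

    ```scala
    import scala.collection.mutable.ListBuffer

    object LazyFormatMap {
      // Records which (hypothetical) readers were actually executed.
      val executed: ListBuffer[String] = ListBuffer.empty[String]

      // Hypothetical stand-ins for Run.csv() / Run.orc() / Run.parquet().
      def runCsv(): String = { executed += "csv"; "csv result" }
      def runOrc(): String = { executed += "orc"; "orc result" }
      def runParquet(): String = { executed += "parquet"; "parquet result" }

      // Eager: every value expression is evaluated while the Map is built,
      // so all three readers run regardless of the requested format.
      def eagerLookup(format: String): String = {
        val formats = Map(
          "csv" -> runCsv(),
          "orc" -> runOrc(),
          "parquet" -> runParquet())
        formats(format)
      }

      // Lazy: the values are functions, so only the selected reader runs.
      def lazyLookup(format: String): String = {
        val formats: Map[String, () => String] = Map(
          "csv" -> (() => runCsv()),
          "orc" -> (() => runOrc()),
          "parquet" -> (() => runParquet()))
        formats(format)()
      }

      def main(args: Array[String]): Unit = {
        eagerLookup("parquet")
        println(executed.toList) // List(csv, orc, parquet)
        executed.clear()
        lazyLookup("parquet")
        println(executed.toList) // List(parquet)
      }
    }
    ```

    With the eager map, a reader for the wrong format runs even when only 'parquet' was requested, which is presumably where the format error came from; wrapping each entry in a function defers the call until the lookup.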
  6. Have most of range partitioning run on the GPU (NVIDIA#1796)

    Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
    revans2 authored Mar 2, 2021
    63a2e3d
  7. c776be9
  8. Fix NullPointerException on null partition insert (NVIDIA#1744)

    Port apache/spark#31320 to close NVIDIA#1735
    
    Signed-off-by: Gera Shegalov <gera@apache.org>
    gerashegalov authored Mar 2, 2021
    e06c226
  9. 5b93033
  10. Merge pull request NVIDIA#1848 from jlowe/fix-merge

    Fix merge conflict with branch-0.4
    jlowe authored Mar 2, 2021
    923fa4e
  11. Update changelog for 0.4 (NVIDIA#1849)

    * Update changelog for 0.4
    
    Signed-off-by: Sameer Raheja <sraheja@nvidia.com>
    
    * Update generate-changelog script
    
    Signed-off-by: Sameer Raheja <sraheja@nvidia.com>
    sameerz authored Mar 2, 2021
    dea867a
  12. Merge pull request NVIDIA#1850 from NVIDIA/branch-0.4

    [auto-merge] branch-0.4 to branch-0.5 [skip ci] [bot]
    nvauto authored Mar 2, 2021
    95c3e75
  13. Refactor join code to reduce duplicated code (NVIDIA#1839)

    * Refactor join code to reduce duplicated code
    
    Signed-off-by: Jason Lowe <jlowe@nvidia.com>
    
    * Move nodeName override to base class
    jlowe authored Mar 2, 2021
    40c0eda

Commits on Mar 3, 2021

  1. Add shim for Spark 3.0.3 (NVIDIA#1834)

    * Add shim for Spark 3.0.3
    
    Signed-off-by: Jason Lowe <jlowe@nvidia.com>
    
    * Add premerge testing for Spark 3.0.2 and Spark 3.0.3
    jlowe authored Mar 3, 2021
    19d1f05
  2. 32213fa
  3. Fix Part Suite Tests (NVIDIA#1852)

    * Fix Part Suite Tests
    
    Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
    
    * Addressed review comments
    revans2 authored Mar 3, 2021
    24ab0ae
  4. Add shim for Spark 3.1.2 (NVIDIA#1836)

    * Add shim for Spark 3.1.2
    
    Signed-off-by: Jason Lowe <jlowe@nvidia.com>
    
    * Add Spark 3.1.2 to premerge testing
    jlowe authored Mar 3, 2021
    eab507e

Commits on Mar 4, 2021

  1. fix shuffle manager doc on ucx library path (NVIDIA#1858)

    * fix shuffle manager doc on ucx library path
    
    Signed-off-by: Rong Ou <rong.ou@gmail.com>
    
    * remove ld library path line
    
    Signed-off-by: Rong Ou <rong.ou@gmail.com>
    rongou authored Mar 4, 2021
    ad0b6d9
  2. Disable coalesce batch spilling to avoid cudf contiguous_split bug (NVIDIA#1871)
    
    Signed-off-by: Jason Lowe <jlowe@nvidia.com>
    jlowe authored Mar 4, 2021
    dc2847c
  3. 3c243c7
  4. Fix tests for Spark 3.2.0 shim (NVIDIA#1869)

    Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
    revans2 authored Mar 4, 2021
    2439b4b

Commits on Mar 5, 2021

  1. Add in support for DateAddInterval (NVIDIA#1841)

    Signed-off-by: Niranjan Artal <nartal@nvidia.com>
    nartal1 authored Mar 5, 2021
    6e57e27
  2. Merge pull request NVIDIA#1875 from jlowe/fix-merge

    Fix merge conflict with branch-0.4
    pxLi authored Mar 5, 2021
    60fb754
  3. spillable cache for GpuCartesianRDD

    Signed-off-by: sperlingxx <lovedreamf@gmail.com>
    sperlingxx committed Mar 5, 2021
    1a32484
  4. lazy cache

    sperlingxx committed Mar 5, 2021
    b908c73
  5. 4718d00
  6. fix merge conflicts

    sperlingxx committed Mar 5, 2021
    6fd391b
  7. fix merge conflicts

    sperlingxx committed Mar 5, 2021
    8a46305