Commit
Add Alluxio auto mount feature (#5925)
* Add Alluxio auto mount feature
  Mount the cloud bucket to Alluxio when driver converts FileSourceScanExec to GPU plan
  The Alluxio master should be the same node as Spark driver node when using this feature
  Introduce new configs:
    spark.rapids.alluxio.automount.enabled
    spark.rapids.alluxio.bucket.regex
    spark.rapids.alluxio.mount.cmd
* Set access key and secret key when mounting
  Read access key and secret key from spark config or environment variables
  Use the key when running alluxio mount
  Default ALLUXIO_HOME as /opt/alluxio-2.8.0
* Print thread id
* Check mounted point
* Fix parameter mistake
* Add log
* Fix mount command
* Use whitespace to split
* Update docs
* Add synchronized to mount command
* Update some logs
* Update docs
* Use Properties to read Alluxio config
* Fix build error
* Add empty line to pass mvn verify
* Fix comments
  Check both access key and secret
  Update document to refer to auto mount section
  Explain more about limitation
  Use /bucket in mountedBucket to match fs mount output
  Use camel case to name variable
  Use URI to parse the fs mount output
* Address comments
  Use logDebug
  Write new functions to return the replaceFunc
  Use URI to parse the scheme and bucket
* Fix the command without su user
  Support to run the alluxio command without su by Process(String)
* Don't use URI since s3 path may include space
* Update sql-plugin/src/main/scala/com/nvidia/spark/rapids/AlluxioUtils.scala
  Remove risk log
* Fix comments
  Update docs
  Add a space in runAlluxioCmd
* Fix the indentation
* Set default value of alluxio.cmd
  correct indent

Signed-off-by: Gary Shen <gashen@nvidia.com>
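
The flow described in the commits above (match the bucket against a regex, split scheme and bucket without a URI parser because S3 keys may contain spaces, mount under a lock only if `/bucket` is not already mounted, then rewrite the path) can be sketched roughly as follows. This is an illustrative Python sketch, not the plugin's Scala code in `AlluxioUtils.scala`: the function names, the `run_mount` callback, the regex value, and the assumed `<ufs> on <alluxio path> (...)` shape of the `fs mount` output are all assumptions made for the example.

```python
import re
import threading

# Hypothetical regex for illustration; the real default of
# spark.rapids.alluxio.bucket.regex is not stated in the commit message.
BUCKET_REGEX = re.compile(r"^s3a?://.*")

_mount_lock = threading.Lock()  # plays the role of `synchronized` around the mount
_mounted = set()                # Alluxio mount points, e.g. "/my-bucket"


def split_scheme_and_bucket(path):
    """Split 's3a://bucket/some key' into (scheme, bucket, rest).

    Plain string operations are used instead of a URI parser because
    object keys may contain spaces.
    """
    scheme, sep, remainder = path.partition("://")
    if not sep:
        raise ValueError(f"no scheme in path: {path!r}")
    bucket, _, rest = remainder.partition("/")
    return scheme, bucket, rest


def parse_mount_output(output):
    """Collect mount points from mount-listing output.

    Assumes each line looks like '<ufs path> on <alluxio path> (...)'.
    """
    points = set()
    for line in output.splitlines():
        tokens = line.split()  # split on any run of whitespace
        if len(tokens) >= 3 and tokens[1] == "on":
            points.add(tokens[2])
    return points


def to_alluxio_path(path, master, run_mount):
    """Mount the bucket if needed and return the alluxio:// form of `path`.

    `run_mount(mount_point, ufs_uri)` is a hypothetical callback that would
    invoke the Alluxio CLI; paths that do not match the regex are returned as-is.
    """
    if not BUCKET_REGEX.match(path):
        return path
    scheme, bucket, rest = split_scheme_and_bucket(path)
    mount_point = "/" + bucket
    with _mount_lock:
        if mount_point not in _mounted:
            run_mount(mount_point, f"{scheme}://{bucket}/")
            _mounted.add(mount_point)
    return f"alluxio://{master}/{bucket}/{rest}"
```

In a setup like this, `_mounted` would typically be seeded once from `parse_mount_output` so that buckets mounted before the job started are not mounted again; the lock keeps concurrent plan conversions from issuing the same mount twice.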