Extended the FAQ #401

Merged 2 commits into NVIDIA:branch-0.2 on Jul 22, 2020

revans2 (Collaborator) commented Jul 22, 2020

This extends the FAQ to include more questions that we have seen come up.

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
revans2 added the documentation label (Improvements or additions to documentation) on Jul 22, 2020
revans2 added this to the Jul 20 - Jul 31 milestone on Jul 22, 2020
revans2 self-assigned this on Jul 22, 2020
revans2 (Collaborator, Author) commented Jul 22, 2020

build

docs/FAQ.md Outdated

### What parts of Apache Spark are accelerated?

Currently, A limited set of SQL and DataFrame operations are supported, please see the
Collaborator comment:

'a limited set'

docs/FAQ.md Outdated

The RAPIDS Accelerator for Apache Spark requires version 3.0.0 of Apache Spark. Because the plugin
replaces parts of the physical plan that Apache Spark considers to be internal the code for those
plans can change even between bug fix releases. As a part of our process we try to stay on top of
Collaborator comment:

Nit. Comma after process

docs/FAQ.md Outdated

### What is the road-map like?

Please take a look at the github repository https://github.com/nvidia/spark-rapids The have issue
Collaborator comment:

May need to reword this a bit

jlowe (Member) left a comment

We just got a question about speculative execution and the plugin (#400), in case you think it is worth mentioning in this FAQ.

docs/FAQ.md Outdated

### What is the road-map like?

Please take a look at the github repository https://github.com/nvidia/spark-rapids The have issue
Member comment:

Suggested change:
- Please take a look at the github repository https://github.com/nvidia/spark-rapids The have issue
+ Please take a look at the [github repository](https://github.com/nvidia/spark-rapids) which has issue

docs/FAQ.md Outdated
be problematic. In general if you are going to do 30 seconds or more of processing within a single
session the overhead can be amortized.

### Why is the size of my output Parquet/Orc file different?
Collaborator comment:

Suggested change:
- ### Why is the size of my output Parquet/Orc file different?
+ ### Why is the size of my output Parquet/ORC file different?

docs/FAQ.md Outdated

We have not evaluated the performance yet. DeltaEngine is not open source, so any analysis needs to
be done with Databricks in some form. When DeltaEngine is generally available if the terms of
service allow it we will look into doing a comparison.
Collaborator comment:

Suggested change:
- service allow it we will look into doing a comparison.
+ service allow it, we will look into doing a comparison.

docs/FAQ.md Outdated

There is no limit on the number of tasks per executor that you can run. Generally we recommend 2 to
6 tasks per executor and 1 GPU per executor. The GPU typically benefits from having 2 tasks run
in (parallel)[configs.md#sql.concurrentGpuTasks] on it at a time, assuming your GPU has enough
Collaborator comment:

This link isn't rendering for some reason (the parallel one)
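
A likely cause, judging only from the quoted line: the Markdown link delimiters are reversed, `(parallel)[...]` instead of `[parallel](...)`. The sketch below is a hypothetical helper, not part of this PR, that rewrites the reversed form into a standard inline link. The function name and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical helper (not from the PR): matches the reversed form "(text)[target]".
REVERSED_LINK = re.compile(r"\(([^()\[\]]+)\)\[([^\[\]()]+)\]")


def fix_reversed_links(markdown: str) -> str:
    """Rewrite "(text)[target]" as the standard inline link "[text](target)"."""
    return REVERSED_LINK.sub(r"[\1](\2)", markdown)


# The line quoted from docs/FAQ.md in this review thread.
line = "in (parallel)[configs.md#sql.concurrentGpuTasks] on it at a time, assuming your GPU has enough"
print(fix_reversed_links(line))
```

Links that are already written as `[text](target)` do not match the pattern and pass through unchanged.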

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
revans2 (Collaborator, Author) commented Jul 22, 2020

build

revans2 (Collaborator, Author) commented Jul 22, 2020

Thanks for the reviews. I think I have addressed all of the comments.

abellina (Collaborator) left a comment

Thanks @revans2

kuhushukla (Collaborator) left a comment

wow. ship it!

revans2 (Collaborator, Author) commented Jul 22, 2020

build

@revans2 revans2 merged commit 13309e9 into NVIDIA:branch-0.2 Jul 22, 2020
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Extended the FAQ

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>

* Addressed review comments

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
pxLi pushed a commit to pxLi/spark-rapids that referenced this pull request May 12, 2022
* update the commands in the FLAdminAPI

* update run destination to run number
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>