Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trace metadata during task creation #10815

Closed
wants to merge 1 commit into from

Conversation

duanmeng
Copy link
Collaborator

@duanmeng duanmeng commented Aug 22, 2024

Create a directory named $QueryTraceBaseDir/$taskId when a task is initiated,
if query tracing is enabled. This directory will store metadata related to the task,
including the query plan node tree, query configurations, and connector properties.

Part of #9668

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 22, 2024
Copy link

netlify bot commented Aug 22, 2024

Deploy Preview for meta-velox ready!

Name Link
🔨 Latest commit 6a180f2
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/66cd6cbbbc02a80008f7e532
😎 Deploy Preview https://deploy-preview-10815--meta-velox.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@duanmeng duanmeng force-pushed the task_trace branch 2 times, most recently from 724a030 to ed7f7cf Compare August 22, 2024 14:57
Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duanmeng LGTM % minors.

velox/common/memory/Memory.cpp Show resolved Hide resolved
velox/exec/Task.h Outdated Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
velox/exec/Task.cpp Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
@duanmeng duanmeng force-pushed the task_trace branch 2 times, most recently from 97d582b to 3437398 Compare August 25, 2024 03:12
Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add PR description? thanks!

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duanmeng thanks for the update!

velox/core/QueryConfig.h Show resolved Hide resolved
velox/exec/Task.cpp Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
velox/exec/trace/test/QueryTraceTest.cpp Show resolved Hide resolved
@duanmeng duanmeng force-pushed the task_trace branch 2 times, most recently from 832f28d to fe70a21 Compare August 25, 2024 09:43
velox/docs/configs.rst Outdated Show resolved Hide resolved
velox/docs/configs.rst Outdated Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
velox/exec/Task.cpp Outdated Show resolved Hide resolved
velox/exec/trace/QueryTraceConfig.cpp Outdated Show resolved Hide resolved
velox/exec/trace/QueryTraceUtil.h Outdated Show resolved Hide resolved
velox/exec/trace/QueryTraceUtil.h Show resolved Hide resolved
velox/exec/trace/test/QueryTraceTest.cpp Show resolved Hide resolved
velox/exec/trace/test/QueryTraceTest.cpp Outdated Show resolved Hide resolved
@duanmeng duanmeng force-pushed the task_trace branch 6 times, most recently from c22f5be to c196c58 Compare August 26, 2024 16:37
Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duanmeng LGTM. Thanks!

@facebook-github-bot
Copy link
Contributor

@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@duanmeng duanmeng force-pushed the task_trace branch 3 times, most recently from 3239a53 to 0281dbb Compare August 27, 2024 03:33
Comment on lines 34 to 38
Folly::folly
gflags::gflags
glog::glog
fmt::fmt
${FILESYSTEM})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you need to explicitly link against all of these? I don't see any explicit #includes for any headers from these targets. The other targets will propagate their usage requirements, there is no need to add them manually.

@@ -13,7 +13,7 @@
# limitations under the License.

add_library(velox_query_trace_exec QueryMetadataWriter.cpp QueryTraceConfig.cpp
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the issue for the linking error. It's an issue with the monolithic library build and it's helper functions that now need to be used. Sorry we haven't communicated that well yet. #9948 took a while to merge so the targets here were added after CI was run but before it got merged, sorry for the inconvenience.

For targets that are not standalone executables like tests or fuzzers etc. you have to use velox_add_library instead of add_library and velox_link_libraries instead of target_link_libraries now.

Copy link
Collaborator Author

@duanmeng duanmeng Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have to use velox_add_library instead of add_library and velox_link_libraries instead of target_link_libraries now.

@assignUser Do you mean that I don't need to link velox_query_trace_exec explicitly in velox/benchmarks, velox/benchmarks/basic, and velox/dwio/dwrf/test if I use velox_add_library(velox_query_trace_exec... and velox_link_libraries(velox_query_trace_exec... in velox/exec/trace/CMakeList.txt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiaoxmeng please don't merge before this is fixed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you still have to link the target where you are using it. When VELOX_MONO_LIBRARY=OFF everything behaves as if you were using the original functions so you still need those links. (especially for the tests and benchmark as those are unaffected by the mono build)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For targets that are not standalone executables like tests or fuzzers etc. you have to use velox_add_library instead of add_library and velox_link_libraries instead of target_link_libraries now.

@assignUser Thanks for your suggestion, no more linking errors and ci passed.

@facebook-github-bot
Copy link
Contributor

@xiaoxmeng merged this pull request in 763c19c.

Copy link

Conbench analyzed the 1 benchmark run on commit 763c19c0.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Sep 2, 2024
Summary:
Create a directory named `$QueryTraceBaseDir/$taskId` when a task is initiated,
if query tracing is enabled. This directory will store metadata related to the task,
including the query plan node tree, query configurations, and connector properties.

Part of facebookincubator#9668

Pull Request resolved: facebookincubator#10815

Reviewed By: Yuhta

Differential Revision: D61808438

Pulled By: xiaoxmeng

fbshipit-source-id: 57eff8f4b70405ba5c60fcd8315b025b22c2317b
Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Sep 3, 2024
Summary:
Create a directory named `$QueryTraceBaseDir/$taskId` when a task is initiated,
if query tracing is enabled. This directory will store metadata related to the task,
including the query plan node tree, query configurations, and connector properties.

Part of facebookincubator#9668

Pull Request resolved: facebookincubator#10815

Reviewed By: Yuhta

Differential Revision: D61808438

Pulled By: xiaoxmeng

fbshipit-source-id: 57eff8f4b70405ba5c60fcd8315b025b22c2317b
@duanmeng duanmeng mentioned this pull request Sep 15, 2024
facebook-github-bot pushed a commit that referenced this pull request Sep 24, 2024
Summary:
Velox can record the query metadata (query plan and configs)
during task creation and input vectors of the traced operator,
see #10774 and #10815.

This PR adds a query replayer, it can be used to replay a query locally
using the metadata and input vectors from the production environment.
It supports showing the summary of a query at present, and more traced
operators' replaying supports will be added in the future.

Also, this PR adds two query configs `query_trace_max_bytes` and
`query_trace_task_reg_exp` to constraint the record input data size
and trace tasks respectively to ensure the stability of the cluster
in the prod.

Part of #9668

Pull Request resolved: #10897

Reviewed By: tanjialiang

Differential Revision: D62336733

Pulled By: xiaoxmeng

fbshipit-source-id: d196738dfa92c29fe5de67a944f652a328903814
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants