Query execution tracing and replay tool #9668
Comments
CC: @aditi-pandit
+1. This would be really useful.
Similar to Gluten's microbenchmark reproduce tool. It will be super useful for debugging and performance analysis. Good feature!
Summary: Add a query tracer to log the input data, and metadata (including query configurations, connector properties, and query plans). This logged data and metadata can be used to replay the operations of a specific operator or pipeline. Part of #9668 Pull Request resolved: #10774 Reviewed By: Yuhta Differential Revision: D61514971 Pulled By: xiaoxmeng fbshipit-source-id: 9a0b901ee1475a6c35169fe77eb19e797e31e210
Summary: Create a directory named `$QueryTraceBaseDir/$taskId` when a task is initiated, if query tracing is enabled. This directory will store metadata related to the task, including the query plan node tree, query configurations, and connector properties. Part of #9668 Pull Request resolved: #10815 Reviewed By: Yuhta Differential Revision: D61808438 Pulled By: xiaoxmeng fbshipit-source-id: 57eff8f4b70405ba5c60fcd8315b025b22c2317b
This tool looks great, but I have two concerns:
I suggest adding a config @duanmeng @xiaoxmeng What are your thoughts on this suggestion? I look forward to hearing your suggestions.
@xiaodouchen Thanks for your review.
Garbage collection is a different thing. cc @xiaoxmeng
Summary: Velox can record the query metadata (query plan and configs) during task creation and the input vectors of the traced operator, see #10774 and #10815. This PR adds a query replayer that can be used to replay a query locally using the metadata and input vectors from the production environment. It currently supports showing the summary of a query; replay support for more traced operators will be added in the future. This PR also adds two query configs, `query_trace_max_bytes` and `query_trace_task_reg_exp`, to constrain the recorded input data size and the set of traced tasks respectively, to ensure the stability of the production cluster. Part of #9668 Pull Request resolved: #10897 Reviewed By: tanjialiang Differential Revision: D62336733 Pulled By: xiaoxmeng fbshipit-source-id: d196738dfa92c29fe5de67a944f652a328903814
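As an illustration of how the two configs above might gate tracing, here is a minimal Python sketch. It is not Velox's implementation: the `TraceLimits` and `TraceBudget` classes and their methods are hypothetical, and treating the regular expression as a full match on the task id is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass
class TraceLimits:
    """Hypothetical holder for the two query configs named in the PR summary."""
    query_trace_max_bytes: int     # budget for recorded input data, in bytes
    query_trace_task_reg_exp: str  # regex selecting which task ids to trace


class TraceBudget:
    """Tracks recorded bytes and refuses data once the budget is exhausted."""

    def __init__(self, limits: TraceLimits) -> None:
        self._limits = limits
        self._task_pattern = re.compile(limits.query_trace_task_reg_exp)
        self._recorded_bytes = 0

    def should_trace_task(self, task_id: str) -> bool:
        # Assumption: the whole task id must match the pattern.
        return self._task_pattern.fullmatch(task_id) is not None

    def try_record(self, num_bytes: int) -> bool:
        # Refuse a batch that would push the running total past the limit.
        if self._recorded_bytes + num_bytes > self._limits.query_trace_max_bytes:
            return False
        self._recorded_bytes += num_bytes
        return True
```

Checking the budget before each batch is written keeps the cost of tracing bounded, which is the stability concern the summary mentions.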
Summary: Create a `QueryDataWriter` in `exec::TableWriter` if the query trace is enabled, recording the input vectors batch by batch. Each operator writes its data to the directory `$rootDir/$pipelineId/$driverId/data`. The recorded data will be used to replay the execution of `exec::TableWriter`, which will be supported in the follow-up. Design notes: https://docs.google.com/document/d/1crIIeVz4tWKYQnBoHoxrv2i-4zAML9HSYLps8h5SDrc/edit#heading=h.y6j2ojtr7hm9 Part of #9668 Pull Request resolved: #10910 Reviewed By: pedroerp Differential Revision: D63444416 Pulled By: xiaoxmeng fbshipit-source-id: ddd74ff6dd56de7bce31ec536035b32211453364
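The directory layout described in these summaries can be sketched as follows. This is only an illustration in Python: the helper names are hypothetical, and the sketch does not assume any relationship between `$rootDir` and the task directory beyond what the caller passes in.

```python
from pathlib import PurePosixPath


def task_trace_dir(query_trace_base_dir: str, task_id: str) -> PurePosixPath:
    """Builds $QueryTraceBaseDir/$taskId, where task metadata is stored."""
    return PurePosixPath(query_trace_base_dir) / task_id


def operator_data_dir(
    root_dir: PurePosixPath, pipeline_id: int, driver_id: int
) -> PurePosixPath:
    """Builds $rootDir/$pipelineId/$driverId/data for one traced operator."""
    return root_dir / str(pipeline_id) / str(driver_id) / "data"
```

For example, if the root directory were the task directory `/tmp/trace/task_0` (an assumption made only for this example), the operator in pipeline 1, driver 2 would write to `/tmp/trace/task_0/1/2/data`.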
Summary: Adds `TableWriterReplayer` to facilitate replaying the `TableWriter` operator. It uses the given plan node ID to find the traced `TableWriteNode` in the traced plan, creates a new `TableWriterNode`, rebuilds a query plan with a `QueryTraceScanNode`, applies the traced configurations, and reruns. `QueryTraceScanNode` holds the traced data type and directory for a given plan node ID. This information is used to build the `QueryTraceScan` operator, which creates a `QueryDataReader` from the traced data type and input data file. Finding the right input data file for replay requires both the pipeline ID and the driver ID, which are only known during operator creation, so the traced input data file and the output type have to be determined dynamically. Part of #9668 Pull Request resolved: #11100 Reviewed By: tanjialiang Differential Revision: D63774083 Pulled By: xiaoxmeng fbshipit-source-id: 912bef3cb20d9b1a1685af625ba2f319e2dc7509
Summary: Add partitioned output trace replayer to facilitate debugging for partitioned output operator with complex input. part of #9668 Pull Request resolved: #11175 Reviewed By: xiaoxmeng Differential Revision: D63959956 Pulled By: tanjialiang fbshipit-source-id: a1519cd1191222316ec03f7e5c219d03c5e6a5be
Description
Add a query execution tracing and replay tool to facilitate query analysis. The tool shall allow us to replay part of a query execution on a local computer instead of replaying the whole query in a production environment or in a real Prestissimo cluster. The tool consists of two parts:
(1) trace collection: run a query with trace collection enabled through query configs (and the corresponding session properties in the Prestissimo context). The query execution collects the trace by dumping the input vectors of a specified set of operators (data) and the corresponding query plan info (metadata) into a specified storage location;
(2) trace replay: construct a sub-query plan from the dumped query plan metadata, then load the dumped input vectors into memory and feed them into the constructed sub-query plan for replay. If the input is too large, we can build a special source operator that reads the dumped input vectors from storage in batches.
The replay can be done at different levels: operator level, pipeline level, and task level. We can start with the operator level and extend to the pipeline and task levels next.
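The collect-then-replay flow described above, including reading dumped input in batches rather than loading it all into memory, could look roughly like the Python sketch below. Everything here is hypothetical: the length-prefixed pickle format, the `Batch` stand-in for an input vector, and the function names are assumptions for illustration and do not reflect Velox's serialization or operator interfaces.

```python
import pickle
import struct
from typing import BinaryIO, Callable, Iterable, Iterator, List

Batch = List[tuple]  # stand-in for one input vector (a list of rows)


def dump_batches(out: BinaryIO, batches: Iterable[Batch]) -> None:
    """Trace collection: append each input batch with an 8-byte length prefix."""
    for batch in batches:
        payload = pickle.dumps(batch)
        out.write(struct.pack("<Q", len(payload)))
        out.write(payload)


def read_batches(src: BinaryIO) -> Iterator[Batch]:
    """Source for replay: yield one dumped batch at a time from storage."""
    while True:
        header = src.read(8)
        if not header:
            return  # end of the trace file
        (size,) = struct.unpack("<Q", header)
        yield pickle.loads(src.read(size))


def replay(src: BinaryIO, operator: Callable[[Batch], Batch]) -> Iterator[Batch]:
    """Operator-level replay: feed each traced batch into the operator."""
    for batch in read_batches(src):
        yield operator(batch)
```

Because `read_batches` is a generator, only one batch is held in memory at a time, which mirrors the idea of a special source operator for inputs that are too large to load at once.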
cc @mbasmanova @duanmeng @huamn