Query Profile Structure and Metrics
Structure of Query Profile
The structure of a Query Profile is closely related to the design of StarRocks' execution engine and consists of the following five parts:
- Fragment: Execution tree. A query is composed of one or more fragments.
- FragmentInstance: Each fragment can have multiple instances, each called a FragmentInstance, which are executed by different computing nodes.
- Pipeline: A FragmentInstance is split into multiple pipelines. A pipeline is an execution chain consisting of a group of connected Operator instances.
- PipelineDriver: To fully utilize multiple computing cores, a Pipeline can have multiple instances, each called a PipelineDriver.
- Operator: A PipelineDriver consists of multiple Operator instances.
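The five layers above nest inside one another, which can be pictured with a small data model. The following is a minimal Python sketch for illustration only. The class names mirror the terms above, but the fields, the operator names, and the helpers count_operator_instances and build_example are hypothetical and do not reflect StarRocks' internal classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operator:
    name: str


@dataclass
class PipelineDriver:
    # One running instance of a pipeline: a chain of Operator instances.
    operators: List[Operator] = field(default_factory=list)


@dataclass
class Pipeline:
    # A pipeline can be instantiated several times to use multiple cores.
    drivers: List[PipelineDriver] = field(default_factory=list)


@dataclass
class FragmentInstance:
    pipelines: List[Pipeline] = field(default_factory=list)


@dataclass
class Fragment:
    instances: List[FragmentInstance] = field(default_factory=list)


def count_operator_instances(fragment: Fragment) -> int:
    """Count every Operator instance reachable from one Fragment."""
    return sum(
        len(driver.operators)
        for instance in fragment.instances
        for pipeline in instance.pipelines
        for driver in pipeline.drivers
    )


def build_example() -> Fragment:
    """Hypothetical shape: 2 instances x 1 pipeline x 3 drivers x 2 operators."""
    def make_driver() -> PipelineDriver:
        return PipelineDriver([Operator("SCAN"), Operator("EXCHANGE_SINK")])

    def make_pipeline() -> Pipeline:
        return Pipeline([make_driver() for _ in range(3)])

    return Fragment([FragmentInstance([make_pipeline()]) for _ in range(2)])
```

Calling count_operator_instances(build_example()) walks all five layers. The number of Operator instances grows with both the instance count and the driver count, which is why an unmerged profile becomes large quickly.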
Query Profile Merging Strategy
By analyzing the structure of Query Profile, you can easily observe that multiple FragmentInstances associated with the same Fragment have a high degree of similarity in structure. Similarly, multiple PipelineDrivers belonging to the same Pipeline also exhibit similar structural features. To reduce the volume of the Query Profile, StarRocks by default merges the FragmentInstance layer and the PipelineDriver layer. As a result, the original five-layer structure is simplified to three layers:
- Fragment
- Pipeline
- Operator
You can control this merging behavior through the session variable pipeline_profile_level, which has two valid values:
- 1 (Default): StarRocks merges the metrics into a three-layer structure.
- 2: StarRocks does not merge the metrics. The original five-layer structure is retained.
- Any other value will be treated as the default value 1.
Generally, we do not recommend setting this parameter to 2 because the Query Profile with the five-layer structure has many limitations. For example, you cannot perform visualized analysis on the profile using any tools. Therefore, unless the merging process leads to the loss of crucial information, you do not need to adjust this parameter.
Metric Merging and MIN/MAX Values
When merging FragmentInstance and PipelineDriver, all metrics with the same name are merged into a single value. In addition to the merged value, only the minimum and maximum values of each metric across all concurrent instances are recorded. Different types of metrics use different merging strategies:
- Time-related metrics take the average. For example:
  - OperatorTotalTime is the average time consumption of all concurrent instances.
  - __MAX_OF_OperatorTotalTime is the maximum time consumption among all concurrent instances.
  - __MIN_OF_OperatorTotalTime is the minimum time consumption among all concurrent instances.
- OperatorTotalTime: 2.192us
- __MAX_OF_OperatorTotalTime: 2.502us
- __MIN_OF_OperatorTotalTime: 1.882us
- Non-time-related metrics are summed. For example:
  - PullChunkNum is the sum of this metric in all concurrent instances.
  - __MAX_OF_PullChunkNum is the maximum value of this metric among all concurrent instances.
  - __MIN_OF_PullChunkNum is the minimum value of this metric among all concurrent instances.
- PullChunkNum: 146.66K (146660)
- __MAX_OF_PullChunkNum: 24.45K (24450)
- __MIN_OF_PullChunkNum: 24.435K (24435)
- Some metrics have no extreme values because they have the same value in all concurrent instances, for example, DegreeOfParallelism.
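The merging rules above can be sketched in a few lines of Python. This is an illustrative approximation, not StarRocks' implementation: the function merge_metric and its is_time flag are hypothetical, and the per-instance values are invented so that the merged results match the sample output shown above (the real number of instances behind those samples is not known).

```python
from typing import Dict, List


def merge_metric(name: str, values: List[float], is_time: bool) -> Dict[str, float]:
    """Merge one metric across concurrent instances.

    Time-related metrics are averaged and all other metrics are summed.
    The per-instance extremes are kept under __MAX_OF_ / __MIN_OF_ keys.
    """
    if not values:
        raise ValueError("need at least one instance value")
    merged = sum(values) / len(values) if is_time else sum(values)
    return {
        name: merged,
        f"__MAX_OF_{name}": max(values),
        f"__MIN_OF_{name}": min(values),
    }


# Hypothetical per-instance values, chosen to reproduce the samples above.
op_time_ns = [2502, 1882]  # nanoseconds
pull_chunks = [24450, 24435, 24444, 24444, 24444, 24443]

time_merged = merge_metric("OperatorTotalTime", op_time_ns, is_time=True)
count_merged = merge_metric("PullChunkNum", pull_chunks, is_time=False)
```

Here time_merged carries the average (2192 ns, that is 2.192us) while count_merged carries the sum (146660). The practical consequence is that a merged time is comparable to a single instance's time, whereas a merged counter is not.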
Usually, if there is a significant difference between MIN and MAX values, it indicates a high probability of data skew. Possible scenarios include aggregation and join operations.
- OperatorTotalTime: 2m48s
- __MAX_OF_OperatorTotalTime: 10m30s
- __MIN_OF_OperatorTotalTime: 279.170us
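One way to screen for this pattern is to compare the MAX and MIN values as a ratio. The Python sketch below is an assumption-laden helper, not a StarRocks tool: it assumes duration strings look like the samples in this document (for example 2m48s or 279.170us), and the function names and the threshold of 10 are hypothetical choices that you would tune for your workload.

```python
import re

# Seconds per unit. Assumes the units seen in the samples, plus h, ms, and ns.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
# Two-letter units are listed first so "ms" is not read as "m" followed by "s".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|h|m|s)")


def parse_duration(text: str) -> float:
    """Convert a duration string such as '10m30s' to seconds."""
    matches = list(_TOKEN.finditer(text))
    # Reject input that is not made up entirely of number+unit tokens.
    if not matches or "".join(m.group(0) for m in matches) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(m.group(1)) * _UNIT_SECONDS[m.group(2)] for m in matches)


def skew_ratio(max_text: str, min_text: str) -> float:
    """Return MAX / MIN for two duration strings."""
    longest = parse_duration(max_text)
    shortest = parse_duration(min_text)
    if shortest == 0:
        return float("inf") if longest > 0 else 1.0
    return longest / shortest


def looks_skewed(max_text: str, min_text: str, threshold: float = 10.0) -> bool:
    """Flag a metric whose slowest instance is far slower than its fastest."""
    return skew_ratio(max_text, min_text) >= threshold
```

For the sample above, looks_skewed("10m30s", "279.170us") is flagged, while the earlier balanced sample (2.502us against 1.882us) is not. A ratio is only a first filter: very short operators can show large ratios without any real skew, so the absolute times still need a look.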
Query Profile Metrics List
The Query Profile includes a multitude of metrics providing detailed information about query execution. In most cases, you only need to focus on the execution time of operators and the size of processed data. Once you identify bottlenecks, you can address them specifically.
Summary Metrics
Total
Description: The total time consumed by the query, including Planning, Executing, and Profiling phase durations.
Query State
Description: Query state, possible states include Finished, Error, and Running.
Execution Overview Metrics
FrontendProfileMergeTime
Description: Query Profile processing time on the Frontend (FE) side.
QueryAllocatedMemoryUsage
Description: Cumulative allocated memory across all compute nodes.
QueryDeallocatedMemoryUsage
Description: Cumulative deallocated memory across all compute nodes.
QueryPeakMemoryUsage
Description: Maximum peak memory across all compute nodes.
QueryExecutionWallTime
Description: Wall time of the execution.
QueryCumulativeCpuTime
Description: Cumulative CPU time across all compute nodes.
QueryCumulativeOperatorTime
Description: Cumulative operator time across all nodes. This is a simple linear accumulation, but in reality, execution times of different operators may overlap. This metric serves as the denominator for calculating the percentage of time spent on each operator.
QueryCumulativeNetworkTime
Description: Cumulative network time of all Exchange nodes. Similar to cumulative operator time, actual execution times of different Exchanges may overlap.
QueryCumulativeScanTime
Description: Cumulative IO time of all Scan nodes. Similar to cumulative operator time, actual execution times of different Scan operations may overlap.
QueryPeakScheduleTime
Description: Maximum ScheduleTime metric across all Pipelines.
QuerySpillBytes
Description: Size of data spilled to local disks.
ResultDeliverTime
Description: Additional time to transfer results. For query statements, this metric refers to the time it takes to send data back to the client; for insert statements, it refers to the time it takes to write data to the storage layer.
Fragment Metrics
InstanceNum
Description: Number of all FragmentInstances for this Fragment.
InstanceIds
Description: IDs of all FragmentInstances for this Fragment.
BackendNum
Description: Number of BEs participating in the execution of this Fragment.
BackendAddresses
Description: Addresses of all BEs participating in the execution of this Fragment.
FragmentInstancePrepareTime
Description: Time spent in the Fragment Prepare phase.
InstanceAllocatedMemoryUsage
Description: Cumulative allocated memory for all FragmentInstances under this Fragment.
InstanceDeallocatedMemoryUsage
Description: Cumulative deallocated memory for all FragmentInstances under this Fragment.
InstancePeakMemoryUsage
Description: The peak memory usage across all FragmentInstances under this Fragment.
Pipeline Metrics
The relationships between the core metrics are as follows:
- DriverTotalTime = ActiveTime + PendingTime + ScheduleTime
- ActiveTime = ∑ OperatorTotalTime + OverheadTime
- PendingTime = InputEmptyTime + OutputFullTime + PreconditionBlockTime + PendingFinishTime
- InputEmptyTime = FirstInputEmptyTime + FollowupInputEmptyTime
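These identities can be used to break a driver's total time into its parts. The following Python sketch simply encodes the four formulas above. The function names and the sample numbers are hypothetical, and all values are assumed to share one unit (microseconds here).

```python
from typing import Dict, Iterable


def input_empty_time(m: Dict[str, int]) -> int:
    # InputEmptyTime = FirstInputEmptyTime + FollowupInputEmptyTime
    return m["FirstInputEmptyTime"] + m["FollowupInputEmptyTime"]


def pending_time(m: Dict[str, int]) -> int:
    # PendingTime = InputEmptyTime + OutputFullTime
    #               + PreconditionBlockTime + PendingFinishTime
    return (
        input_empty_time(m)
        + m["OutputFullTime"]
        + m["PreconditionBlockTime"]
        + m["PendingFinishTime"]
    )


def active_time(operator_total_times: Iterable[int], overhead_time: int) -> int:
    # ActiveTime = sum of OperatorTotalTime + OverheadTime
    return sum(operator_total_times) + overhead_time


def driver_total_time(m: Dict[str, int], operator_total_times: Iterable[int]) -> int:
    # DriverTotalTime = ActiveTime + PendingTime + ScheduleTime
    return (
        active_time(operator_total_times, m["OverheadTime"])
        + pending_time(m)
        + m["ScheduleTime"]
    )


# Hypothetical driver, all values in microseconds.
driver = {
    "FirstInputEmptyTime": 400,
    "FollowupInputEmptyTime": 100,
    "OutputFullTime": 50,
    "PreconditionBlockTime": 0,
    "PendingFinishTime": 25,
    "OverheadTime": 30,
    "ScheduleTime": 45,
}
operator_times = [200, 120, 30]
```

In this made-up driver, PendingTime is 575 of a DriverTotalTime of 1000, and most of that is FirstInputEmptyTime. A breakdown like this suggests the driver is mostly waiting on upstream input rather than doing operator work.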
DegreeOfParallelism
Description: Degree of pipeline execution parallelism.
TotalDegreeOfParallelism
Description: Sum of degrees of parallelism. Since the same Pipeline may execute on multiple machines, this item aggregates all values.
DriverPrepareTime
Description: Time taken by the Prepare phase. This metric is not included in DriverTotalTime.
DriverTotalTime
Description: Total execution time of the Pipeline, excluding the time spent in the Prepare phase.
ActiveTime
Description: Execution time of the Pipeline, including the execution time of each operator and the overall framework overhead, such as time spent in invoking methods like has_output, need_input, etc.
PendingTime
Description: Time the Pipeline is blocked from being scheduled for various reasons.
InputEmptyTime
Description: Time the Pipeline is blocked due to an empty input queue.
FirstInputEmptyTime
Description: Duration of the first period in which the Pipeline is blocked due to an empty input queue. It is calculated separately because the first blocking is mainly caused by Pipeline dependencies.