Skip to main content
Version: Candidate-3.4

Feature Support: Shared-data Clusters

tip

Each of the features below lists the version number that they were added in. If you are deploying a new cluster please deploy the latest patch release of version 3.2 or higher.

Overview

Shared-data StarRocks clusters feature a disaggregated storage and compute architecture. This allows data to be stored in remote storage, leading to lower storage costs, optimized resource isolation, and better service elasticity compared to a shared-nothing cluster.

This document outlines the feature support for shared-data clusters, covering deployment methods, storage configurations, caching mechanisms, Compaction, Primary Key table functionalities, and performance test results.

Deployment

Shared-data clusters support deployments on physical/virtual machines and on Kubernetes via Operator.

Both deployment solutions have the following limitations:

  • Mixed deployment of shared-nothing and shared-data mode is not supported.
  • The transformation from a shared-nothing cluster to a shared-data cluster or vice versa is not supported.
  • Heterogeneous deployments are not supported, meaning the hardware specifications of all CN nodes within a cluster must be the same.

StarRocks Kubernetes Operator

StarRocks offers the StarRocks Kubernetes Operator for shared-data deployment on Kubernetes.

You can scale shared-data clusters by the following methods:

  • Manual operations.
  • Automatic scaling using Kubernetes HPA (Horizontal Pod Autoscaler) strategies.

Storage

Shared-data clusters support building storage volumes on HDFS and object storage.

HDFS

Location

StarRocks supports the following locations for HDFS storage volume:

  • HDFS: hdfs://<host>:<port>/

    NOTE

    From v3.2, storage volumes support HDFS clusters with the NameNode HA mode enabled.

  • WebHDFS (Supported from v3.2): webhdfs://<host>:<http_port>/

  • ViewFS (Supported from v3.2): viewfs://<ViewFS_cluster>/

Authentication

StarRocks supports the following authentication methods for HDFS storage volume:

  • Basic

  • Username (Supported from v3.2)

  • Kerberos Ticket Cache (Supported from v3.2)

    NOTE

    StarRocks does not support automatic ticket refresh. You need to set up crontab tasks to refresh the ticket.

Authentication using Kerberos Keytab and Principal ID is not yet supported.

Usage notes

StarRocks supports storage volumes on HDFS and object storage. However, only one HDFS storage volume is allowed in each StarRocks instance. Creating multiple HDFS storage volumes may cause unknown behaviors of StarRocks.

Object storage

Location

StarRocks supports the following object storage services for storage volumes:

  • S3-compatible object storage services: s3://<s3_path>
    • AWS S3
    • GCS, OSS, OBS, COS, TOS, KS3, MinIO, and Ceph S3
  • Azure Blob Storage (Supported from v3.1.1): azblob://<azblob_path>

Authentication

StarRocks supports the following authentication methods for different object storage services:

  • AWS S3
    • AWS SDK
    • IAM user-based Credential
    • Instance Profile
    • Assumed Role
  • GCS, OSS, OBS, COS, TOS, KS3, MinIO, and Ceph S3
    • Access Key pair
  • Azure Blob Storage
    • Shared Key
    • Shared Access Signatures (SAS)

Partitioned Prefix

From v3.2.4, StarRocks supports creating storage volumes with the Partitioned Prefix feature for S3-compatible object storage systems. When this feature is enabled, StarRocks distributes the data into multiple partitions (sub-paths) under the bucket. It can easily multiply StarRocks' read and write performance on data files stored in the bucket.

Storage volumes

  • From v3.1.0 onwards, storage volumes can be created using the CREATE STORAGE VOLUME statement, and this method is recommended in later versions.
  • The internal catalog default_catalog in shared-data clusters uses the default storage volume for data persistence. You can assign different storage volumes for databases and tables in default_catalog by setting the property storage_volume. If not configured, the property storage_volume is inherited in the order of catalog, database, and table.
  • Currently, storage volumes can be used only for storing data in cloud-native tables. Future support will include external storage management, data loading, and backup capabilities.

Cache

Cache types

File Cache

File Cache was the initial caching mechanism introduced along with the shared-data cluster. It loads the cache at the segment file level. File Cache is not recommended in v3.1.7, v3.2.3, and later versions.

Data Cache

Data Cache is supported from v3.1.7 and v3.2.3 onwards to replace File Cache in earlier versions. Data Cache loads data from remote storage in blocks (on the order of MBs) on demand, without needing to load the entire file. It is recommended in the later versions and enabled by default in v3.2.3 and later.

Data Cache Warmup

StarRocks v3.3.0 introduces the Data Cache Warmup feature to accelerate queries in data lakes and shared-data clusters. Data Cache Warmup is an active process of populating the cache. By executing CACHE SELECT, you can proactively fetch the desired data from remote storage in advance.

Configurations

  • Table properties:
    • datacache.enable: Whether to enable the local disk cache. Default: true.
    • datacache.partition_duration: The validity duration of the cached data.
  • BE configurations:
    • starlet_use_star_cache: Whether to enable Data Cache.
    • starlet_star_cache_disk_size_percent: The percentage of disk capacity that Data Cache can use at most in a shared-data cluster.

Capabilities

  • Data loading generates a local cache, whose eviction is only managed by the cache capacity control mechanism instead of partition_duration.
  • StarRocks supports setting up regular tasks for Data Cache Warmup.

Limitations

  • StarRocks does not support multiple replicas for cached data.

Compaction

Observability

Partition-level Compaction status

From v3.1.9 onwards, you can view the Compaction status of partitions by querying information_schema.partitions_meta.

We recommend monitoring the following key metrics:

  • AvgCS: Average Compaction score of all tablets in the partition.
  • MaxCS: Maximum Compaction score among all tablets in the partition.

Compaction task status

From v3.2.0 onwards, you can view the status and progress of Compaction tasks by querying information_schema.be_cloud_native_compactions.

We recommend monitoring the following key metrics:

  • PROGRESS: Current Compaction progress (in percentage) of the tablet.
  • STATUS: The status of the compaction task. If any error occurs, detailed error messages will be returned in this field.

Cancelling Compaction tasks

You can cancel specific compaction tasks using the CANCEL COMPACTION statement.

Example:

CANCEL COMPACTION WHERE TXN_ID = 123;

NOTE

The CANCEL COMPACTION statement must be executed on the Leader FE node.

Manual Compaction

From v3.1, StarRocks offers a SQL statement for manual Compaction. You can specify the table or partitions for compaction. For more information, refer to Manual Compaction.

Primary Key tables

The following table lists the major features of Primary Key tables and their support status in shared-data clusters:

FeatureSupported Version(s)Description
Primary Key tablesv3.1.0
Primary Key index persistencev3.2.0
v3.1.3
  • Currently, shared-data clusters support Primary Key index persistence on local disks.
  • Persistence in remote storage will be supported in future releases.
Partial Updatev3.1.0Shared-data clusters support Partial Update in Row mode from v3.1.0 onwards and in Column mode from v3.3.1 onwards.
Conditional Updatev3.1.0Currently, the condition only supports 'Greater'.
Hybrid row-column storageTo be supported in future releases.

Query performance

The following test compares the query performance of a shared-data cluster with Data Cache disabled, one with Data Cache enabled, one that queries the dataset in Hive, and a shared-nothing cluster.

Hardware Specifications

The cluster used in the test includes one FE node and five CN/BE nodes. The hardware specifications are as follows:

VM providerAlibaba Cloud ECS
FE node8 Core 32 GB Memory
CN/BE node8 Core 64 GB Memory
Network bandwidth8 Gbits/s
DiskESSD

Software version

StarRocks v3.3.0

Dataset

SSB 1TB dataset

note

The dataset and queries used in this comparison are from the Star Schema Benchmark.

Test Results

The following table shows the performance test results on thirteen queries and the sum of each cluster. The unit of query latency is milliseconds (ms).

QueryShared-data Without Data CacheShared-data With Data CacheHive Catalog Without Data CacheShared-nothing
Q01274285896523555
Q02271470486383183
Q03190865881632980
Q04311358582346047997
Q05265977806291836794
Q06216437147244015602
Q0735271154903890419530
Q08248187368275986984
Q09210566667235875687
Q102823912166633942
Q1150027189475299719636
Q12103004919361468136
Q1373783386231536380
SUM23841283444333689100406

Conclusion

  • The query performance of the shared-data cluster with Data Cache disabled and Parallel Scan and I/O merge optimization enabled is 1.4 times that of the cluster that queries Hive data.
  • The query performance of the shared-data cluster with Data Cache enabled and Parallel Scan and I/O merge optimization enabled is 1.2 times that of the shared-nothing cluster.

Other features to be supported

  • Full-text inverted index
  • Hybrid row-column storage
  • Global dictionary object
  • Generated column
  • Backup and restore