Kudu catalog
StarRocks supports Kudu catalogs from v3.3 onwards.
A Kudu catalog is a kind of external catalog that enables you to query data from Apache Kudu without ingestion.
You can also use INSERT INTO with a Kudu catalog to transform data from Kudu and load it directly into StarRocks.
To ensure successful SQL workloads on your Kudu cluster, your StarRocks cluster must be able to integrate with the following component:
- A metastore, such as your Kudu file system or Hive metastore
Usage notes
You can only use Kudu catalogs to query data. You cannot use Kudu catalogs to drop, delete, or insert data into your Kudu cluster.
Integration preparations
Before you create a Kudu catalog, make sure your StarRocks cluster can integrate with the storage system and metastore of your Kudu cluster.
NOTE
If a query returns an error indicating an unknown host, add the mapping between the host names and IP addresses of your Kudu cluster nodes to the /etc/hosts file.
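The mapping described in this note can be added by hand or with a small helper. The following is a minimal sketch, not an official StarRocks tool: the function name add_host_mapping, the IP address 192.0.2.10, and the host name kudu-master-1.example.com are all hypothetical placeholders. It appends a mapping to a hosts-format file only if the host name is not already mapped there.

```shell
# add_host_mapping FILE IP HOSTNAME
# Appends "IP HOSTNAME" to FILE unless HOSTNAME already appears as a
# host name on a non-comment line. All names here are illustrative.
add_host_mapping() {
  ahm_file="$1"
  ahm_ip="$2"
  ahm_host="$3"

  # Scan fields 2..NF of every non-comment line for an exact host name match.
  if awk -v h="$ahm_host" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit !found }
      ' "$ahm_file"; then
    return 0  # mapping already present, nothing to do
  fi

  printf '%s %s\n' "$ahm_ip" "$ahm_host" >> "$ahm_file"
}

# Example (requires root; placeholder values):
# add_host_mapping /etc/hosts 192.0.2.10 kudu-master-1.example.com
```

Run it once per Kudu node on every FE and BE host, replacing the placeholder address and host name with the real values for your cluster.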
Kerberos authentication
If Kerberos authentication is enabled for your Kudu cluster or Hive metastore, configure your StarRocks cluster as follows:
- Run the kinit -kt keytab_path principal command on each FE and each BE to obtain a Ticket Granting Ticket (TGT) from the Key Distribution Center (KDC). To run this command, you must have permission to access your Kudu cluster and Hive metastore. Note that the ticket obtained with this command is time-limited, so you need to use cron to run the command periodically.
- Add JAVA_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf" to the $FE_HOME/conf/fe.conf file of each FE and to the $BE_HOME/conf/be.conf file of each BE. In this example, /etc/krb5.conf is the path where the krb5.conf file is saved. You can modify the path based on your needs.
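The periodic kinit step can be wrapped in a small function and scheduled with cron. The sketch below is one possible approach, not a StarRocks-provided script: the function name renew_tgt, the keytab path, the principal, the script path in the crontab line, and the 6-hour interval are all assumptions to adapt to your environment and ticket lifetime.

```shell
# renew_tgt KEYTAB_PATH PRINCIPAL
# Obtains a fresh TGT with kinit, failing early if the keytab is unreadable.
# The function name and arguments are illustrative placeholders.
renew_tgt() {
  rt_keytab="$1"
  rt_principal="$2"

  if [ ! -r "$rt_keytab" ]; then
    echo "keytab not readable: $rt_keytab" >&2
    return 1
  fi

  # Same command as in the step above: kinit -kt keytab_path principal
  kinit -kt "$rt_keytab" "$rt_principal"
}

# Example crontab entry (assumed schedule: every 6 hours), for a script
# that defines renew_tgt and then calls it with your keytab and principal:
# 0 */6 * * * /path/to/renew_tgt.sh
```

Choose an interval shorter than the ticket lifetime configured on your KDC, and install the entry on every FE and BE host.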