Load data using Spark connector (recommended)
StarRocks provides its own connector, StarRocks Connector for Apache Spark™ (Spark connector for short), to help you load data into a StarRocks table by using Spark. The connector accumulates data and then loads it into StarRocks in a single batch through STREAM LOAD. It is implemented on top of Spark DataSource V2, so you can create a DataSource with either Spark DataFrames or Spark SQL. Both batch and Structured Streaming modes are supported.
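As a rough illustration of the DataFrame path described above, the following PySpark sketch builds the connector options and appends a DataFrame in batch mode. The helper names `starrocks_write_options` and `write_to_starrocks` are hypothetical, the default ports (8030 for FE HTTP, 9030 for FE query) are assumptions about a typical deployment, and the `starrocks.*` option keys are written as recalled from the connector documentation, so check them against the connector version you use.

```python
def starrocks_write_options(fe_host, database, table, user, password,
                            http_port=8030, query_port=9030):
    """Build the option map for the Spark connector.

    Option keys are assumed from the connector documentation;
    the default ports are assumptions about a typical FE setup.
    """
    return {
        "starrocks.fe.http.url": f"{fe_host}:{http_port}",
        "starrocks.fe.jdbc.url": f"jdbc:mysql://{fe_host}:{query_port}",
        "starrocks.table.identifier": f"{database}.{table}",
        "starrocks.user": user,
        "starrocks.password": password,
    }


def write_to_starrocks(df, options):
    """Append a Spark DataFrame to a StarRocks table in batch mode.

    `df` is an existing pyspark.sql.DataFrame whose columns match the
    target table; the connector JAR must be on the Spark classpath.
    """
    df.write.format("starrocks").options(**options).mode("append").save()
```

With a running SparkSession and the connector JAR available, a call such as `write_to_starrocks(df, starrocks_write_options("127.0.0.1", "test", "score_board", "root", ""))` would hand the rows to the connector; the host, database, and table names here are placeholders.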
Load data in bulk using Spark Load
Spark Load uses external Apache Spark™ resources to preprocess the data to be imported, which improves import performance and saves compute resources. It is mainly used for initial migration and for importing large volumes of data into StarRocks (up to the TB level).