Kettle Doris Plugin

The Kettle Doris Plugin writes data from other data sources into Doris from within Kettle.

The plug-in imports data through the Stream Load feature of Doris and must be used together with the Kettle service.
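As a rough illustration of what a Stream Load import targets, the sketch below builds the HTTP endpoint for one FE node. The helper name `stream_load_url`, the address `127.0.0.1:8030`, and the database and table names are placeholders chosen for this example. The plugin issues its own requests internally, and the headers it actually sends are not described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the Stream Load endpoint for one FE node.
# Arguments: FE http address (host:port), database name, table name.
stream_load_url() {
  local fe_node="$1" database="$2" table="$3"
  printf 'http://%s/api/%s/%s/_stream_load\n' "$fe_node" "$database" "$table"
}

# Manual import with curl for comparison (not executed here; it needs a
# reachable Doris FE, and every value below is a placeholder):
# curl --location-trusted -u "user:password" \
#   -H "Expect: 100-continue" -H "format: csv" -H "column_separator: ," \
#   -T data.csv -XPUT "$(stream_load_url 127.0.0.1:8030 example_db example_tbl)"
```

Calling `stream_load_url 127.0.0.1:8030 example_db example_tbl` prints the URL that the commented `curl` command would upload `data.csv` to.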

About Kettle

Kettle is an open-source ETL (Extract, Transform, Load) tool originally developed by Pentaho. It is one of the core components of the Pentaho product suite and is used mainly for data integration and data processing: extracting data from various sources, cleaning and transforming it, and loading it into the target system.

For more information, please refer to: https://pentaho.com/

User Manual

Download and install Kettle

Kettle download address: https://pentaho.com/download/#download-pentaho

After downloading, unzip the archive and run spoon.sh to start Kettle. You can also compile Kettle yourself; refer to the Compilation Chapter.

Compile Kettle Doris Plugin

cd doris/extension/kettle
mvn clean package -DskipTests

After compiling, unzip the plug-in package and copy it to the plugins directory of Kettle:

cd assemblies/plugin/target
unzip doris-stream-loader-plugins-9.4.0.0-343.zip
cp -r doris-stream-loader ${KETTLE_HOME}/plugins/
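To confirm that the copy step worked, a small check like the following can help. It is only a sketch: the function name `plugin_installed` is made up for this example, and it assumes the unzipped folder keeps the name `doris-stream-loader` used in the `cp` command above.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeeds if the plugin folder exists under the
# plugins directory of the given Kettle home.
plugin_installed() {
  local kettle_home="$1"
  [ -d "${kettle_home}/plugins/doris-stream-loader" ]
}

# Report the result for the current KETTLE_HOME (may be unset).
if plugin_installed "${KETTLE_HOME:-}"; then
  echo "doris-stream-loader found under ${KETTLE_HOME}/plugins"
else
  echo "doris-stream-loader not found under ${KETTLE_HOME:-<unset>}/plugins" >&2
fi
```

If the folder is reported as missing, check that `KETTLE_HOME` points at the directory that contains `spoon.sh`.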

Build a job

In Kettle, find Doris Stream Loader in the batch loading category and use it to build a job. (Screenshot: create_zh.png)

Click Start Running the Job to run the job and complete the data synchronization. (Screenshot: running_zh.png)

Parameter Description

| Key | Default Value | Required | Comment |
| --- | --- | --- | --- |
| Step name | -- | Y | Step name |
| fenodes | -- | Y | Doris FE HTTP address; multiple addresses are supported, separated by commas |
| Database | -- | Y | Doris database to write to |
| Target table | -- | Y | Doris table to write to |
| Username | -- | Y | Username for accessing Doris |
| Password | -- | N | Password for accessing Doris |
| Maximum number of rows for a single import | 10000 | N | Maximum number of rows for a single import |
| Maximum bytes for a single import | 10485760 (10MB) | N | Maximum byte size for a single import |
| Number of import retries | 3 | N | Number of retries after an import failure |
| StreamLoad properties | -- | N | Stream Load request headers |
| Delete Mode | N | N | Whether to enable delete mode. By default, Stream Load performs insert operations. After delete mode is enabled, all Stream Load writes are delete operations. |
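One way to read the two size limits is that a batch is sent as soon as either limit is reached. The sketch below illustrates that reading with the default values from the table. The function `should_flush` is hypothetical, and the flush logic inside the plugin is an assumption here and may differ.

```shell
#!/usr/bin/env bash
# Defaults taken from the parameter table.
MAX_ROWS=10000
MAX_BYTES=10485760

# Hypothetical check: succeeds (meaning "flush now") once the buffered
# row count or the buffered byte count reaches its limit.
should_flush() {
  local buffered_rows="$1" buffered_bytes="$2"
  [ "$buffered_rows" -ge "$MAX_ROWS" ] || [ "$buffered_bytes" -ge "$MAX_BYTES" ]
}

# Example: 10000 small rows hit the row limit long before the byte limit.
if should_flush 10000 512000; then
  echo "flush: a limit was reached"
fi
```

Under this reading, wide rows would mostly trigger imports through the byte limit and narrow rows through the row limit, so both settings are worth tuning together.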