Deploy on Kubernetes
The deployment of a compute-storage decoupled cluster on Kubernetes involves four main steps:
- Pre-deployment preparation.
- Deploying the Doris Operator.
- Deploying the compute-storage decoupled cluster.
- Creating the storage backend.
Step 1: Pre-deployment preparation
To deploy a compute-storage decoupled cluster on Kubernetes, you must deploy FoundationDB in advance. If FoundationDB runs on virtual machines (VMs), ensure that the VMs are reachable from services within the Kubernetes cluster. To deploy FoundationDB on VMs, refer to the "Pre-deployment Preparation" section of the compute-storage decoupling deployment guide. To deploy it on Kubernetes, follow the instructions in the FoundationDB on Kubernetes deployment guide.
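As a rough aid for the reachability requirement above, the following sketch lists the coordinator addresses contained in a FoundationDB connection string so that each one can be probed from inside the Kubernetes cluster. The function name `fdb_coordinators` and the variable `FDB_CLUSTER_STRING` are hypothetical, and the parsing assumes the usual `description:ID@host:port,host:port` form with IPv4 addresses or hostnames.

```shell
# List the coordinator addresses (host:port, one per line) from a FoundationDB
# connection string of the form description:ID@host:port,host:port,...
# Sketch only: handles IPv4 addresses and hostnames, and drops a trailing ":tls".
fdb_coordinators() {
  printf '%s\n' "${1#*@}" | tr ',' '\n' | sed 's/:tls$//'
}

# Example use from a pod inside the cluster (assumes nc is installed there and
# FDB_CLUSTER_STRING, a hypothetical variable, holds the connection string):
#   fdb_coordinators "$FDB_CLUSTER_STRING" |
#     while IFS=: read -r host port; do
#       nc -z -w 3 "$host" "$port" || echo "unreachable: $host:$port"
#     done
```

If any coordinator is reported as unreachable, check routing and firewall rules between the cluster nodes and the VMs before continuing.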
Step 2: Deploy the operator
- Create the resource definitions:
kubectl create -f https://raw.githubusercontent.com/apache/doris-operator/master/config/crd/bases/crds.yaml
- Deploy the Doris Operator and its associated RBAC rules:
kubectl apply -f https://raw.githubusercontent.com/apache/doris-operator/master/config/operator/disaggregated-operator.yaml
Expected Results:
kubectl -n doris get pods
NAME READY STATUS RESTARTS AGE
doris-operator-6b97df65c4-xwvw8 1/1 Running 0 19s
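Instead of polling the pod list by hand, a script can block until the operator has rolled out. The sketch below wraps `kubectl rollout status`; the Deployment name `doris-operator` is an assumption inferred from the pod name in the output above, and the function name `wait_for_operator` is hypothetical.

```shell
# Block until the operator Deployment has rolled out, or fail after the timeout.
# Assumption: the Deployment is named "doris-operator" in namespace "doris",
# inferred from the pod name shown above; adjust if your manifest differs.
wait_for_operator() {
  namespace=${1:-doris}
  deployment=${2:-doris-operator}
  kubectl -n "$namespace" rollout status "deployment/$deployment" --timeout=120s
}

# Usage:
#   wait_for_operator            # defaults: namespace doris, deployment doris-operator
#   wait_for_operator my-ns my-operator
```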
Step 3: Deploy the compute-storage decoupled cluster
- Download the example deployment configuration for the compute-storage decoupled cluster:
curl -O https://raw.githubusercontent.com/apache/doris-operator/master/doc/examples/disaggregated/cluster/ddc-sample.yaml
- Configure FoundationDB access information.
The compute-storage decoupled version of Doris uses FoundationDB to store metadata. The access details for FoundationDB can be provided in the DorisDisaggregatedCluster under spec.metaService.fdb in one of two ways: by directly specifying the access address or by using a ConfigMap that includes the access information.
- Direct Access Address Configuration
If FoundationDB is deployed outside of Kubernetes, you can specify its access address directly:
spec:
  metaService:
    fdb:
      address: ${fdbAddress}
Here, ${fdbAddress} refers to the client access address for FoundationDB. On Linux VMs, this is typically stored in /etc/foundationdb/fdb.cluster. For more details, refer to the FoundationDB cluster file documentation.
- Using a ConfigMap Containing Access Information
If FoundationDB is deployed using the fdb-kubernetes-operator, the operator automatically generates a ConfigMap containing the access details. The generated ConfigMap is named after the resource used for the deployment, with "-config" appended.
To obtain the ConfigMap, refer to the "Access Information" section in the FoundationDB on Kubernetes deployment guide. Once you have the ConfigMap name and namespace, configure the DorisDisaggregatedCluster as follows:
spec:
  metaService:
    fdb:
      configMapNamespaceName:
        name: ${foundationdbConfigMapName}
        namespace: ${namespace}
Replace ${foundationdbConfigMapName} with the name of the ConfigMap and ${namespace} with the namespace where it is located.
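The two ways of providing FoundationDB access information can be sketched as small helpers that print the corresponding `spec.metaService.fdb` fragment. This is an illustration under stated assumptions, not part of the operator: the function names are hypothetical, the cluster file is assumed to hold the connection string on its first line that is neither blank nor a comment, and the ConfigMap name follows the "-config" naming rule described above.

```shell
# Print the ConfigMap name generated for a FoundationDB resource:
# the resource name with "-config" appended.
fdb_configmap_name() {
  printf '%s-config\n' "$1"
}

# Print the fragment that references the generated ConfigMap.
# $1: FoundationDB resource name, $2: namespace of the ConfigMap
render_fdb_configmap_ref() {
  cat <<EOF
spec:
  metaService:
    fdb:
      configMapNamespaceName:
        name: $(fdb_configmap_name "$1")
        namespace: $2
EOF
}

# Print the fragment for a directly specified address.
# $1: path to a cluster file such as /etc/foundationdb/fdb.cluster
render_fdb_address() {
  # Take the first line that is neither blank nor a comment (defensive assumption).
  addr=$(grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | head -n 1)
  cat <<EOF
spec:
  metaService:
    fdb:
      address: $addr
EOF
}
```

For example, `render_fdb_address /etc/foundationdb/fdb.cluster` run on the FoundationDB VM prints a fragment that can be merged into ddc-sample.yaml.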
- Follow the instructions in the compute-storage decoupling Kubernetes deployment documentation to configure the metadata service (metaService configuration), the FE cluster specifications (FE cluster configuration), and the compute groups (compute group configuration). After completing the configuration, deploy the resources with the following command:
kubectl apply -f ddc-sample.yaml
Once the resources are applied, wait for the cluster to be automatically set up. The expected output is:
kubectl get ddc
NAME CLUSTERHEALTH FEPHASE CGCOUNT CGAVAILABLECOUNT CGFULLAVAILABLECOUNT
test-disaggregated-cluster green Ready 2 2 2
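Because the cluster is set up asynchronously, a deployment script may want to wait until the status matches the output above. The following sketch parses the printed columns of `kubectl get ddc`; treating "green", "Ready", and equal compute-group counts as the healthy state is an assumption based on that sample output, and the function name `ddc_ready` is hypothetical.

```shell
# Read `kubectl get ddc` output on stdin and succeed only if the named cluster
# reports CLUSTERHEALTH green, FEPHASE Ready, and CGCOUNT equal to
# CGFULLAVAILABLECOUNT (columns 2, 3, 4, and 6 of the sample output).
ddc_ready() {
  awk -v name="$1" '
    $1 == name { found = 1; ok = ($2 == "green" && $3 == "Ready" && $4 == $6) }
    END { exit !(found && ok) }
  '
}

# Usage: poll every 10 seconds until the cluster is ready.
#   until kubectl get ddc --no-headers | ddc_ready test-disaggregated-cluster; do
#     sleep 10
#   done
```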
Step 4: Create a remote storage backend
Once the compute-storage decoupled cluster is set up, you need to execute the appropriate CREATE STORAGE VAULT SQL statement through a client to create the storage backend for data persistence. You can enter the FE container and use the MySQL client to perform the creation.
- Get the service.
After the cluster is deployed, you can view the services exposed by the Doris Operator with the following command:
kubectl get svc
The output will be similar to:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
test-disaggregated-cluster-fe ClusterIP 10.96.147.97 <none> 8030/TCP,9020/TCP,9030/TCP,9010/TCP 15m
test-disaggregated-cluster-fe-internal ClusterIP None <none> 9030/TCP 15m
test-disaggregated-cluster-ms ClusterIP 10.96.169.8 <none> 5000/TCP 15m
test-disaggregated-cluster-cg1 ClusterIP 10.96.47.90 <none> 9060/TCP,8040/TCP,9050/TCP,8060/TCP 14m
test-disaggregated-cluster-cg2 ClusterIP 10.96.50.199 <none> 9060/TCP,8040/TCP,9050/TCP,8060/TCP 14m
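To pick out the service that clients should connect to, the listing above can be filtered for services that expose port 9030 (the port used by the MySQL client in this guide) and have a cluster IP. This is a minimal sketch that assumes ClusterIP services formatted as in the sample output; the function name `fe_query_services` is hypothetical.

```shell
# Read `kubectl get svc` output on stdin and print the names of services that
# expose 9030/TCP and have a cluster IP (headless services show "None").
fe_query_services() {
  awk '$3 != "None" && $5 ~ /(^|,)9030\/TCP(,|$)/ { print $1 }'
}

# Usage:
#   kubectl get svc --no-headers | fe_query_services
```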
- MySQL client access.
To create a pod with the MySQL client inside the Kubernetes cluster, use the following command:
kubectl run mysql-client --image=mysql:5.7 -it --rm --restart=Never -- /bin/bash
Within the pod, you can connect to the Doris cluster using the FE service name:
mysql -uroot -P9030 -h test-disaggregated-cluster-fe
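The short service name resolves only from pods in the same namespace as the service. If the MySQL client pod runs elsewhere, the fully qualified service name can be used instead. The sketch below builds it from the standard Kubernetes service DNS pattern; the function name `service_fqdn` is hypothetical, and the `default` namespace in the usage line is an assumption.

```shell
# Build the fully qualified DNS name of a Service so it resolves from any namespace.
# $1: service name, $2: namespace, $3: cluster domain (Kubernetes default: cluster.local)
service_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Usage inside the mysql-client pod (assumes the cluster runs in namespace "default"):
#   mysql -uroot -P9030 -h "$(service_fqdn test-disaggregated-cluster-fe default)"
```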
- Create the Storage Backend.
To create a storage backend using S3-compatible object storage, use the following example:
a. Create an S3 Storage Vault:
CREATE STORAGE VAULT IF NOT EXISTS s3_vault
PROPERTIES (
    "type" = "S3",
    "s3.endpoint" = "oss-cn-beijing.aliyuncs.com",
    "s3.region" = "bj",
    "s3.bucket" = "bucket",
    "s3.root.path" = "big/data/prefix",
    "s3.access_key" = "ak",
    "s3.secret_key" = "sk",
    "provider" = "OSS"
);
b. Set the Default Storage Vault:
SET s3_vault AS DEFAULT STORAGE VAULT;
The values in the statements above are illustrative only and will not work in a real environment. Refer to the Managing Storage Vaults section for instructions on creating a usable storage backend.
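To avoid hardcoding credentials, the two statements can be generated from environment variables and piped to the MySQL client. The sketch below is one possible approach, not a Doris tool: the function name and every variable name (`S3_ENDPOINT`, `VAULT_NAME`, and so on) are hypothetical, the property keys are copied from the example above, and no SQL escaping is applied to the values.

```shell
# Print the CREATE STORAGE VAULT and SET ... AS DEFAULT statements, filling in
# the property values from environment variables (hypothetical names).
render_vault_sql() {
  # Abort with an error if any required variable is unset or empty.
  : "${S3_ENDPOINT:?}" "${S3_REGION:?}" "${S3_BUCKET:?}" "${S3_ROOT_PATH:?}"
  : "${S3_ACCESS_KEY:?}" "${S3_SECRET_KEY:?}" "${S3_PROVIDER:?}"
  cat <<EOF
CREATE STORAGE VAULT IF NOT EXISTS ${VAULT_NAME:-s3_vault}
PROPERTIES (
    "type" = "S3",
    "s3.endpoint" = "${S3_ENDPOINT}",
    "s3.region" = "${S3_REGION}",
    "s3.bucket" = "${S3_BUCKET}",
    "s3.root.path" = "${S3_ROOT_PATH}",
    "s3.access_key" = "${S3_ACCESS_KEY}",
    "s3.secret_key" = "${S3_SECRET_KEY}",
    "provider" = "${S3_PROVIDER}"
);
SET ${VAULT_NAME:-s3_vault} AS DEFAULT STORAGE VAULT;
EOF
}

# Usage inside the mysql-client pod, after exporting the variables:
#   render_vault_sql | mysql -uroot -P9030 -h test-disaggregated-cluster-fe
```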