Deploy AMS On Kubernetes

Requirements

Before deploying AMS on Kubernetes, you should be familiar with the following.

Amoro Official Docker Image

You can find the official docker image at Amoro Docker Hub.

The following are images that can be used in a production environment.

apache/amoro

This is an image built based on the Amoro binary distribution package for deploying AMS.

apache/amoro-flink-optimizer

This is an image built based on the official version of Flink for deploying the Flink optimizer.

apache/amoro-spark-optimizer

This is an image built based on the official version of Spark for deploying the Spark optimizer.

Build AMS Docker Image

To build the images locally, use the build.sh script in the docker folder of the project. To build the AMS image, run the following command:

./docker/build.sh amoro

or build the amoro-flink-optimizer image by:

./docker/build.sh amoro-flink-optimizer --flink-version <flink-version>

or build the amoro-spark-optimizer image by:

./docker/build.sh amoro-spark-optimizer --spark-version <spark-version>

Get Helm Charts

You can obtain the latest official release chart by adding the official Helm repository.

$ helm repo add amoro https://netease.github.io/amoro/charts
$ helm search repo amoro 
NAME           CHART VERSION    APP VERSION        DESCRIPTION           
amoro/amoro    0.1.0            0.7.1              A Helm chart for Amoro 

$ helm pull amoro/amoro 
$ tar zxvf amoro-*.tgz

Alternatively, you can get the latest charts directly from the GitHub source code.

$ git clone https://github.com/apache/amoro.git
$ cd amoro/charts
$ helm dependency build ./amoro

Install

When you are ready, install the chart with the following helm command:

$ helm install <deployment-name> ./amoro 

After a successful installation, you can reach the web UI by forwarding the service port with the following command.

$ kubectl port-forward services/<deployment-name>-amoro-rest 1630:1630

Then open http://localhost:1630 in your browser.

Access logs

List the pods, then use the pod name to get its logs:

$ kubectl get pod
$ kubectl logs <amoro-pod-name>

Uninstall

$ helm uninstall <deployment-name>

Configure the Helm Application

Helm uses the <chart>/values.yaml file as the configuration of a chart. You can copy this file and maintain your own settings separately.

$ cp amoro/values.yaml my-values.yaml
$ vim my-values.yaml

Then deploy the Helm application with your own configuration file:

$ helm install <deployment-name> ./amoro -f my-values.yaml

Enable Ingress

Ingress is not enabled by default. In production environments, it is recommended to enable Ingress to access the AMS Dashboard from outside the cluster.

ingress:
  enabled: true
  ingressClassName: "nginx"
  hostname: minikube.amoro.com

Configure the Database

By default, AMS uses a Derby database for storage, so the data is lost when the pod is destroyed. In production environments, we recommend using MySQL as the storage for system data.

amoroConf: 
  database:
    type: mysql
    driver: com.mysql.cj.jdbc.Driver
    url: <jdbc-uri>
    username: <mysql-user>
    password: <mysql-password>
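
As a rough sketch of how the <jdbc-uri> placeholder might be filled in, the script below assembles a MySQL JDBC URL and writes it into a small override file. The host name, port, database name, and the file name db-values.yaml are all hypothetical; the user name and password are left as placeholders.

```shell
#!/bin/sh
# Hypothetical connection details; replace them with your own MySQL endpoint.
MYSQL_HOST="mysql.example.svc.cluster.local"
MYSQL_PORT="3306"
MYSQL_DB="amoro"

# MySQL Connector/J URLs have the form jdbc:mysql://<host>:<port>/<database>.
JDBC_URL="jdbc:mysql://${MYSQL_HOST}:${MYSQL_PORT}/${MYSQL_DB}"

# The heredoc delimiter is unquoted so that ${JDBC_URL} is expanded.
cat > db-values.yaml <<EOF
amoroConf:
  database:
    type: mysql
    driver: com.mysql.cj.jdbc.Driver
    url: "${JDBC_URL}"
    username: "<mysql-user>"
    password: "<mysql-password>"
EOF

# Print the generated file for review.
cat db-values.yaml
```

The generated file can be passed to helm install with -f in the same way as my-values.yaml.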

Configure the Images

By default, the Helm chart deploys images with the latest tag. If you need to change the image address, for example to use a private repository or an image you built yourself, modify the image section:

image:
  repository: <your repository>
  pullPolicy: IfNotPresent
  tag: <your tag>
imagePullSecrets: [ ]

By default, the Flink Optimizer Container is enabled. You can modify the container configuration by changing the optimizer.flink section.

optimizer: 
  flink: 
    enabled: true
    ## container name, default is flink
    name: ~ 
    image:
      ## the image repository
      repository: apache/amoro-flink-optimizer
      ## the image tag; if not set, it defaults to the amoro image tag.
      tag: ~
      pullPolicy: IfNotPresent
      ## the location of flink optimizer jar in image.
      jobUri: "local:///opt/flink/usrlib/optimizer-job.jar"
    properties: {
      "flink-conf.taskmanager.memory.managed.size": "32mb",
      "flink-conf.taskmanager.memory.network.max": "32mb",
      "flink-conf.taskmanager.memory.network.min": "32mb"
    }
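
Because Helm merges nested maps from a file passed with -f over the chart defaults, an override file normally only needs the keys that change. The sketch below, which assumes the chart treats optimizer.flink.properties as an ordinary map, writes a hypothetical optimizer-values.yaml that changes a single Flink property; the value 64mb is purely illustrative.

```shell
#!/bin/sh
# Write an override file that contains only the changed key.
# The quoted 'EOF' delimiter keeps the content literal (no shell expansion).
cat > optimizer-values.yaml <<'EOF'
optimizer:
  flink:
    enabled: true
    properties:
      # Illustrative value; the chart default shown above is 32mb.
      "flink-conf.taskmanager.memory.managed.size": "64mb"
EOF

# Print the generated file for review.
cat optimizer-values.yaml
```

The two network memory properties are not listed here, so they would be expected to keep their default values after the merge.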

Configure the RBAC

By default, the Helm chart creates a service account, role, and role binding for the Amoro deployment. You can also change this configuration to use an existing account.

# ServiceAccount of Amoro to schedule optimizer.
serviceAccount:
  # Specifies whether a service account should be created or an existing account should be used
  create: true
  # Annotations to add to the service account
  annotations: { }
  # Specifies ServiceAccount name to be used if `create: false`
  name: ~
  # If `serviceAccount.create` is true, a role and role binding are created as configured below
  rbac:
    # Create a role and role binding for the automatically created service account
    create: true
    # Create a cluster-role and cluster-role binding instead if `cluster` is true
    cluster: false

Notes:

  • If serviceAccount.create is false, you must provide a serviceAccount.name and create the serviceAccount beforehand.
  • If serviceAccount.rbac.create is false, the role and role binding will not be created automatically.
  • You can set serviceAccount.rbac.cluster to true, which will create a cluster-role and cluster-role binding instead of a role and role binding.

By default, this serviceAccount is used to create the flink-optimizer. Therefore, if you need to schedule the flink-optimizer across namespaces, create a cluster-role or use a serviceAccount that you created yourself.
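
For the cross-namespace case, one possible override is sketched below. It only uses keys from the serviceAccount section shown earlier; the file name rbac-values.yaml is hypothetical.

```shell
#!/bin/sh
# Write an override file that asks the chart for cluster-scoped RBAC objects.
# The quoted 'EOF' delimiter keeps the content literal (no shell expansion).
cat > rbac-values.yaml <<'EOF'
serviceAccount:
  create: true
  rbac:
    create: true
    # Request a cluster-role and cluster-role binding instead of a namespaced role
    cluster: true
EOF

# Print the generated file for review.
cat rbac-values.yaml
```

To use an account you manage yourself instead, set create to false and put the name of the existing account in serviceAccount.name, as described in the notes.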