Victoria Metrics Operator

Prerequisites

  • Install the following packages: git, kubectl, helm, helm-docs. See this tutorial.
  • PV support on underlying infrastructure.

ArgoCD issues

When running the operator with ArgoCD and without cert-manager (.Values.admissionWebhooks.certManager.enabled: false), the webhook certificates are re-rendered on each sync because ArgoCD does not respect the Helm lookup function. To prevent this, update your operator Application's spec.syncPolicy and spec.ignoreDifferences with the following:

apiVersion: argoproj.io/v1alpha1
kind: Application
...
spec:
  ...
  destination:
    ...
    namespace: <operator-namespace>
  ...
  syncPolicy:
    syncOptions:
    # https://argo-cd.readthedocs.io/en/stable/user-guide/sync-options/#respect-ignore-difference-configs
    # ArgoCD must also ignore the difference during the apply stage,
    # otherwise it will silently override the changes and cause problems
    - RespectIgnoreDifferences=true
  ignoreDifferences:
    - group: ""
      kind: Secret
      name: <fullname>-validation
      namespace: <operator-namespace>
      jsonPointers:
        - /data
    - group: admissionregistration.k8s.io
      kind: ValidatingWebhookConfiguration
      name: <fullname>-admission
      jqPathExpressions:
      - '.webhooks[]?.clientConfig.caBundle'

where <fullname> is the output of {{ include "vm-operator.fullname" }} for your setup.

Upgrade guide

An issue with Helm CRD management was discovered during a release. To upgrade from a version lower than 0.1.3, you have two options:

  1. Use Helm management for CRDs, which is enabled by default.
  2. Use your own management system; in that case add the variable --set createCRD=false.

If you choose Helm management, complete the following steps before upgrading:

  1. Define the namespace and Helm release name variables:

export NAMESPACE=default
export RELEASE_NAME=operator

  2. Execute the kubectl commands:

kubectl get crd | grep victoriametrics.com | awk '{print $1}' | xargs -I {} kubectl label crd {} app.kubernetes.io/managed-by=Helm --overwrite
kubectl get crd | grep victoriametrics.com | awk '{print $1}' | xargs -I {} kubectl annotate crd {} meta.helm.sh/release-namespace="$NAMESPACE" meta.helm.sh/release-name="$RELEASE_NAME" --overwrite

  3. Run the helm upgrade command.
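
As a rough check after the kubectl commands above, each VictoriaMetrics CRD should carry metadata similar to the excerpt below. This is an illustrative sketch: the values assume NAMESPACE=default and RELEASE_NAME=operator from the export step, and <crd-name> is a placeholder for whatever name `kubectl get crd` reports.

```yaml
# Excerpt of a CRD's metadata after labeling and annotating (illustrative)
metadata:
  name: <crd-name>                               # placeholder, not a real CRD name
  labels:
    app.kubernetes.io/managed-by: Helm           # set by the kubectl label command
  annotations:
    meta.helm.sh/release-name: operator          # value of $RELEASE_NAME
    meta.helm.sh/release-namespace: default      # value of $NAMESPACE
```

Running `kubectl get crd <crd-name> -o yaml` prints the live object for comparison.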

Chart Details

This chart will do the following:

  • Roll out the VictoriaMetrics operator

How to install

Access a Kubernetes cluster.

Set up the chart repository (can be omitted for OCI repositories)

Add the Helm chart repository with the following commands:

helm repo add vm https://victoriametrics.github.io/helm-charts/

helm repo update

List the versions of the vm/victoria-metrics-operator chart available for installation:

helm search repo vm/victoria-metrics-operator -l

Install victoria-metrics-operator chart

Export the default values of the victoria-metrics-operator chart to the file values.yaml:

  • For HTTPS repository

    helm show values vm/victoria-metrics-operator > values.yaml
  • For OCI repository

    helm show values oci://ghcr.io/victoriametrics/helm-charts/victoria-metrics-operator > values.yaml

Change the values in the values.yaml file according to the needs of your environment.
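
A trimmed values.yaml might look like the sketch below. Helm merges it over the chart defaults, so only the keys that differ need to be listed. The keys come from the Parameters section of this document; the label and the resource sizes are hypothetical and should be adapted.

```yaml
# values.yaml: override only what differs from the chart defaults
logLevel: error          # chart default is info
replicaCount: 1          # chart default, shown for clarity
extraLabels:
  team: observability    # hypothetical label added to all resources
resources:               # hypothetical sizing; the chart default is {}
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    memory: 256Mi
```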

Test the installation with the command:

  • For HTTPS repository

    helm install vmo vm/victoria-metrics-operator -f values.yaml -n NAMESPACE --debug --dry-run
  • For OCI repository

    helm install vmo oci://ghcr.io/victoriametrics/helm-charts/victoria-metrics-operator -f values.yaml -n NAMESPACE --debug --dry-run

Install the chart with the command:

  • For HTTPS repository

    helm install vmo vm/victoria-metrics-operator -f values.yaml -n NAMESPACE
  • For OCI repository

    helm install vmo oci://ghcr.io/victoriametrics/helm-charts/victoria-metrics-operator -f values.yaml -n NAMESPACE

List the pods by running this command:

kubectl get pods -A | grep 'vmo'

Get the application by running this command:

helm list -f vmo -n NAMESPACE

See the version history of the vmo application with the command:

helm history vmo -n NAMESPACE

Validation webhook

It is possible to validate created resources with the operator. For now, cert-manager (https://cert-manager.io/docs/) is needed for easy certificate management.

admissionWebhooks:
  enabled: true
  # what to do when the operator is not available to validate a request
  policy: Fail
  certManager:
    # enables cert creation and injection by cert-manager
    enabled: true

How to uninstall

Remove the application with the command:

helm uninstall vmo -n NAMESPACE

Documentation of Helm Chart

Install helm-docs following the instructions in this tutorial.

Generate docs with helm-docs command.

cd charts/victoria-metrics-operator

helm-docs

The markdown generation is entirely go template driven. The tool parses metadata from charts and generates a number of sub-templates that can be referenced in a template file (by default README.md.gotmpl). If no template file is provided, the tool has a default internal template that will generate a reasonably formatted README.
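
For illustration, helm-docs picks up parameter descriptions from comments that start with `# --` and sit directly above a key in values.yaml. The sketch below reuses two descriptions from the Parameters section; it is not a copy of the chart's actual values.yaml.

```yaml
# -- Number of operator replicas
replicaCount: 1

# -- VM operator log level. Possible values: info and error.
logLevel: info
```

After editing such comments, rerunning helm-docs regenerates the parameter table in the README.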

Disabling automatic ServiceAccount token mount

There are cases when it is required to disable automatic ServiceAccount token mount due to hardening reasons. To disable it, set the following values:

serviceAccount:
  automountServiceAccountToken: false

extraVolumes:
  - name: operator
    projected:
      sources:
        - downwardAPI:
            items:
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
                path: namespace
        - configMap:
            name: kube-root-ca.crt
        - serviceAccountToken:
            expirationSeconds: 7200
            path: token

extraVolumeMounts:
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: operator

This configuration disables the automatic ServiceAccount token mount and mounts the token explicitly.

Enable hostNetwork on operator

When running managed Kubernetes such as EKS with a custom CNI solution like Cilium or Calico, the EKS control plane cannot communicate with the CNI's pod CIDR. In that scenario, the webhook service (that is, the operator) must run with hostNetwork so that it shares the node's network namespace.

hostNetwork: true

Parameters

The following table lists the configurable parameters of the chart and their default values.

Change the values in the victoria-metrics-operator/values.yaml file according to the needs of your environment.

Key Type Default Description
admissionWebhooks object
certManager:
    enabled: false
    issuer: {}
enabled: true
enabledCRDValidation:
    vlogs: true
    vmagent: true
    vmalert: true
    vmalertmanager: true
    vmalertmanagerconfig: true
    vmauth: true
    vmcluster: true
    vmrule: true
    vmsingle: true
    vmuser: true
keepTLSSecret: true
policy: Fail
tls:
    caCert: null
    cert: null
    key: null

Configures resource validation

admissionWebhooks.certManager object
enabled: false
issuer: {}

Enables a custom CA bundle if you are not using cert-manager. With a custom CA, you have to create a secret named {chart-name}-validation with the keys tls.key, tls.crt, and ca.crt.

admissionWebhooks.certManager.enabled bool
false

Enables cert creation and injection by cert-manager.

admissionWebhooks.certManager.issuer object
{}

If needed, provide your own issuer. The operator creates a self-signed issuer if this is empty.

admissionWebhooks.enabled bool
true

Enables validation webhook.

admissionWebhooks.policy string
Fail

What to do when the operator is not available to validate a request.

affinity object
{}

Pod affinity

annotations object
{}

Annotations to be added to all resources

crds.cleanup.enabled bool
false

Tells helm to clean up all the vm resources under this release’s namespace when uninstalling

crds.cleanup.image object
pullPolicy: IfNotPresent
repository: bitnami/kubectl
tag: ""

Image configuration for CRD cleanup Job

crds.cleanup.resources object
limits:
    cpu: 500m
    memory: 256Mi
requests:
    cpu: 100m
    memory: 56Mi

Cleanup hook resources

crds.enabled bool
true

Manages CRD creation. Disables CRD creation only in combination with crds.plain: false, due to a limitation of Helm dependency conditions.

crds.plain bool
false

Selects whether plain or templated CRDs are created. With this option set to false, all CRDs are rendered from templates. With this option set to true, all CRDs are immutable and require a manual upgrade.

env list
[]

Extra settings for the operator deployment. Full list here

envFrom list
[]

Specify alternative source for env variables

extraArgs object
{}

Operator container additional commandline arguments

extraContainers list
[]

Extra containers to run in a pod with operator

extraHostPathMounts list
[]

Additional hostPath mounts

extraLabels object
{}

Labels to be added to all resources

extraObjects list
[]

Add extra specs dynamically to this chart

extraVolumeMounts list
[]

Extra Volume Mounts for the container

extraVolumes list
[]

Extra Volumes for the pod

fullnameOverride string
""

Overrides the full name of server component resources

global.cluster.dnsDomain string
cluster.local.

K8s cluster domain suffix, used for building storage pods’ FQDN. Details are here

global.compatibility object
openshift:
    adaptSecurityContext: auto

Openshift security context compatibility configuration

global.image.registry string
""

Image registry, that can be shared across multiple helm charts

global.imagePullSecrets list
[]

Image pull secrets, that can be shared across multiple helm charts

hostNetwork bool
false

Enable hostNetwork on operator deployment

image object
pullPolicy: IfNotPresent
registry: ""
repository: victoriametrics/operator
tag: ""
variant: ""

Operator image configuration

image.pullPolicy string
IfNotPresent

Image pull policy

image.registry string
""

Image registry

image.repository string
victoriametrics/operator

Image repository

image.tag string
""

Image tag; overrides Chart.AppVersion

imagePullSecrets list
[]

Secret to pull images

lifecycle object
{}

Operator lifecycle. See this article for details.

logLevel string
info

VM operator log level. Possible values: info and error.

nameOverride string
""

Override chart name

nodeSelector object
{}

Pod’s node selector. Details are here

operator.disable_prometheus_converter bool
false

By default, operator converts prometheus-operator objects.

operator.enable_converter_ownership bool
false

Enables ownership references for converted prometheus-operator objects. When a Prometheus object is deleted, the corresponding VictoriaMetrics object is removed as well.

operator.prometheus_converter_add_argocd_ignore_annotations bool
false

Adds compare-options and sync-options to Prometheus objects converted by the operator so that they work properly with ArgoCD

operator.useCustomConfigReloader bool
false

Enables custom config-reloader, bundled with operator. It should reduce vmagent and vmauth config sync-time and make it predictable.

podDisruptionBudget object
enabled: false
labels: {}

See kubectl explain poddisruptionbudget.spec for more or check these docs

podLabels object
{}

Extra labels for pods only

podSecurityContext object
enabled: true

Pod’s security context. Details are here

probe.liveness object
failureThreshold: 3
initialDelaySeconds: 5
periodSeconds: 15
tcpSocket:
    port: probe
timeoutSeconds: 5

Liveness probe

probe.readiness object
failureThreshold: 3
httpGet:
    port: probe
initialDelaySeconds: 5
periodSeconds: 15
timeoutSeconds: 5

Readiness probe

probe.startup object
{}

Startup probe

rbac.aggregatedClusterRoles object
enabled: true
labels:
    admin:
        rbac.authorization.k8s.io/aggregate-to-admin: "true"
    view:
        rbac.authorization.k8s.io/aggregate-to-view: "true"

Create aggregated clusterRoles for CRD readonly and admin permissions

rbac.aggregatedClusterRoles.labels object
admin:
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
view:
    rbac.authorization.k8s.io/aggregate-to-view: "true"

Labels attached to the corresponding clusterRole

rbac.create bool
true

Specifies whether the RBAC resources should be created

replicaCount int
1

Number of operator replicas

resources object
{}

Resource object

securityContext object
enabled: true

Security context to be added to server pods

service.annotations object
{}

Service annotations

service.clusterIP string
""

Service ClusterIP

service.externalIPs string
""

Service external IPs. Check here for details

service.externalTrafficPolicy string
""

Service external traffic policy. Check here for details

service.healthCheckNodePort string
""

Health check node port for a service. Check here for details

service.ipFamilies list
[]

List of service IP families. Check here for details.

service.ipFamilyPolicy string
""

Service IP family policy. Check here for details.

service.labels object
{}

Service labels

service.loadBalancerIP string
""

Service load balancer IP

service.loadBalancerSourceRanges list
[]

Load balancer source range

service.servicePort int
8080

Service port

service.type string
ClusterIP

Service type

service.webhookPort int
9443

Service webhook port

serviceAccount.automountServiceAccountToken bool
true

Whether to automount the service account token. Note that token needs to be mounted manually if this is disabled.

serviceAccount.create bool
true

Specifies whether a service account should be created

serviceAccount.name string
""

The name of the service account to use. If not set and create is true, a name is generated using the fullname template

serviceMonitor object
annotations: {}
basicAuth: {}
enabled: false
extraLabels: {}
interval: ""
relabelings: []
scheme: ""
scrapeTimeout: ""
tlsConfig: {}

Configures monitoring with serviceScrape. VMServiceScrape must be pre-installed

terminationGracePeriodSeconds int
30

Graceful pod termination timeout. See this article for details.

tolerations list
[]

Array of toleration objects. Spec is here

topologySpreadConstraints list
[]

Pod Topology Spread Constraints. Spec is here

watchNamespaces list
[]

By default, the operator watches all namespaces. To override this behavior, specify the namespaces to watch. The operator supports watching multiple namespaces.
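
As a small illustration of the watchNamespaces parameter, the fragment below restricts the operator to two namespaces. The namespace names are hypothetical; the list form follows the parameter's declared type.

```yaml
# Restrict the operator to specific namespaces (names are examples only)
watchNamespaces:
  - monitoring
  - team-a
```

Leaving the list empty keeps the default of watching all namespaces.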