This repository has been archived by the owner on Mar 26, 2020. It is now read-only.

GlusterD kubernetes: systemctl start glusterd silent failures. #1496

Open
jayunit100 opened this issue Feb 4, 2019 · 1 comment

Comments

@jayunit100

Note: I didn't set up an etcd URL. I assume that, either way, glusterd should fail fast and obviously if etcd isn't working; instead, it's a silent failure.

Observed behavior

Running the kube cluster recipes, the Gluster pods are running and healthy, but systemctl status glusterd2 tells another story: it completely failed.

Expected/desired behavior

Pods should exit if glusterd can't start up, or at least log this to stderr. Right now there are no logs, and the only way to know it's broken is to run glustercli peer status or similar inside the pod.

Details on how to reproduce (minimal and precise)

Create the following file:

---
apiVersion: v1
kind: Namespace
metadata:
  name: gluster-storage
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gluster
  namespace: gluster-storage
  labels:
    gluster-storage: glusterd2
spec:
  selector:
    matchLabels:
      name: glusterd2-daemon
  template:
    metadata:
      labels:
        name: glusterd2-daemon
    spec:
      containers:
        - name: glusterd2
          image: docker.io/gluster/glusterd2-nightly:20190204
# TODO: Enable the below once passing environment variables to the containers is fixed
#          env:
#            - name: GD2_RESTAUTH
#              value: "false"
# Enable if an external etcd cluster has been set up
#            - name: GD2_ETCDENDPOINTS
#              value: "http://gluster-etcd:2379"
# Generate and set a random uuid here
#            - name: GD2_CLUSTER_ID
#              value: "9610ec0b-17e7-405e-82f7-5f78d0b22463"
          securityContext:
            capabilities: {}
            privileged: true
          volumeMounts:
            - name: gluster-dev
              mountPath: "/dev"
            - name: gluster-cgroup
              mountPath: "/sys/fs/cgroup"
              readOnly: true
            - name: gluster-lvm
              mountPath: "/run/lvm"
            - name: gluster-kmods
              mountPath: "/usr/lib/modules"
              readOnly: true

      volumes:
        - name: gluster-dev
          hostPath:
            path: "/dev"
        - name: gluster-cgroup
          hostPath:
            path: "/sys/fs/cgroup"
        - name: gluster-lvm
          hostPath:
            path: "/run/lvm"
        - name: gluster-kmods
          hostPath:
            path: "/usr/lib/modules"

---
apiVersion: v1
kind: Service
metadata:
  name: glusterd2-service
  namespace: gluster-storage
spec:
  selector:
    name: glusterd2-daemon
  ports:
    - protocol: TCP
      port: 24007
      targetPort: 24007
# GD2 will be available on kube-host:31007 externally
      nodePort: 31007
  type: NodePort

Then exec -t -i into one of the pods. You'll see it's healthy, but running systemctl status glusterd2 will show error logs. Re-running this command manually then produces the following logs:

WARNING: 2019/02/04 19:43:51 grpc: addrConn.createTransport failed to connect to {[fe80::345c:baff:fefe:edc6]:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp [fe80::345c:baff:fefe:edc6]:2379: connect: invalid argument". Reconnecting...
WARNING: 2019/02/04 19:43:51 grpc: addrConn.createTransport failed to connect to {[fe80::345c:baff:fefe:edc6]:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp [fe80::345c:baff:fefe:edc6]:2379: connect: invalid argument". Reconnecting...

@Madhu-1
Member

Madhu-1 commented Feb 5, 2019

@jayunit100 I don't see the part of the code below in your template; it is responsible for the health check:

livenessProbe:
  httpGet:
    path: /ping
    port: 24007
  initialDelaySeconds: 10
  periodSeconds: 60

Please refer to https://github.com/gluster/gcs/blob/master/deploy/templates/gcs-manifests/gcs-gd2.yml.j2 for more info.
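
For illustration, here is a rough sketch of where that probe might sit in the DaemonSet from the report. It assumes GD2 answers GET /ping on port 24007 as in the snippet above. Only the container entry is shown; the volume mounts and the rest of the manifest are assumed unchanged, and the linked template remains the authoritative version.

```yaml
# Fragment of spec.template.spec from the DaemonSet in the report (sketch only).
containers:
  - name: glusterd2
    image: docker.io/gluster/glusterd2-nightly:20190204
    # Liveness probe from the comment above. The kubelet restarts the
    # container once GET /ping on port 24007 has failed failureThreshold
    # times in a row (Kubernetes default: 3).
    livenessProbe:
      httpGet:
        path: /ping
        port: 24007
      initialDelaySeconds: 10   # wait 10s after container start before the first probe
      periodSeconds: 60         # probe once every 60s
    securityContext:
      capabilities: {}
      privileged: true
```

With the probe in place, a glusterd2 that never comes up should show up as container restarts in the pod status instead of a pod that looks healthy.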
