Datadog latest docker images (Build Code bbrw4qxdfz7kfgxcjznivpb) crash immediately on bootup #296

Closed
AngerM opened this issue Mar 15, 2018 · 4 comments

AngerM commented Mar 15, 2018

Additional environment details (Operating System, Cloud provider, etc):
We are running on GKE

Steps to reproduce the issue:

  1. Create a DaemonSet whose image is tagged latest (at the time, latest was the image now tagged 12.5.5223)
  2. Update the DaemonSet's YAML file
  3. When pods start rolling with the new latest (build code bbrw4qxdfz7kfgxcjznivpb), they crash immediately on bootup

Additional information you deem important (e.g. issue happens only occasionally):
There were a bunch of Python error messages (a missing method on some type of object, related to the API key). Unfortunately, I rolled the cluster from latest to the 12.5.5223 tag as soon as I saw the crashes, so I no longer have the logs.

My DaemonSet YAML file was as follows:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  creationTimestamp: 2018-03-05T05:01:16Z
  generation: 3
  labels:
    app: dd-agent
  name: dd-agent
  namespace: default
  resourceVersion: "8621119"
  selfLink: /apis/extensions/v1beta1/namespaces/default/daemonsets/dd-agent
  uid: <REMOVED>
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: dd-agent
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: dd-agent
      name: dd-agent
    spec:
      containers:
      - env:
        - name: API_KEY
          value: <REMOVED>
        - name: KUBERNETES
          value: "yes"
        - name: SD_BACKEND
          value: docker
        image: datadog/docker-dd-agent:latest
        imagePullPolicy: Always
        livenessProbe:
          exec:
            command:
            - ./probe.sh
          failureThreshold: 3
          initialDelaySeconds: 15
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 1
        name: dd-agent
        ports:
        - containerPort: 8125
          hostPort: 8125
          name: dogstatsdport
          protocol: UDP
        resources:
          limits:
            cpu: "1"
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 128Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/run/docker.sock
          name: dockersocket
        - mountPath: /host/proc
          name: procdir
          readOnly: true
        - mountPath: /host/sys/fs/cgroup
          name: cgroups
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /var/run/docker.sock
          type: ""
        name: dockersocket
      - hostPath:
          path: /proc
          type: ""
        name: procdir
      - hostPath:
          path: /sys/fs/cgroup
          type: ""
        name: cgroups
  templateGeneration: 3
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate

The YAML that currently works is identical, except that

image: datadog/docker-dd-agent:latest

was changed to

image: datadog/docker-dd-agent:12.5.5223
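
The only difference between the two manifests is whether the image reference floats with latest or names a fixed tag. As a rough sketch (not part of any Datadog or Kubernetes tooling; the function name is_pinned is hypothetical), a check like this could flag floating references before a rollout:

```python
def is_pinned(image: str) -> bool:
    """Return True if the image reference has a digest or an explicit tag other than 'latest'."""
    # A digest (name@sha256:...) identifies one exact image.
    if "@" in image:
        return True
    # The tag, if any, follows a ':' in the last path segment;
    # a ':' earlier in the string belongs to a registry host:port.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag: Docker falls back to 'latest'
    tag = last_segment.rsplit(":", 1)[1]
    return tag != "latest"


print(is_pinned("datadog/docker-dd-agent:latest"))
print(is_pinned("datadog/docker-dd-agent:12.5.5223"))
```

With imagePullPolicy: Always, a floating latest reference means every pod restart may pull a different build, which is how the broken build reached the cluster here.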

mattbarrio commented Mar 15, 2018

@AngerM This looks to be related to the 12.5.5223 (#291) release, as you said.
We've reverted to 12.5.5221.

datadog    | Traceback (most recent call last):
datadog    |   File "/config_builder.py", line 171, in <module>
datadog    |     cfg.build_datadog_conf()
datadog    |   File "/config_builder.py", line 58, in build_datadog_conf
datadog    |     self.set_api_key()
datadog    |   File "/config_builder.py", line 110, in set_api_key
datadog    |     self.set_property('api_key', api_key)
datadog    |   File "/config_builder.py", line 162, in set_property
datadog    |     if not self.config.has_selection(section):
datadog    | AttributeError: ConfigParser instance has no attribute 'has_selection'
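
The AttributeError suggests a misspelled method name: ConfigParser provides has_section, and has_selection does not exist. The sketch below shows what set_property was presumably meant to do. It is an illustration only, not the actual config_builder.py; it uses Python 3's configparser, and the default section name "Main" and the function signature are assumptions.

```python
import configparser


def set_property(config, name, value, section="Main"):
    """Set `name` in `section`, creating the section first if it is missing."""
    # has_section is the real ConfigParser method; 'has_selection' is not defined.
    if not config.has_section(section):
        config.add_section(section)
    config.set(section, name, value)


config = configparser.ConfigParser()
set_property(config, "api_key", "dummy-key")  # placeholder value, not a real key
print(config.get("Main", "api_key"))
```

Because the misspelled call sits on the path that writes the API key, every container start would hit it, which matches the immediate crash on bootup.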

AngerM (Author) commented Mar 15, 2018

Yep @mattbarrio that looks like it!

hkaj (Member) commented Mar 15, 2018

Hi @AngerM, @mattbarrio,
Sorry for the mess; we're on it.

hkaj self-assigned this Mar 15, 2018
hkaj closed this as completed in #297 Mar 15, 2018

hkaj (Member) commented Mar 15, 2018

This should be fixed now; latest is rebuilding. Thanks for your patience.
