(Docker based) Elastic Agent's Metricbeat / Filebeat cannot connect to ES #20759

Closed
EricDavisX opened this issue Aug 24, 2020 · 12 comments
Labels: bug, Ingest Management:beta2 (Group issues for ingest management beta2)

Comments

@EricDavisX
Contributor

7.9 Beta 1 release

  • Operating System: Linux

  • Steps to Reproduce:
    Run the e2e-testing framework's Ingest Manager tests on the 7.x branch.
    The same tests pass on the 8.0 / master branch, which is currently confusing us.

It's possible this is a failure of the test to set up the configuration correctly; we can look at those steps with Manu (I'm reporting on behalf of the e2e-testing framework that he, the Ingest team, and I are working on). I want to get it tracked before we lose it.

Complaint:
When I try to enroll an agent running 7.9.0, it never reaches the Online status; it always stays in the Enrolling state.

debugging so far:
Agent log snippet:
{"log.level":"debug","@timestamp":"2020-08-21T12:05:59.079Z","log.origin":{"file.name":"application/periodic.go","file.line":40},"message":"Failed to read configuration, error: could not emit configuration: fail to extract program configuration: invalid configuration missing outputs configuration: /go/src/github.com/elastic/beats/x-pack/elastic-agent/pkg/agent/program/program.go[123]: unknown error","ecs.version":"1.5.0"}
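
When scanning many agent log lines for this failure, a small filter can help. The following is a minimal sketch, assuming the agent emits one JSON object per line as in the snippet above; the function names and the marker constant are illustrative and not part of Elastic Agent.

```python
import json

# Substring taken from the log message quoted above; treating it as a
# stable marker is an assumption of this sketch.
MISSING_OUTPUTS_MARKER = "missing outputs configuration"


def parse_agent_log_line(line):
    """Parse one JSON-formatted agent log line into a small dict."""
    record = json.loads(line)
    if not isinstance(record, dict):
        raise ValueError("log line is not a JSON object")
    return {
        "level": record.get("log.level"),
        "timestamp": record.get("@timestamp"),
        "file": record.get("log.origin", {}).get("file.name"),
        "message": record.get("message", ""),
    }


def is_missing_outputs_error(line):
    """Return True when the line reports the 'missing outputs' failure."""
    try:
        entry = parse_agent_log_line(line)
    except ValueError:
        # Not a JSON object, e.g. plain-text journald output.
        return False
    return MISSING_OUTPUTS_MARKER in entry["message"]
```

Plain-text lines are simply treated as non-matching, so the same filter can be run over mixed log output.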

@mdelapenya about this comment:
we use the 'hostname' command in the container to retrieve the host name, and that value is reflected in the UI

  • can you confirm the port is right too?

From Manu: it seems that neither Metricbeat nor Filebeat can connect to ES because, by default, the agent's config file (/etc/elastic-agent/elastic-agent.yml) ships with 127.0.0.1:9200 hardcoded.

I'm curious how this process tells the Beats how to connect to ES, because in 7.9 it uses 127.0.0.1.

From what I can observe:
in 8.0.0-SNAPSHOT, Fleet overrides the elastic-agent.yml config file, AND FB/MB are able to discover Elasticsearch, which runs at http://elasticsearch:9200
in 7.9.0, Fleet overrides the elastic-agent.yml config file, BUT FB/MB are not able to discover Elasticsearch, as http://127.0.0.1 is used

Using lsof, I can see that there are no out-of-the-box connections to the Elasticsearch instance in 7.9.x, but they do exist in 8.0.0-SNAPSHOT.
Screenshot 2020-08-24 at 17 14 14

@EricDavisX added the Team:Ingest Management and Ingest Management:beta2 labels Aug 24, 2020
@elasticmachine
Collaborator

Pinging @elastic/ingest-management (Team:Ingest Management)

@ph
Contributor

ph commented Aug 24, 2020

@mdelapenya If you log into the Ingest Manager, can you add the raw YAML of the agent to this issue?
The Agent will fail if no input or output is defined; this is what you see in the log statement. Looking at the context, I have the following ideas:

  • The Ingest Manager setup has issues.
  • OR Fleet is not able to generate a configuration with the output.
  • OR the output is present but the Elastic Agent has a problem understanding the configuration.

@jfsiii and @michalpristas Can you take a look?

@ph assigned jfsiii and unassigned ph Aug 24, 2020
@EricDavisX
Contributor Author

It's using the default out-of-the-box config. We can check the (global to Ingest) 'settings' during the test to confirm what is set, but I'm unsure what exactly is right in this Docker containerized setup…

GET the Kibana settings:
api/ingest_manager/settings
GET the ES output settings:
api/ingest_manager/outputs

We may need to set that explicitly in the test.

To set new values, do a PUT to
api/ingest_manager/outputs/58b8a4c1-ddf9-4187-8ed7-9dba53879607
where 58b... is the output id returned by the outputs GET call above.
The body is, for example, for my cloud instance:
{"hosts":["https://abcabbf7e34147558a8feab812341234.us-central1.gcp.foundit.no:443"]}

we can run it to get the values. @mdelapenya
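
For scripting the calls above, here is a minimal sketch that only builds the GET and PUT requests without sending them. The base URL http://kibana:5601 and the elastic/changeme credentials are assumptions taken from this Docker setup and the reference config, and the helper names are hypothetical. The kbn-xsrf header is included because Kibana rejects write requests that lack it.

```python
import base64
import json
import urllib.request


def build_request(kibana_url, path, method="GET", body=None,
                  username="elastic", password="changeme"):
    """Build (but do not send) a request for a Kibana API path.

    The default credentials are an assumption for a local test stack.
    """
    url = kibana_url.rstrip("/") + "/" + path.lstrip("/")
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    req.add_header("Content-Type", "application/json")
    # Kibana requires this header on write requests (PUT/POST/DELETE).
    req.add_header("kbn-xsrf", "true")
    return req


def get_outputs_request(kibana_url):
    """Request for listing the configured outputs."""
    return build_request(kibana_url, "api/ingest_manager/outputs")


def put_output_hosts_request(kibana_url, output_id, hosts):
    """Request for replacing the hosts of one output."""
    return build_request(
        kibana_url,
        f"api/ingest_manager/outputs/{output_id}",
        method="PUT",
        body={"hosts": list(hosts)},
    )


if __name__ == "__main__":
    req = put_output_hosts_request(
        "http://kibana:5601",
        "58b8a4c1-ddf9-4187-8ed7-9dba53879607",
        ["http://elasticsearch:9200"],
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it possible to inspect the URL and body in a test before touching a live Kibana.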

@EricDavisX
Contributor Author

here are some screenshots of the Ingest UI:

and a cat of the Fleet.yml file from the agent host in the test:

[root@5412a60e4adb /]# find / -name *fleet*                     
/etc/elastic-agent/fleet.yml
[root@5412a60e4adb /]# cat /etc/elastic-agent/fleet.yml
agent:
  id: 58053e4e-8fd9-4b3c-b5ab-1f2aa8511fd0
fleet:
  enabled: true
  access_api_key: WHNzdEluUUJNeGt6VGctYm91elo6OUNsZThuREFRbnVrcG1kZWxWUUNUQQ==
  kibana:
    protocol: http
    host: kibana:5601
    timeout: 1m30s
    ssl:
      verification_mode: none
      renegotiation: never
  reporting:
    threshold: 10000
    check_frequency_sec: 30
  agent:
    id: ""

and here is the /etc/elastic-agent/elastic-agent.yml

[root@fb664e075705 /]# cat /etc/elastic-agent/elastic-agent.yml
# ================================ General =====================================
# Beats is configured under Fleet, you can define most settings
# from the Kibana UI. You can update this file to configure the settings that
# are not supported by Fleet.
fleet:
  enabled: true

# agent.download:
#   # source of the artifacts, requires elastic like structure and naming of the binaries
#   # e.g /windows-x86.zip
#   sourceURI: "https://artifacts.elastic.co/downloads/beats/"
#   # path to the directory containing downloaded packages
#   target_directory: "${path.data}/downloads"
#   # timeout for downloading package
#   timeout: 30s
#   # file path to a public key used for verifying downloaded artifacts
#   # if not file is present Elastic Agent will try to load public key from elastic.co website.
#   pgpfile: "${path.data}/elastic.pgp"
#   # install_path describes the location of installed packages/programs. It is also used
#   # for reading program specifications.
#   install_path: "${path.data}/install"

# agent.process:
#   # minimal port number for spawned processes
#   min_port: 10000
#   # maximum port number for spawned processes
#   max_port: 30000
#   # timeout for creating new processes. when process is not successfully created by this timeout
#   # start operation is considered a failure
#   spawn_timeout: 30s

# agent.retry:
#   # enabled determines whether retry is possible. Default is false.
#   enabled: true
#   # retries_count specifies number of retries. Default is 3.
#   # Retry count of 1 means it will be retried one time after one failure.
#   retries_count: 3
#   # delay specifies delay in ms between retries. Default is 30s
#   delay: 30s
#   # max_delay specifies maximum delay in ms between retries. Default is 300s
#   max_delay: 5m
#   # Exponential determines whether delay is treated as exponential.
#   # With 30s delay and 3 retries: 30, 60, 120s
#   # Default is false
#   exponential: false

and here is the reference.yml:

[root@fb664e075705 /]# cat /etc/elastic-agent/elastic-agent.reference.yml 
###################### Agent Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The elastic-agent.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.

######################################
# Fleet configuration
######################################
outputs:
  default:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    username: elastic
    password: changeme

inputs:
  - type: system/metrics

    # The only two requirement are that it has only characters allowed in an Elasticsearch index name
    # Index names must meet the following criteria:
    #   Lowercase only
    #   Cannot include \, /, *, ?, ", <, >, |, ` ` (space character), ,, #
    #   Cannot start with -, _, +
    #   Cannot be . or ..
    data_stream.namespace: default
    use_output: default
    streams:
      - metricset: cpu
        # The only two requirement are that it has only characters allowed in an Elasticsearch index name
        # Index names must meet the following criteria:
        #   Lowercase only
        #   Cannot include \, /, *, ?, ", <, >, |, ` ` (space character), ,, #
        #   Cannot start with -, _, +
        #   Cannot be . or ..
        data_stream.dataset: system.cpu
      - metricset: memory
        data_stream.dataset: system.memory
      - metricset: network
        data_stream.dataset: system.network
      - metricset: filesystem
        data_stream.dataset: system.filesystem

# management:
#   # Mode of management, the Elastic Agent support two modes of operation:
#   #
#   # local: The Elastic Agent will expect to find the inputs configuration in the local file.
#   #
#   # Default is local.
#   mode: "local"

# fleet:
#   access_api_key: ""
#   kibana:
#     # kibana minimal configuration
#     hosts: ["localhost:5601"]
#     ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

#     # optional values
#     #protocol: "https"
#     #username: "elastic"
#     #password: "changeme"
#     #path: ""
#     #ssl.verification_mode: full
#     #ssl.supported_protocols: [TLSv1.0, TLSv1.1, TLSv1.2]
#     #ssl.cipher_suites: []
#     #ssl.curve_types: []
#   reporting:
#     # Reporting threshold indicates how many events should be kept in-memory before reporting them to fleet.
#     #reporting_threshold: 10000
#     # Frequency used to check the queue of events to be sent out to fleet.
#     #reporting_check_frequency_sec: 30

# agent.download:
#   # source of the artifacts, requires elastic like structure and naming of the binaries
#   # e.g /windows-x86.zip
#   sourceURI: "https://artifacts.elastic.co/downloads/beats/"
#   # path to the directory containing downloaded packages
#   target_directory: "${path.data}/downloads"
#   # timeout for downloading package
#   timeout: 30s
#   # file path to a public key used for verifying downloaded artifacts
#   # if not file is present agent will try to load public key from elastic.co website.
#   pgpfile: "${path.data}/elastic.pgp"
#   # install_path describes the location of installed packages/programs. It is also used
#   # for reading program specifications.
#   install_path: "${path.data}/install"

# agent.process:
#   # timeout for creating new processes. when process is not successfully created by this timeout
#   # start operation is considered a failure
#   spawn_timeout: 30s
#   # timeout for stopping processes. when process is not stopped by this timeout then the process.
#   # is force killed
#   stop_timeout: 30s

# agent.grpc:
#   # listen address for the GRPC server that spawned processes connect back to.
#   address: localhost
#   # port for the GRPC server that spawned processes connect back to.
#   port: 6789

# agent.retry:
#   # Enabled determines whether retry is possible. Default is false.
#   enabled: true
#   # RetriesCount specifies number of retries. Default is 3.
#   # Retry count of 1 means it will be retried one time after one failure.
#   retriesCount: 3
#   # Delay specifies delay in ms between retries. Default is 30s
#   delay: 30s
#   # MaxDelay specifies maximum delay in ms between retries. Default is 300s
#   maxDelay: 5m
#   # Exponential determines whether delay is treated as exponential.
#   # With 30s delay and 3 retries: 30, 60, 120s
#   # Default is false
#   exponential: false

# agent.monitoring:
#   # enabled turns on monitoring of running processes
#   enabled: false
#   # enables log monitoring
#   logs: false
#   # enables metrics monitoring
#   metrics: false

# # Allow fleet to reload his configuration locally on disk.
# # Notes: Only specific process configuration will be reloaded.
# agent.reload:
#   # enabled configure the Elastic Agent to reload or not the local configuration.
#   #
#   # Default is true
#   enabled: true

#   # period define how frequent we should look for changes in the configuration.
#   period: 10s

# Logging

# There are four options for the log output: file, stderr, syslog, eventlog
# The file output is the default.

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#agent.logging.level: info

# Enable debug output for selected components. To enable all selectors use ["*"]
# Other available selectors are "beat", "publish", "service"
# Multiple selectors can be chained.
#agent.logging.selectors: [ ]

# Send all logging output to stderr. The default is false.
agent.logging.to_stderr: true

# Send all logging output to syslog. The default is false.
#agent.logging.to_syslog: false

# Send all logging output to Windows Event Logs. The default is false.
#agent.logging.to_eventlog: false

# If enabled, Elastic-Agent periodically logs its internal metrics that have changed
# in the last period. For each metric that changed, the delta from the value at
# the beginning of the period is logged. Also, the total values for
# all non-zero internal metrics are logged on shutdown. The default is true.
#agent.logging.metrics.enabled: true

# The period after which to log the internal metrics. The default is 30s.
#agent.logging.metrics.period: 30s

# Logging to rotating files. Set logging.to_files to false to disable logging to
# files.
#agent.logging.to_files: true
#agent.logging.files:
  # Configure the path where the logs are written. The default is the logs directory
  # under the home path (the binary location).
  #path: /var/log/elastic-agent

  # The name of the files where the logs are written to.
  #name: elastic-agent

  # Configure log file size limit. If limit is reached, log file will be
  # automatically rotated
  #rotateeverybytes: 10485760 # = 10MB

  # Number of rotated log files to keep. Oldest files will be deleted first.
  #keepfiles: 7

  # The permissions mask to apply when rotating log files. The default value is 0600.
  # Must be a valid Unix-style file permissions mask expressed in octal notation.
  #permissions: 0600

  # Enable log file rotation on time intervals in addition to size-based rotation.
  # Intervals must be at least 1s. Values of 1m, 1h, 24h, 7*24h, 30*24h, and 365*24h
  # are boundary-aligned with minutes, hours, days, weeks, months, and years as
  # reported by the local system clock. All other intervals are calculated from the
  # Unix epoch. Defaults to disabled.
  #interval: 0

  # Rotate existing logs on startup rather than appending to the existing
  # file. Defaults to true.
  # rotateonstartup: true

# Set to true to log messages in JSON format.
#agent.logging.json: false

# Set to true, to log messages with minimal required Elastic Common Schema (ECS)
# information. Recommended to use in combination with `logging.json=true`
# Defaults to false.
#agent.logging.ecs: false

@EricDavisX
Contributor Author

From PH's assessment, it looks like it is not option 3. There is no output present in the .yml so the problem is either in the test still (running too fast, maybe Fleet isn't finished setting up?), or on the Fleet side in product code.
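
If the "running too fast" hypothesis were the cause, polling for readiness would be more robust than a fixed pause. Below is a minimal sketch of such a helper; the name wait_until and its parameters are illustrative and not part of the e2e-testing framework, and the readiness predicate is left to the caller.

```python
import time


def wait_until(predicate, timeout=60.0, interval=2.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or the timeout elapses.

    Returns True if the predicate succeeded, False on timeout.
    clock and sleep are injectable so the helper can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would pass a predicate that checks whatever "Fleet is set up" means in the test, for example a function that queries Kibana and returns a boolean.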

@EricDavisX
Contributor Author

So, I added a 'sleep' after the test posts the enroll command and ran the test. While it was paused, I force-unenrolled the agent that the test had tried to enroll. I then ran docker exec -it ingest-manager_elastic-agent_1 /bin/bash to get into the container and manually ran the enroll command, as the test had done, and found the same behavior.

just for reference:
elastic-agent enroll http://kibana:5601 dU10RUluUUJNeGt6VGctYm9mRW46TERUOUkwWlBRVDZOTGF6M0JnVEhIdw== --insecure
a systemctl status elastic-agent shows:
Aug 24 21:07:59 761d4d73bb09 elastic-agent[57]: 2020-08-24T21:07:59.450Z DEBUG application/periodic.go:40 Failed to read configuration, error: could not emit configuration: fail to extract program configuration: invalid configuration missing outputs configuration: /go/src/github.com/elastic/beats/x-pack/elastic-agent/pkg/agent/program/program.go[123]: unknown error

Is this somehow related to the certs problem we've seen in 7.9 usage?
elastic/kibana#73483
#19504

Because I was curious about the commit state: the Agent is 7.9 GA and Kibana is at this commit (as seen from the /status API call), which I guess looks right for the 7.9.x branch:

edavis-mbp:kibana_elastic edavis$ git show -s 095c1cec
commit 095c1cec623b89c03306ef46becbc230597c0e47
Author: Sandra Gonzales <[email protected]>
Date:   Tue Aug 11 13:28:37 2020 -0500

    remove events from indexPatternTypes (#74754)

@EricDavisX
Contributor Author

FYI: don't forget to check out the '7.9.x' branch when researching, and then set:
export DEVELOPER_MODE=true
and then I modified the fleet.go file
to put a sleep command just before the return in:
func (fts *FleetTestSuite) anAgentIsDeployedToFleet(image string)
as:
time.Sleep(10 * time.Minute)

and then:
docker stop $(docker ps -a -q) && docker rm $(docker ps -a -q) && docker rmi $(docker images -qa)

then
godog -t @enroll


I tried to run the 7.9.x branch from cloud, and its Kibana is at a different commit that's 11 days later, so I don't know whether we should try testing / validating on master too.

To that end, I tried to pull master and ended up getting 7.9 code pulled, @mdelapenya, so I'm not sure whether the 'master' line of the e2e-testing repo is set correctly or whether it's me. I have the export variable set and it still pulled some 7.9 versions, so... we can probably focus there until we get the framework figured out.

@mdelapenya
Contributor

Hey @EricDavisX, here are the values:

GET the Kibana setting:
api/ingest_manager/settings

{"success":true,"item":{"id":"f10dbc30-e69b-11ea-84b5-835bc3e12d87","agent_auto_upgrade":true,"package_auto_upgrade":true,"kibana_url":"http://kibana:5601"}}

GET the ES Output settings:
api/ingest_manager/outputs

{"items":[{"id":"e3816ffb-aa96-4571-a9c3-89f5149bb599","name":"default","is_default":true,"type":"elasticsearch","hosts":["http://elasticsearch:9200"]}],"page":1,"perPage":1000,"total":1,"success":true}

It seems the ES outputs are properly set in the configuration.
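
The same check can be automated. Inside the agent container, 127.0.0.1 refers to the container itself, so a loopback host in the default output would never reach an Elasticsearch running in another container. The sketch below, with hypothetical helper names, extracts the default output's hosts from a response shaped like the one above and flags loopback addresses.

```python
import ipaddress
import json
from urllib.parse import urlsplit


def is_loopback_host(host_url):
    """Return True if host_url points at localhost or a loopback IP."""
    # Accept both "http://host:port" and bare "host:port" forms.
    parts = urlsplit(host_url if "://" in host_url else "//" + host_url)
    hostname = parts.hostname or ""
    if hostname == "localhost":
        return True
    try:
        return ipaddress.ip_address(hostname).is_loopback
    except ValueError:
        # Not an IP literal, e.g. a DNS name such as "elasticsearch".
        return False


def default_output_hosts(outputs_response):
    """Return the hosts of the output flagged is_default, or []."""
    for item in outputs_response.get("items", []):
        if item.get("is_default"):
            return item.get("hosts", [])
    return []


if __name__ == "__main__":
    raw = ('{"items":[{"name":"default","is_default":true,'
           '"type":"elasticsearch","hosts":["http://elasticsearch:9200"]}]}')
    for host in default_output_hosts(json.loads(raw)):
        print(host, "loopback" if is_loopback_host(host) else "ok")
```
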

@mdelapenya
Contributor

mdelapenya commented Aug 25, 2020

I can confirm that, for 8.0.0-SNAPSHOT, the agent receives this configuration (without outputs):

# ================================ General =====================================
# Beats is configured under Fleet, you can define most settings
# from the Kibana UI. You can update this file to configure the settings that
# are not supported by Fleet.
fleet:
  enabled: true

# agent.download:
#   # source of the artifacts, requires elastic like structure and naming of the binaries
#   # e.g /windows-x86.zip
#   sourceURI: "https://artifacts.elastic.co/downloads/beats/"
#   # path to the directory containing downloaded packages
#   target_directory: "${path.data}/downloads"
#   # timeout for downloading package
#   timeout: 30s
#   # file path to a public key used for verifying downloaded artifacts
#   # if not file is present Elastic Agent will try to load public key from elastic.co website.
#   pgpfile: "${path.data}/elastic.pgp"
#   # install_path describes the location of installed packages/programs. It is also used
#   # for reading program specifications.
#   install_path: "${path.data}/install"

# agent.process:
#   # minimal port number for spawned processes
#   min_port: 10000
#   # maximum port number for spawned processes
#   max_port: 30000
#   # timeout for creating new processes. when process is not successfully created by this timeout
#   # start operation is considered a failure
#   spawn_timeout: 30s

# agent.retry:
#   # enabled determines whether retry is possible. Default is false.
#   enabled: true
#   # retries_count specifies number of retries. Default is 3.
#   # Retry count of 1 means it will be retried one time after one failure.
#   retries_count: 3
#   # delay specifies delay in ms between retries. Default is 30s
#   delay: 30s
#   # max_delay specifies maximum delay in ms between retries. Default is 300s
#   max_delay: 5m
#   # Exponential determines whether delay is treated as exponential.
#   # With 30s delay and 3 retries: 30, 60, 120s
#   # Default is false
#   exponential: false

I'm going to perform these checks:

  1. start the stack in 8.0.0-snapshot
  2. run the test for a 7.9.0 agent

and

  1. start the stack in 7.9.0
  2. run the test for a 8.0.0-snapshot agent

@mdelapenya
Contributor

mdelapenya commented Aug 25, 2020

Use case 1: 8.0.0-snapshot stack + 8.0.0-snapshot agent

  • Stack (Kibana + ES): 8.0.0-snapshot
  • Agent: 8.0.0-SNAPSHOT

Run tests for the enroll scenario

$ git checkout master
$ OP_LOG_LEVEL=DEBUG godog -t "fleet_mode && enroll"

The agent is enrolled and shown as Online in Fleet

Use case 2: 7.9.0 stack + 7.9.0 agent

  • Stack (Kibana + ES): 7.9.0
  • Agent: 7.9.0

Run tests for the enroll scenario

$ git checkout 7.9.x
$ OP_LOG_LEVEL=DEBUG godog -t "fleet_mode && enroll"

The agent is not enrolled and is shown as "Enrolling" in Fleet

Use case 3: 8.0.0-snapshot stack + 7.9.0 agent

  • Stack (Kibana + ES): 8.0.0-SNAPSHOT
  • Agent: 7.9.0

Run tests for the enroll scenario

$ git checkout master
$ OP_LOG_LEVEL=DEBUG ELASTIC_AGENT_VERSION=7.9.0 godog -t "fleet_mode && enroll"

elastic-agent.yml

# ================================ General =====================================
# Beats is configured under Fleet, you can define most settings
# from the Kibana UI. You can update this file to configure the settings that
# are not supported by Fleet.
fleet:
  enabled: true

# agent.download:
#   # source of the artifacts, requires elastic like structure and naming of the binaries
#   # e.g /windows-x86.zip
#   sourceURI: "https://artifacts.elastic.co/downloads/beats/"
#   # path to the directory containing downloaded packages
#   target_directory: "${path.data}/downloads"
#   # timeout for downloading package
#   timeout: 30s
#   # file path to a public key used for verifying downloaded artifacts
#   # if not file is present Elastic Agent will try to load public key from elastic.co website.
#   pgpfile: "${path.data}/elastic.pgp"
#   # install_path describes the location of installed packages/programs. It is also used
#   # for reading program specifications.
#   install_path: "${path.data}/install"

# agent.process:
#   # minimal port number for spawned processes
#   min_port: 10000
#   # maximum port number for spawned processes
#   max_port: 30000
#   # timeout for creating new processes. when process is not successfully created by this timeout
#   # start operation is considered a failure
#   spawn_timeout: 30s

# agent.retry:
#   # enabled determines whether retry is possible. Default is false.
#   enabled: true
#   # retries_count specifies number of retries. Default is 3.
#   # Retry count of 1 means it will be retried one time after one failure.
#   retries_count: 3
#   # delay specifies delay in ms between retries. Default is 30s
#   delay: 30s
#   # max_delay specifies maximum delay in ms between retries. Default is 300s
#   max_delay: 5m
#   # Exponential determines whether delay is treated as exponential.
#   # With 30s delay and 3 retries: 30, 60, 120s
#   # Default is false
#   exponential: false

fleet.yml

agent:
  id: 953f1443-8490-43d1-ac1d-0f714d919071
fleet:
  enabled: true
  access_api_key: ZjF0ZEpIUUIxcTdrQWZmQS00U0U6MFVQWDliX0RTdHE4czcyYTJBSjhxdw==
  kibana:
    protocol: http
    host: kibana:5601
    timeout: 1m30s
    ssl:
      verification_mode: none
      renegotiation: never
  reporting:
    threshold: 10000
    check_frequency_sec: 30
  agent:
    id: ""

Use case 4: 7.9.0 stack + 8.0.0-snapshot agent

  • Stack (Kibana + ES): 7.9.0
  • Agent: 8.0.0-SNAPSHOT

Run tests for the enroll scenario

$ git checkout 7.9.x
$ OP_LOG_LEVEL=DEBUG ELASTIC_AGENT_VERSION=8.0.0-SNAPSHOT godog -t "fleet_mode && enroll"

The enrollment process fails because the version is not compatible:

2020-08-25T06:58:07.873Z DEBUG kibana/client.go:170 Request method: POST, path: /api/ingest_manager/fleet/agents/enroll
fail to enroll: fail to execute request to Kibana: Status code: 400, Kibana returned an error: Bad Request, message: Agent version is not compatible with kibana version

elastic-agent.yml

Not configured by Fleet

###################### Agent Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The elastic-agent.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.

######################################
# Fleet configuration
######################################
outputs:
  default:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    username: elastic
    password: changeme

inputs:
  - type: system/metrics

    # The only two requirement are that it has only characters allowed in an Elasticsearch index name
    # Index names must meet the following criteria:
    #   Lowercase only
    #   Cannot include \, /, *, ?, ", <, >, |, ` ` (space character), ,, #
    #   Cannot start with -, _, +
    #   Cannot be . or ..
    data_stream.namespace: default
    use_output: default
    streams:
      - metricset: cpu
        # The only two requirement are that it has only characters allowed in an Elasticsearch index name
        # Index names must meet the following criteria:
        #   Lowercase only
        #   Cannot include \, /, *, ?, ", <, >, |, ` ` (space character), ,, #
        #   Cannot start with -, _, +
        #   Cannot be . or ..
        data_stream.dataset: system.cpu
      - metricset: memory
        data_stream.dataset: system.memory
      - metricset: network
        data_stream.dataset: system.network
      - metricset: filesystem
        data_stream.dataset: system.filesystem

# agent.monitoring:
#   # enabled turns on monitoring of running processes
#   enabled: true
#   # enables log monitoring
#   logs: true
#   # enables metrics monitoring
#   metrics: true

# # Allow fleet to reload his configuration locally on disk.
# # Notes: Only specific process configuration will be reloaded.
# agent.reload:
#   # enabled configure the Elastic Agent to reload or not the local configuration.
#   #
#   # Default is true
#   enabled: true

#   # period define how frequent we should look for changes in the configuration.
#   period: 10s

# management:
#   # Mode of management, the Elastic Agent support two modes of operation:
#   #
#   # local: The Elastic Agent will expect to find the inputs configuration in the local file.
#   #
#   # Default is local.
#   mode: "local"

# fleet:
#   access_api_key: ""
#   kibana:
#     # kibana minimal configuration
#     hosts: ["localhost:5601"]
#     ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

#     # optional values
#     #protocol: "https"
#     #username: "elastic"
#     #password: "changeme"
#     #path: ""
#     #ssl.verification_mode: full
#     #ssl.supported_protocols: [TLSv1.0, TLSv1.1, TLSv1.2]
#     #ssl.cipher_suites: []
#     #ssl.curve_types: []
#   reporting:
#     # Reporting threshold indicates how many events should be kept in-memory before reporting them to fleet.
#     #reporting_threshold: 10000
#     # Frequency used to check the queue of events to be sent out to fleet.
#     #reporting_check_frequency_sec: 30

# agent.download:
#   # source of the artifacts, requires elastic like structure and naming of the binaries
#   # e.g /windows-x86.zip
#   sourceURI: "https://artifacts.elastic.co/downloads/beats/"
#   # path to the directory containing downloaded packages
#   target_directory: "${path.data}/downloads"
#   # timeout for downloading package
#   timeout: 30s
#   # file path to a public key used for verifying downloaded artifacts
#   # if not file is present agent will try to load public key from elastic.co website.
#   pgpfile: "${path.data}/elastic.pgp"
#   # install_path describes the location of installed packages/programs. It is also used
#   # for reading program specifications.
#   install_path: "${path.data}/install"

# agent.process:
#   # timeout for creating new processes. when process is not successfully created by this timeout
#   # start operation is considered a failure
#   spawn_timeout: 30s
#   # timeout for stopping processes. when process is not stopped by this timeout then the process.
#   # is force killed
#   stop_timeout: 30s

# agent.grpc:
#   # listen address for the GRPC server that spawned processes connect back to.
#   address: localhost
#   # port for the GRPC server that spawned processes connect back to.
#   port: 6789

# agent.retry:
#   # Enabled determines whether retry is possible. Default is false.
#   enabled: true
#   # RetriesCount specifies number of retries. Default is 3.
#   # Retry count of 1 means it will be retried one time after one failure.
#   retriesCount: 3
#   # Delay specifies delay in ms between retries. Default is 30s
#   delay: 30s
#   # MaxDelay specifies maximum delay in ms between retries. Default is 300s
#   maxDelay: 5m
#   # Exponential determines whether delay is treated as exponential.
#   # With 30s delay and 3 retries: 30, 60, 120s
#   # Default is false
#   exponential: false

# Logging

# There are four options for the log output: file, stderr, syslog, eventlog
# The file output is the default.

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#agent.logging.level: info

# Enable debug output for selected components. To enable all selectors use ["*"]
# Other available selectors are "beat", "publish", "service"
# Multiple selectors can be chained.
#agent.logging.selectors: [ ]

# Send all logging output to stderr. The default is false.
agent.logging.to_stderr: true

# Send all logging output to syslog. The default is false.
#agent.logging.to_syslog: false

# Send all logging output to Windows Event Logs. The default is false.
#agent.logging.to_eventlog: false

# If enabled, Elastic-Agent periodically logs its internal metrics that have changed
# in the last period. For each metric that changed, the delta from the value at
# the beginning of the period is logged. Also, the total values for
# all non-zero internal metrics are logged on shutdown. The default is true.
#agent.logging.metrics.enabled: true

# The period after which to log the internal metrics. The default is 30s.
#agent.logging.metrics.period: 30s

# Logging to rotating files. Set logging.to_files to false to disable logging to
# files.
#agent.logging.to_files: true
#agent.logging.files:
  # Configure the path where the logs are written. The default is the logs directory
  # under the home path (the binary location).
  #path: /var/log/elastic-agent

  # The name of the files where the logs are written to.
  #name: elastic-agent

  # Configure log file size limit. If limit is reached, log file will be
  # automatically rotated
  #rotateeverybytes: 10485760 # = 10MB

  # Number of rotated log files to keep. Oldest files will be deleted first.
  #keepfiles: 7

  # The permissions mask to apply when rotating log files. The default value is 0600.
  # Must be a valid Unix-style file permissions mask expressed in octal notation.
  #permissions: 0600

  # Enable log file rotation on time intervals in addition to size-based rotation.
  # Intervals must be at least 1s. Values of 1m, 1h, 24h, 7*24h, 30*24h, and 365*24h
  # are boundary-aligned with minutes, hours, days, weeks, months, and years as
  # reported by the local system clock. All other intervals are calculated from the
  # Unix epoch. Defaults to disabled.
  #interval: 0

  # Rotate existing logs on startup rather than appending to the existing
  # file. Defaults to true.
  # rotateonstartup: true

# Set to true to log messages in JSON format.
#agent.logging.json: false

# Set to true, to log messages with minimal required Elastic Common Schema (ECS)
# information. Recommended to use in combination with `logging.json=true`
# Defaults to false.
#agent.logging.ecs: false

fleet.yml

agent:
  id: e941eade-2002-4b2f-a437-1e1bdb316d79

@ph added the bug label Aug 25, 2020
@michalpristas
Contributor

michalpristas commented Aug 25, 2020

Able to repro. My first impression is that there is a race in the setup.
first i see: {"log.level":"info","@timestamp":"2020-08-25T12:11:24.181Z","log.origin":{"file.name":"application/application.go","file.line":56},"message":"Agent is managed locally","ecs.version":"1.5.0"}

Then the agent starts Metricbeat/Filebeat...
But later the configuration changes, probably after enroll, and the agent fails on the missing outputs section during the periodic check (which should not occur, because the periodic checker is disabled when the agent is managed remotely).

Edit:
I'm not sure I see a restart after enroll in the test code.

@mdelapenya
Contributor

mdelapenya commented Aug 25, 2020

As explained in elastic/e2e-testing#236, we found that the order in which the agent is enabled, enrolled, and started is important.

Closing: this issue, although reproducible, represents a use case that does not follow the official instructions (https://www.elastic.co/guide/en/ingest-management/current/run-elastic-agent.html) on how to enroll the agent.


6 participants