APM Agents configuration cache refresh appears to be broken #13957

Closed
up2neck opened this issue Aug 30, 2024 · 3 comments · Fixed by #13958

@up2neck
Contributor

up2neck commented Aug 30, 2024

APM Server version (apm-server version): 8.14.3 (linux/amd64), standalone

Description of the problem including expected versus actual behavior:

APM Server makes invalid search/scroll requests to the ".apm-agent-configuration" Elasticsearch index:

Invalid transaction JSON
{
  "@timestamp": [
    "2024-08-30T09:11:00.105Z"
  ],
  "agent.name": [
    "go"
  ],
  "agent.version": [
    "2.6.0"
  ],
  "cloud.availability_zone": [
    ""
  ],
  "cloud.instance.id": [
    ""
  ],
  "cloud.instance.name": [
    ""
  ],
  "cloud.project.id": [
    ""
  ],
  "cloud.provider": [
    ""
  ],
  "cloud.region": [
    ""
  ],
  "data_stream.dataset": [
    "apm"
  ],
  "data_stream.namespace": [
    "default"
  ],
  "data_stream.type": [
    "traces"
  ],
  "destination.address": [
    "elasticsearch-v1-es-http.apm-sandbox.svc"
  ],
  "destination.port": [
    9200
  ],
  "event.agent_id_status": [
    "missing"
  ],
  "event.ingested": [
    "2024-08-30T09:11:01.000Z"
  ],
  "event.outcome": [
    "failure"
  ],
  "event.success_count": [
    0
  ],
  "host.architecture": [
    "amd64"
  ],
  "host.hostname": [
    "apm-server-v1-apm-server-68576fc56-6pkms"
  ],
  "host.name": [
    "apm-server-v1-apm-server-68576fc56-6pkms"
  ],
  "host.os.platform": [
    "linux"
  ],
  "http.response.status_code": [
    400
  ],
  "labels.project": [
    "epm-paas"
  ],
  "observer.hostname": [
    "apm-server-v1-apm-server-68576fc56-6pkms"
  ],
  "observer.type": [
    "apm-server"
  ],
  "observer.version": [
    "8.14.3"
  ],
  "parent.id": [
    "5ae92913235dc969"
  ],
  "process.args": [
    "apm-server",
    "run",
    "-e",
    "-c",
    "config/config-secret/apm-server.yml"
  ],
  "process.pid": [
    1
  ],
  "process.title": [
    "apm-server"
  ],
  "process.title.text": [
    "apm-server"
  ],
  "processor.event": [
    "span"
  ],
  "service.environment": [
    "sandbox"
  ],
  "service.language.name": [
    "go"
  ],
  "service.language.version": [
    "go1.22.5"
  ],
  "service.name": [
    "apm-server"
  ],
  "service.node.name": [
    "apm-server-v1-apm-server-68576fc56-6pkms"
  ],
  "service.runtime.name": [
    "gc"
  ],
  "service.runtime.version": [
    "go1.22.5"
  ],
  "service.target.type": [
    "elasticsearch"
  ],
  "service.version": [
    "8.14.3"
  ],
  "span.db.type": [
    "elasticsearch"
  ],
  "span.db.user.name": [
    "apm-sandbox-apm-server-v1-apm-user"
  ],
  "span.destination.service.name": [
    "elasticsearch"
  ],
  "span.destination.service.resource": [
    "elasticsearch"
  ],
  "span.destination.service.type": [
    "db"
  ],
  "span.duration.us": [
    82067
  ],
  "span.id": [
    "6c3c47e4d07a1c9a"
  ],
  "span.name": [
    "Elasticsearch: POST _search/scroll"
  ],
  "span.representative_count": [
    1
  ],
  "span.stacktrace": [
    {
      "exclude_from_grouping": false,
      "library_frame": true,
      "filename": "span.go",
      "line": {
        "number": 442
      },
      "function": "(*Span).End",
      "module": "go.elastic.co/apm/v2"
    },
    {
      "exclude_from_grouping": false,
      "library_frame": true,
      "filename": "client.go",
      "line": {
        "number": 161
      },
      "function": "(*responseBody).endSpan",
      "module": "go.elastic.co/apm/module/apmelasticsearch/v2"
    },
    {
      "exclude_from_grouping": false,
      "library_frame": true,
      "filename": "client.go",
      "line": {
        "number": 153
      },
      "function": "(*responseBody).Read",
      "module": "go.elastic.co/apm/module/apmelasticsearch/v2"
    },
    {
      "exclude_from_grouping": false,
      "library_frame": true,
      "filename": "stream.go",
      "line": {
        "number": 165
      },
      "function": "(*Decoder).refill",
      "module": "encoding/json"
    },
    {
      "exclude_from_grouping": false,
      "library_frame": true,
      "filename": "stream.go",
      "line": {
        "number": 140
      },
      "function": "(*Decoder).readValue",
      "module": "encoding/json"
    },
    {
      "exclude_from_grouping": false,
      "library_frame": true,
      "filename": "stream.go",
      "line": {
        "number": 63
      },
      "function": "(*Decoder).Decode",
      "module": "encoding/json"
    },
    {
      "exclude_from_grouping": false,
      "filename": "elasticsearch.go",
      "line": {
        "number": 288
      },
      "function": "(*ElasticsearchFetcher).singlePageRefresh",
      "module": "github.com/elastic/apm-server/internal/agentcfg"
    },
    {
      "exclude_from_grouping": false,
      "filename": "elasticsearch.go",
      "line": {
        "number": 223
      },
      "function": "(*ElasticsearchFetcher).refreshCache",
      "module": "github.com/elastic/apm-server/internal/agentcfg"
    },
    {
      "exclude_from_grouping": false,
      "filename": "elasticsearch.go",
      "line": {
        "number": 137
      },
      "function": "(*ElasticsearchFetcher).Run.func1",
      "module": "github.com/elastic/apm-server/internal/agentcfg"
    },
    {
      "exclude_from_grouping": false,
      "filename": "elasticsearch.go",
      "line": {
        "number": 179
      },
      "function": "(*ElasticsearchFetcher).Run",
      "module": "github.com/elastic/apm-server/internal/agentcfg"
    },
    {
      "exclude_from_grouping": false,
      "filename": "beater.go",
      "line": {
        "number": 452
      },
      "function": "(*Runner).Run.func8",
      "module": "github.com/elastic/apm-server/internal/beater"
    },
    {
      "exclude_from_grouping": false,
      "filename": "errgroup.go",
      "line": {
        "number": 78
      },
      "function": "(*Group).Go.func1",
      "module": "golang.org/x/sync/errgroup"
    },
    {
      "exclude_from_grouping": false,
      "library_frame": true,
      "filename": "asm_amd64.s",
      "line": {
        "number": 1695
      },
      "function": "goexit",
      "module": "runtime"
    }
  ],
  "span.subtype": [
    "elasticsearch"
  ],
  "span.type": [
    "db"
  ],
  "timestamp.us": [
    1725009060105160
  ],
  "trace.id": [
    "9dc612c94d0058a25ab50e62d45e430c"
  ],
  "transaction.id": [
    "9dc612c94d0058a2"
  ],
  "url.original": [
    "https://elasticsearch-v1-es-http.apm-sandbox.svc:9200/_search/scroll?scroll=30000ms"
  ],
  "url.original.text": [
    "https://elasticsearch-v1-es-http.apm-sandbox.svc:9200/_search/scroll?scroll=30000ms"
  ],
  "_id": "8BaNopEBbRHRTkfJaBr-",
  "_index": ".ds-traces-apm-default-2024.08.24-000002",
  "_score": 15.300476
}

Steps to reproduce:

  1. Install standalone APM Server
  2. Enable instrumentation for APM Server
  3. Check APM Server transactions to Elasticsearch

Provide logs (if relevant):

up2neck added the bug label Aug 30, 2024
up2neck changed the title from "APM Agents configuration cache seems to be broken" to "APM Agents configuration cache refresh appears to be broken" Aug 30, 2024
@carsonip
Member

@up2neck thanks for reporting the issue. It is unexpected that the Elasticsearch scroll API returns HTTP 400. It would be helpful if you could attach the apm-server logs, especially the parts with "log.logger":"agentcfg". Please remove any sensitive information from the logs.

@up2neck
Contributor Author

up2neck commented Aug 30, 2024

@carsonip
I've spotted this issue only with APM self-instrumentation. Logs only contain info messages, like this:

    {
      "log.logger": "agentcfg",
      "ecs.version": "1.6.0",
      "@timestamp": "2024-08-29T12:23:45.347Z",
      "message": "Cache creation with expiration 30s.",
      "service.name": "apm-server",
      "log.level": "info",
      "log.origin": {
        "file.name": "agentcfg/cache.go",
        "function": "github.com/elastic/apm-server/internal/agentcfg.newCache",
        "file.line": 38
      }
    }

@carsonip
Member

carsonip commented Aug 30, 2024

I see that you've created a bugfix PR #13958, and it makes complete sense. Because of this bug, the scroll search payload is {}: the field is empty, so it is not emitted at all, and that causes ES to return 400. There is no need for apm-server logs to investigate.

Edit: the response is HTTP 400 because scroll_id is missing from both the query parameters and the body. The go-elasticsearch client sends the scroll ID as a query parameter, not in the body.
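
To illustrate the two points above, here is a minimal, self-contained Go sketch using only the standard library. It is not the apm-server or go-elasticsearch code: `scrollBody`, `marshalScrollBody`, and `scrollQuery` are hypothetical names, and the `omitempty` tag is an assumption about why an empty field would not be emitted. It shows that an empty scroll ID can yield a `{}` body and a query string carrying only `scroll=30000ms`, which matches the `url.original` value in the reported span.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// scrollBody is a hypothetical request body. The omitempty tag is an
// assumption made for this sketch; the real type may differ.
type scrollBody struct {
	ScrollID string `json:"scroll_id,omitempty"`
}

// marshalScrollBody returns the JSON body produced for the given scroll ID.
func marshalScrollBody(scrollID string) string {
	b, err := json.Marshal(scrollBody{ScrollID: scrollID})
	if err != nil {
		return ""
	}
	return string(b)
}

// scrollQuery builds the query string for POST _search/scroll, adding
// scroll_id only when it is non-empty (hypothetical helper).
func scrollQuery(scrollID, keepAlive string) string {
	q := url.Values{}
	q.Set("scroll", keepAlive)
	if scrollID != "" {
		q.Set("scroll_id", scrollID)
	}
	return q.Encode()
}

func main() {
	// Empty scroll ID: the field is dropped from the body entirely.
	fmt.Println(marshalScrollBody("")) // {}

	// Empty scroll ID: the query string has no scroll_id either.
	fmt.Println(scrollQuery("", "30000ms")) // scroll=30000ms

	// With a (made-up) scroll ID, the parameter is present.
	fmt.Println(scrollQuery("abc123", "30000ms")) // scroll=30000ms&scroll_id=abc123
}
```

With the scroll ID absent from both places, Elasticsearch has nothing to identify the scroll context, which is consistent with the 400 response described above.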
