[Filebeat] Azure Signin Module authentication_processing_details Issue #34330

Closed
mr1716 opened this issue Jan 20, 2023 · 18 comments · Fixed by #34478

Labels
Team:Cloud-Monitoring Label for the Cloud Monitoring team

Comments

@mr1716
Contributor

mr1716 commented Jan 20, 2023

The log is found in the official Elastic Repository at: https://github.com/elastic/beats/blob/main/x-pack/filebeat/module/azure/signinlogs/test/test-non-interactive-user-signin.log-expected.json

For the Azure Signin Module, the following field should have the periods in "Legacy TLS (TLS 1.0, 1.1, 3DES)" replaced with an underscore or another value. When the value is unflattened, Filebeat views the periods as subfields, which is not the intent.
"azure.signinlogs.properties.authentication_processing_details.Legacy TLS (TLS 1.0, 1.1, 3DES)": "False",

It should be something like:
"azure.signinlogs.properties.authentication_processing_details.Legacy TLS (TLS 1_0, 1_1, 3DES)": "False",
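The renaming suggested here could be sketched as follows. This is only a minimal Python illustration of the idea, not code from the Filebeat module; the function name `sanitize_key` and its `replacement` parameter are hypothetical.

```python
def sanitize_key(key: str, replacement: str = "_") -> str:
    """Replace dots in a field name so it cannot be read as a nested path."""
    return key.replace(".", replacement)


key = "Legacy TLS (TLS 1.0, 1.1, 3DES)"
print(sanitize_key(key))  # Legacy TLS (TLS 1_0, 1_1, 3DES)
```

The trade-off of this approach is that the stored key no longer matches the key Azure sends.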

@botelastic bot added the needs_team label on Jan 20, 2023
@mr1716 changed the title from "Filebeat: Azure Signin Module authentication_processing_details Issue" to "[Filebeat] Azure Signin Module authentication_processing_details Issue" on Jan 20, 2023
@tetianakravchenko added the Team:Cloud-Monitoring label on Jan 24, 2023
@botelastic bot removed the needs_team label on Jan 24, 2023
@tetianakravchenko
Contributor

@elastic/obs-cloud-monitoring fyi

@zmoog self-assigned this on Jan 24, 2023
@UcanInfosec

@zmoog curious to see what is found and whether changes are required. This only happens when there’s a period in the key of that last field.

@zmoog
Contributor

zmoog commented Jan 26, 2023

Let me run some tests to understand how these kinds of logs are ingested and indexed.

@zmoog
Contributor

zmoog commented Jan 26, 2023

I tried to ingest the test document test-non-interactive-user-signin.log that @mr1716 mentioned, using Filebeat 8.6.1.

With an input document containing this snippet:

{
  "authenticationProcessingDetails": [
    {
      "key": "Legacy TLS (TLS 1.0, 1.1, 3DES)",
      "value": "False"
    },
    {
      "key": "Oauth Scope Info",
      "value": "[User.Read,Userinfo.ReadWrite]"
    },
    {
      "key": "Is CAE Token",
      "value": "False"
    }
  ]
}

The end result is the following:

{
  "authentication_processing_details": {
    "Legacy TLS (TLS 1": {
      "0, 1": {
        "1, 3DES)": "False"
      }
    },
    "Oauth Scope Info": "[User.Read,Userinfo.ReadWrite]",
    "Is CAE Token": "False"
  }
}

There is room for improvement.
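The nesting above can be reproduced with a small Python model. This is a sketch of the assumed behaviour (each dot in a field name is treated as a nesting level), not the Elasticsearch implementation; `set_field` and `expand_details` are hypothetical helper names.

```python
def set_field(doc: dict, path: str, value) -> None:
    """Set a value, treating every dot in `path` as a nesting level."""
    *parents, leaf = path.split(".")
    node = doc
    for part in parents:
        # Create intermediate objects on demand.
        node = node.setdefault(part, {})
    node[leaf] = value


def expand_details(items: list) -> dict:
    """Model of looping over the key/value list and setting each key as a path."""
    details = {}
    for item in items:
        set_field(details, item["key"], item["value"])
    return details


items = [
    {"key": "Legacy TLS (TLS 1.0, 1.1, 3DES)", "value": "False"},
    {"key": "Oauth Scope Info", "value": "[User.Read,Userinfo.ReadWrite]"},
    {"key": "Is CAE Token", "value": "False"},
]
print(expand_details(items))
```

Keys without dots pass through unchanged, which is why only the "Legacy TLS" entry is affected.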

@zmoog
Contributor

zmoog commented Jan 26, 2023

TIL the set processor can turn a field {"a.b": true} into {"a": {"b": true}}.

I am trying to replace the set processor in the foreach with a script processor.

Starting from:

{
  "authenticationProcessingDetails": [
    {
      "key": "Legacy TLS (TLS 1.0, 1.1, 3DES)",
      "value": "False"
    },
    {
      "key": "Oauth Scope Info",
      "value": "[User.Read,Userinfo.ReadWrite]"
    },
    {
      "key": "Is CAE Token",
      "value": "False"
    }
  ]
}

If I use this script in place of the foreach processor:

{
  "script": {
    "lang": "painless",
    "source": """
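      // Collect the key/value pairs into a map, using each key verbatim
      // (assigning a map entry does not expand dots into sub-fields).
      def tmp = [:];
      for (item in ctx.azure.signinlogs.properties.authentication_processing_details) {
        tmp[item.key] = item.value;
      }
      // Replace the original array with the resulting object.
      ctx.azure.signinlogs.properties.authentication_processing_details = tmp;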
    """
  }
}

I get the following result:

{
  "authentication_processing_details": {
    "Oauth Scope Info": "[User.Read,Userinfo.ReadWrite]",
    "Legacy TLS (TLS 1.0, 1.1, 3DES)": "False",
    "Is CAE Token": "False"
  }
}
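For readers who do not know Painless, the same transformation can be sketched in Python. This is an illustrative equivalent, not code that runs in the pipeline; the function name `details_to_object` is hypothetical.

```python
def details_to_object(items: list) -> dict:
    """Turn a list of {"key", "value"} pairs into one object, keeping keys verbatim."""
    # Each key is used as a plain dictionary key, so dots are never
    # interpreted as a path into nested objects.
    return {item["key"]: item["value"] for item in items}


items = [
    {"key": "Legacy TLS (TLS 1.0, 1.1, 3DES)", "value": "False"},
    {"key": "Oauth Scope Info", "value": "[User.Read,Userinfo.ReadWrite]"},
    {"key": "Is CAE Token", "value": "False"},
]
print(details_to_object(items))
```

A Python dict keeps insertion order, so the key order may differ from the Painless result shown above; the content is the same.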

@UcanInfosec

@zmoog any way to do that in JavaScript?

@zmoog
Contributor

zmoog commented Jan 27, 2023

@UcanInfosec, according to the list of available scripting languages, JavaScript is unfortunately not one of them.

The most interesting and widely used option is the Painless scripting language. It is a good choice for writing expressions or small snippets of code that transform data.

@zmoog
Contributor

zmoog commented Jan 27, 2023

@mr1716 @UcanInfosec does the new document structure work for your use cases?

{
  "authentication_processing_details": {
    "Oauth Scope Info": "[User.Read,Userinfo.ReadWrite]",
    "Legacy TLS (TLS 1.0, 1.1, 3DES)": "False",
    "Is CAE Token": "False"
  }
}

I am creating a quick PR to gather feedback from other team members.

@mr1716
Contributor Author

mr1716 commented Jan 27, 2023

@zmoog I think @UcanInfosec's question was how we would do the same thing natively in Filebeat rather than in Elastic Painless. The new structure works well. I'm just curious how to do what you did in native Filebeat.

@zmoog
Contributor

zmoog commented Jan 27, 2023

@mr1716, what do you mean by 'native Filebeat'?

Here's how Filebeat collects the logs from Azure and publishes them to Elasticsearch.

┌───────────────────────────────────────┐     ┌─────────────────────────┐      ┌────────────────────────┐
│                                       │     │                         │      │                        │
│                                       │     │                         │      │                        │
│                                       │     │                         │      │                        │
│  ┌─────────────┐   ┌────────────────┐ │     │   ┌─────────────────┐   │      │  ┌───────────────────┐ │
│  │ diagnostic  │   │  appservices   │ │     │   │ azure-eventhub  │   │      │  │     eventhub      │ │
│  │   setting   │──▶│ <<event hub>>  │─┼amqp─┼──▶│    <<input>>    │───┼─http─┼─▶│  <<data stream>>  │ │
│  └─────────────┘   └────────────────┘ │     │   └─────────────────┘   │      │  └───────────────────┘ │
│                                       │     │                         │      │                        │
│                                       │     │                         │      │                        │
│                                       │     │                         │      │                        │
│                                       │     │                         │      │                        │
└─Azure─────────────────────────────────┘     └───Filebeat──────────────┘      └──Elasticsearch─────────┘

The azure-eventhub input connects to the Azure event hub and fetches the logs. The input sends the logs to the data stream, where an ingest pipeline processes them before indexing.

The azure-eventhub input does not process the logs; it is only the Filebeat adapter for accessing the event hub. All processing and data transformation, from the Azure source format to the Elasticsearch document, happens in the ingest pipeline.

@UcanInfosec

@zmoog how would the painless script be converted into a filebeat script processor?

@UcanInfosec

Because it’s fine when there are spaces but not periods originally

@zmoog
Contributor

zmoog commented Jan 31, 2023

> @zmoog how would the painless script be converted into a filebeat script processor?

For existing Filebeat modules and integrations, the processors are defined as YAML files and created in Elasticsearch during installation.

For example, here is the source for the sign-in logs ingest pipelines:

If you want to add a script processor to a new or existing pipeline, you can do it in Kibana or Dev Tools.

For example, in Kibana, you can:

  • visit Stack Management > Ingest Pipelines
  • pick an existing pipeline or click on the 'Create pipeline' button

If you want to get started with the Painless scripting language, the Painless scripting language documentation is a good starting point.

@zmoog
Contributor

zmoog commented Jan 31, 2023

> Because it’s fine when there are spaces but not periods originally

The set processor expands the dots in field names into subfields.

For example, given the following pipeline in the Dev Tools:

PUT _ingest/pipeline/zmoog-test
{
  "processors": [
    {
      "set": {
        "field": "a.b.c",
        "value": true
      }
    }
  ]
}

If we simulate the pipeline execution using the empty test object {}:

POST _ingest/pipeline/zmoog-test/_simulate
{
  "docs": [
    {
      "_source": {}
    }
  ]
}

I get the following result:

{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_id": "_id",
        "_version": "-3",
        "_source": {
          "a": {
            "b": {
              "c": true
            }
          }
        },
        "_ingest": {
          "timestamp": "2023-01-31T21:20:48.352056732Z"
        }
      }
    }
  ]
}

The set processor turns {"a.b.c": true} into this:

{
  "a": {
    "b": {
      "c": true
    }
  }
}

We proposed the fix https://github.com/elastic/integrations/pull/5129/files#diff-274e12d0961404cd66da857b49259663133debe54e6cc0b9e0832114450785c8 that replaces the foreach + set processors with a single script processor running a short Painless script, which avoids the expansion of the dots.

If it works for you, we'll port the fix from the integrations repo to beats.

@UcanInfosec
Copy link

As long as it fixes this, let’s do it. This needs to get fixed

@zmoog
Contributor

zmoog commented Feb 3, 2023

Great. I am finalizing the two PRs for both Beats and Elastic Agent integration.

@zmoog
Contributor

zmoog commented Feb 3, 2023

Recap:

  • Expanding the dots in the field name into sub-fields of authentication_processing_details is a side effect of the set processor.
  • To avoid this side-effect, we are replacing the foreach + set processors combo with a script processor.

Fixes implemented in:

@zmoog
Contributor

zmoog commented Feb 3, 2023

We are targeting the following Beats releases for this fix:

  • 8.7
  • 8.6.2
  • 7.17.10

And in the Azure Logs integration 1.5.7.
