Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when combining processors autodiscover hint with modules using processors #16190

Closed
exekias opened this issue Feb 7, 2020 · 11 comments · Fixed by #16450
Closed

Error when combining processors autodiscover hint with modules using processors #16190

exekias opened this issue Feb 7, 2020 · 11 comments · Fixed by #16450
Assignees
Labels
autodiscovery bug candidate Candidate to be added to the current iteration Filebeat Filebeat libbeat Team:Platforms Label for the Integrations - Platforms team

Comments

@exekias
Copy link
Contributor

exekias commented Feb 7, 2020

It seems autodiscover processors hint produces an error when used with modules that also make use of processors (ie elasticsearch module).

Steps to reproduce it:

Configure Filebeat with hints based autodiscover like this:

filebeat.autodiscover:
  providers:
    - type: docker
      hints.enabled: true

output.console:
  pretty: true

Launch a container with the following labels:

docker run -l 'co.elastic.logs/processors.add_fields={\"foo\":\"bar\"}' -l co.elastic.logs/module=elasticsearch busybox echo hell

This error happens:

Error creating input: each processor must have exactly one action, but found 2 actions (add_locale,add_fields)

The expected behavior is to get the module launched, with both processors working for it.

@blakerouse
Copy link
Contributor

Found the issue is in go-ucfg in how it handles merging configurations. elastic/go-ucfg#150

@blakerouse
Copy link
Contributor

Actually looking at it, its not an issue in go-ucfg. Its an issue of the code path relying on the default behavior in go-ucfg when really ucfg.AppendValues should be used.

@blakerouse
Copy link
Contributor

Actually changing to just ucfg.AppendValues in the location where the merge is occurring has a side-effect of not being able to select the processor field explicitly for this behavior. Without being able to select the processor field explicitly means that the change could affect other fields where the default or merge is the expected behavior.

Autodiscover is doing just blind merging of configuration without using a structure to provide context on how the merging should be performed. Example the following structure would fix the bug (only showing the case for processor field).

struct {
    Processors `config:",append"`
}

Looking into how hard it would be to change Autodiscover to use structs in cases where the merging behavior needs to be adjusted.

@exekias
Copy link
Contributor Author

exekias commented Feb 13, 2020

I guess this could be a mix of well known fields where we decide how merge happens and a default behavior for the rest?

@urso
Copy link

urso commented Feb 13, 2020

I've never tried if this works, but I think we might be able to provide some generic structs allowing us to control merging rules like. In case it doesn't work I'd say this is a bug in go-ucfg. For example we could create definitions like this:

type Merger interface {
  Apply(*common.Config) error
}

// mergeAppend ensures that array elements are appended to existing arrays
type addAppend struct {
  fields common.MapStr `config:",inline,append"`
}

// mergeReplace replaces dictionaries and arrays completely
type addReplace struct {
  fields common.MapStr `config:",inline,replace"`
}

// addMerge merges dictionaries and arrays (default behavior of go-ucfg)
type addMerge struct {
  fields common.MapStr `config:",inline,merge"`
}

func AppendFields(fields common.MapStr) Merger { ... }
func ReplaceFields(fields common.MapStr) Merger { ... }
func MergeFields(fields common.MapStr) Merger { ... }
func All(mergers ...Merger) Merger { ... }

func (a *addAppend) Apply(cfg *common.Config) error { ... }
func (a *addReplace) Apply(cfg *common.Config) error { ... }
func (a *addMerge) Apply(cfg *common.Config) error { ... }

At least for hints based auto-discovery we can setup one 'merger' per hint and then apply one after another. This allows us to define the merge behavior per hint. E.g. for processors we might prefer append, but for hosts or paths we would prefer Replace.

@blakerouse
Copy link
Contributor

blakerouse commented Feb 13, 2020

Yeah the more I dig into this, more things are wrong. Example is paths should really be set to config:",replace". Because the below generated config shows that its wrong keeping the *_deprecation.json in paths, when it really should only be the container path present.

mergedCfg: map[exclude_files:[.gz$ _slowlog.log$ _access.log$] multiline:map[match:after negate:true pattern:^(\[[0-9]{4}-[0-9]{2}-[0-9]{2}|{)] path:map[config:/Users/blake/Code/elastic/beats/src/github.com/elastic/beats/filebeat data:/Users/blake/Code/elastic/beats/src/github.com/elastic/beats/filebeat/data home:/Users/blake/Code/elastic/beats/src/github.com/elastic/beats/filebeat logs:/Users/blake/Code/elastic/beats/src/github.com/elastic/beats/filebeat/logs] paths:[/var/lib/docker/containers/02e657656ea4fccc540d5ca0dca4a4f926165aec6c5548b473a9c10348e3b115/*-json.log /usr/local/var/lib/elasticsearch/*_deprecation.json] processors:[map[add_fields:{\"foo\":\"bar\"} add_locale:map[when:map[not:map[regexp:map[message:^{]]]]]] stream:all type:container]

@blakerouse
Copy link
Contributor

blakerouse commented Feb 13, 2020

@urso Whats your thoughts on extending go-ucfg to adjust config handling options per field in Config.Merge?

Example:

cfg.Merge(other, ucfg.FieldAppendValues("processors"), ucfg.FieldReplaceValues("paths"))

Could be extended even to do nested fields:

cfg.Merge(other, ufg.FieldAppendValues("nested.processors"))

Another option is just to do this in beats with something like:

type filesetMergeProfile struct {
    Paths `config:",replace"`
    Processors `config:",append"`
}

var profile filesetMergeProfile
cfg.Unpack(&profile)
other.Unpack(&profile)
cfg.Merge(other)
cfg.Merge(profile, ucfg.ReplaceValues)

@blakerouse
Copy link
Contributor

Tried the second option as I thought it was the simplest but it does not work because the final cfg.Merge(profile, ucfg.ReplaceValues) will replace the entire config from the one in profile.

Probably the first approach is more appropriate as its not hacking the usage of Unpack to perform the merge.

@urso
Copy link

urso commented Feb 14, 2020

Option 1 would be feasible to do in go-ucfg. Yet we would introduce more magic strings in Beats, hoping that we did catch every single use-case. I'd prefer if we would not have config naming all over the place, but this might reqiure us to introduce a schema long term.

Instead of:

cfg.Merge(other, ucfg.FieldAppendValues("processors"), ucfg.FieldReplaceValues("paths"))

it could be:

cfg.Merge(other, ucfg.WithFieldMergeType(func (field string) ucfg.MergeType {
  ...
})

Not sure we want/need full path matching. When merging we might not always know the 'level' of the setting. E.g. if an array of configs are given, then the path setting would becomes 0.path and 1.path. Supporting this use-case cfg.Merge(other, ufg.FieldAppendValues("nested.processors")), we might want to have some kind of glob-pattern support, so we can write cfg.Merge(other, ufg.FieldAppendValues("**.processors")).

@blakerouse
Copy link
Contributor

blakerouse commented Feb 14, 2020

The idea of ucfg.WithFieldMergeType is the func would be called for every field? Then the function could make a decision on how to merge? That might give lots of flexibility, but I think the function would need to know what the depth of that field is as well, so it would need to probably be the full path (eg. processors.0.add_fields).

The array of configs case would be handled by *.path.

The idea for glob pattern is good, I was only thinking that * was to represent arrays. Based on current code that I have implemented the follow is what is required for merging of paths and processors in the fileset module.

cfg.Merge(
    other,
    ucfg.FieldReplaceValues("paths"), ucfg.FieldDefaultValues("paths.*"),
    ucfg.FieldAppendValues("processors"), ucfg.FieldDefaultValues("processors.*"))

@blakerouse
Copy link
Contributor

blakerouse commented Feb 14, 2020

Better yet if we go with the ucfg.WithFieldMergeType path it could be a structure that we pass to the function to give all the context it might need.

type FieldInfoType int

const (
    FieldInfoTypePrimitive FieldInfoType = iota
    FieldInfoTypeMap
    FieldInfoTypeArray
)

type FieldInfo struct {
    Name    string
    Type      FieldInfoType
    Depth    int
    Parents []FieldInfo
}

I feel like I am still leaning towards magic strings as all the logic could really be distilled down into them simply. We could add the function way as another possible path, if we need even more complexity (but that is not know).

Questionable if we even need the FieldInfoTypePrimitive because you cannot really change the merge behavior of a primitive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
autodiscovery bug candidate Candidate to be added to the current iteration Filebeat Filebeat libbeat Team:Platforms Label for the Integrations - Platforms team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants