Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: event factory support #13017

Merged
merged 17 commits into from
Jun 28, 2021
Merged

Feat: event factory support #13017

merged 17 commits into from
Jun 28, 2021

Conversation

kares
Copy link
Contributor

@kares kares commented Jun 22, 2021

Introduce a new_event (factory) interface for the Ruby plugin API.

The scope of this PR is to introduce the EventFactory interface with a single method EventFactory#new_event and two implementations (perhaps BasicEventFactory and TargetedEventFactory), and to effectively add three methods to the API of LogStash::Plugin:

  • Plugin#event_factory: returns a memoized EventFactory instance, that can be used to create events
  • Plugin#targeted_event_factory: returns a memoized EventFactory that respects the plugin's target configuration.
    • if no target is provided, this effectively is equivalent to Plugin#event_factory.
    • if a target is provided, returns a TargetedEventFactory that wraps the return value of Plugin#event_factory.
  • Plugin#create_event_factory: returns a new EventFactory based on the plugin's configuration and state. We may want to mark this as an "internal" or "advanced" API.

NOTES (from @yaauie)

  • We should document these methods as effectively-final, warning plugin developers from overriding them. There are a lot of possible sharp edges here, and any overriding will create a tight coupling that eventually prevents us from moving forward.
  • The Plugin#create_event_factory can probably be a private part of the API. I see no reason why a plugin would need to provide its own implementation, and doing so would prevent us from changing the internals in logstash core to support configuration-driven alternate event factories (such as factories that produce copy-on-write events, alternate field-reference-syntax events, etc.).
  • we should memoize to underscore-prefixed ivars in a concurrency-safe manner and document the ivar as an implementation detail that should not be relied on (such as @_event_factory and @_targeted_event_factory).

Release notes

[rn:skip]

What does this PR do?

This PR is motivated by having a way of re-using LogStash::Event.from_json with target => [namespace] (e.g. in the JSON codec). We benchmarked the generic LogStash::Json parsing (which has JRuby parsing hooks) and found it noticeably slower compared to the Java native Event.fromJson.

A clean way forward would be to introduce a notion of an EventFactory with a newEvent (new_event) interface.
The factory is meant to be later re-used/extended with ECS compatibility (new_event behaving slightly differently in legacy vs ecs mode), this is out-of-scope here, still something to keep in mind.

Why is it important/What is the impact to the user?

It impacts plugin development which gets simpler as we'll provide a recommended way to create new events.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

Use cases

Sample use-case (using the adapter + from_json helper) in the json codec.

meta-issue: #13021

Copy link
Member

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few notes from my review of your draft, in line.

Before this gets merged, I'd also like to see a draft of our EventFactorySupport adapter in-flight to ensure that we can expose an approximation of this functionality to updated plugins running on older Logstashes.

A possible sharp edge I can see is that if for example our JSON codec provides a block "event factory" to an older Logstash that doesn't acknowledge the block, will the block simply be ignored? Most of our support adapters target the Plugin API (and can thus be scoped to the plugin that includes them), but this requires a change to the Event API. Perhaps we can solve this by providing version-based fallback behaviour in the JSON codec. Or by using ruby refinements on Event?


module LogStash
module Util
module ThreadSafeAttributes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 clean reusable implementation, really cleans up the details of using them.

* A factory for events.
*/
@FunctionalInterface
public interface EventFactory {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new interface very nearly is implemented by the co.elastic.logstash.api.EventFactory that is available to java-based plugins through co.elastic.logstash.api.Context which is provided to the constructor. Can we converge without breaking consumers of co.elastic.logstash.api.EventFactory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yaauie 💯 thanks I completely missed the existing EventFactory - for sure, no reason to re-invent the wheel!
the only down-side is org.logstash.Event.fromJson assumes the concrete implementation being returned, thus we'll assume and cast the returned events.

@kares
Copy link
Contributor Author

kares commented Jun 23, 2021

Before this gets merged, I'd also like to see a draft of our EventFactorySupport adapter in-flight to ensure that we can expose an approximation of this functionality to updated plugins running on older Logstashes.

here's the adapter falling back on LS that does not defined? LogStash::Plugins::EventFactorySupport

A possible sharp edge I can see is that if for example our JSON codec provides a block "event factory" to an older Logstash that doesn't acknowledge the block, will the block simply be ignored?

correct, the block is passed to the method implicitly for an old LS (and is never yielded to) :

>> LOGSTASH_VERSION
=> "7.13.0"
>> LogStash::Event.from_json('{ "a": "b" }') { |data| raise "unexpected" }
=> [#<LogStash::Event:0x34fb5c3f>]

... alt ought should not matter since we'll never expect to do a block form on LS where the feature is not supported

Most of our support adapters target the Plugin API (and can thus be scoped to the plugin that includes them), but this requires a change to the Event API. Perhaps we can solve this by providing version-based fallback behaviour in the JSON codec. Or by using ruby refinements on Event?

the approach would be we wrap from_json {} block form and provide a new method to be consumed by the codecs that are going to need Event.from_json(json) { |data| event_factory.new_event(data) } this way we pretty much only have a single place where the block form is known (thus the idea to not make it official API yet but I know that it gets semi-official anyway 😜) - details are part of the event_support mixin PR to provide an opt-in FromJsonHelper module.

@kares kares mentioned this pull request Jun 23, 2021
4 tasks
Copy link
Member

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fantastic, especially along-side the support adapter.

Do we want the block form to be the only way to use the event factory with ruby's Event#from_json?

  • Would it make sense to also support Event#from_json(payload, event_factory)?
  • Or can we count on the & to-proc magic of Event#from_json(payload, &event_factory) working?

I have left a few comments, but have no major issues remaining and am confident that we can get these through in short order.

return new org.logstash.Event(data);
}
};
return BasicEventFactory.INSTANCE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see why adding an EventFactory getTargetedEventFactory() for the Java API is out of scope here -- the context doesn't have the plugin's config, and we don't have standardized ways of defining plugin options.

A native plugin could also wrap the EventFactory that they retrieve from here, so it isn't that big of a deal anyway.

But we should file a follow-up task to make sure that we aren't leaving the Java API behind.

@kares
Copy link
Contributor Author

kares commented Jun 23, 2021

Do we want the block form to be the only way to use the event factory with ruby's Event#from_json?

My thinking was let's do the block form as it's more flexible + avoids (more) confusion about the dual nature of things - Java Event vs Ruby Event, Java EventFactory vs Ruby factories.

  • Would it make sense to also support Event#from_json(payload, event_factory)?

It would if the manual RubyEvent wrapped for the org.logstash.Event wasn't "invented" and we relied on JRuby's Java integration. Which is something we can attempt in the future ...

For now, since the runtime does not know RubyEvent is just a wrapper for the Java Event, we would still need to play tricks to wrap the event_factory to be able to pass it as a Java EventFactory (doing RubyEvent unwrapping and re-wrapping like currently in the code with the block-form).

  • Or can we count on the & to-proc magic of Event#from_json(payload, &event_factory) working?

As explained above we unfortunately can not atm, if there was only a Java EventFactory and RubyEvent = org.logstash.Event than Event#from_json would auto-magically work with a block as an implementation for the functional interface.

If we ever bridge the gap between 2 types of events than the current block form doing { event_factory.new_event } would continue working as a Java EventFactory implementation and the next step from there would be to close the gap with dropping the Ruby factories in favor of the Java ones. This would be part of the reasoning to prefer a block instead of the second event_factory argument.


# @api private
# @since LS 7.14
def create_event_factory
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we drop this method for now? to align with the support mixin

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's private, so not necessarily something we need to align on. It is also a good hook point for future efforts in LS core that don't need backport support (things like alternate Event implementation experiments like copy-on-write or field reference parser changes), so I expect even if we remove it now we will add it back in short order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, sticking around just wanted to double check given what private means in Ruby land 😉

@kares kares marked this pull request as ready for review June 24, 2021 10:39
Copy link
Member

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉

@kares kares merged commit 68c7534 into elastic:master Jun 28, 2021
kares added a commit to kares/logstash that referenced this pull request Jun 28, 2021
Introduce a `new_event` (factory) interface for the Ruby plugin API.

Co-authored-by: Ry Biesemeyer <[email protected]>

+ Refactor: keep (useful) causes mapping to Ruby errors

+ Refactor: avoid trim-ing (large) json strings

+ Feat: a synchronize(object) {} helper for Ruby

+ Feat: a (thread-safe) lazy_init_attr {} helper
kares added a commit that referenced this pull request Jun 28, 2021
Introduce a `new_event` (factory) interface for the Ruby plugin API.

Co-authored-by: Ry Biesemeyer <[email protected]>

+ Refactor: keep (useful) causes mapping to Ruby errors

+ Refactor: avoid trim-ing (large) json strings

+ Feat: a synchronize(object) {} helper for Ruby

+ Feat: a (thread-safe) lazy_init_attr {} helper
kares added a commit to logstash-plugins/logstash-mixin-event_support that referenced this pull request Jun 29, 2021
An adapter for `event_factory` support (available in LS since 7.14 elastic/logstash#13017).

Also provides a compatibility layer for `Event.from_json(json) { |data| ... }` by having a `events_from_json` helper.
To be used by plugins such as json codec to implement `target => [here]` support.

Co-authored-by: Ry Biesemeyer <[email protected]>
kares added a commit to kares/logstash that referenced this pull request Jul 1, 2021
* master: (41 commits)
  Test: resolve integration failure due ECS mode (elastic#13044)
  Feat: event factory support (elastic#13017)
  Doc: Add geoip database API to node stats (elastic#13019)
  Add geoip database metrics to /node/stats API (elastic#13004)
  ecs: on-by-default plus docs (elastic#12830)
  ispec: fix cross-spec leak from fatal error integration specs (elastic#13002)
  Fix UBI source URL (elastic#13008)
  update fpm to allow pkg creation on jdk11+jruby 9.2 (elastic#13005)
  Add unit test to grant that production aliases correspond to a published RubyGem (elastic#12993)
  Fix logstash.bat not setting exit code (elastic#12948)
  Use the OS separator to invoke gradlew from Rake script (elastic#13000)
  Allow per-pipeline config of ECS Compatibility mode via Central Management (elastic#12861)
  Update jinja2 dependency in docker build (elastic#12994)
  fix database manager with multiple pipelines (elastic#12862)
  Fix Reflections stack traces when process yml files in classpath and debug is enabled (elastic#12991)
  Fix/log4j routing to avoid create spurious file (elastic#12965)
  Deps: update JRuby to 9.2.19.0 (elastic#12989)
  Doc: Add tip for checking for existing field (elastic#12899)
  Added test to cover the installation of aliased plugins (elastic#12967)
  CI: Update logstash_release.json after 7.3.12 (elastic#12986)
  ...
@yaauie yaauie mentioned this pull request Jul 22, 2021
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants