-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: event factory support #12928
Feat: event factory support #12928
Conversation
@@ -234,21 +237,30 @@ public IRubyObject ruby_to_json(ThreadContext context, IRubyObject[] args) | |||
try { | |||
return RubyString.newString(context.runtime, event.toJson()); | |||
} catch (Exception e) { | |||
throw RaiseException.from(context.runtime, RubyUtil.GENERATOR_ERROR, e.getMessage()); | |||
throw toRubyError(context, RubyUtil.GENERATOR_ERROR, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added these as I wanted to see (have) the Java causes
} | ||
} | ||
|
||
// @param value [String] the json string. A json object/map will newFromRubyArray to an array containing a single Event. | ||
// and a json array will newFromRubyArray each element into individual Event | ||
// @return Array<Event> array of events | ||
@JRubyMethod(name = "from_json", required = 1, meta = true) | ||
public static IRubyObject ruby_from_json(ThreadContext context, IRubyObject recv, RubyString value) | ||
{ | ||
public static IRubyObject ruby_from_json(ThreadContext context, IRubyObject recv, RubyString value, final Block block) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slightly different that the Java fromJson(final String json, final EventFactory factory)
signature.
the reason for that is the Ruby vs Java event duality, this slightly resembles the Java Event API (the EventFactory being a functional interface could be implemented as a block) while doing the Java wrapping/unwrapping seamlessly (as we start with EventFactory
like implementations in Ruby).
HINT: was wondering for a while now why LogStash::Event
isn't just a org.logstash.Event
to reduce the difference between the Java and Ruby API (something to consider for the 8.0 future) - if this was considered while designing the Java event API (conceptually - the naming differences can be handled on the Ruby end by extending the Java class)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a lot of back-and-forth on this, but IIRC we explicitly avoided having any jruby-related infection of the Java Event API so that consumers of the API would have a much smaller surface area. The Ruby APIs are also really hard to break confidently because we don't have good data on their usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usage of the Block and FunctionalInterface for the EventFactory is clever, and in a really nice way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still consider the block-form to be a "hack" around needing to manually wrap returns from a Java EventFactory
.
was trying to think about a potential future here: LogStash::Event = org.logstash.Event
(whether it happens or not) in which case the block from would become (so we should consider it "internal").
The Ruby APIs are also really hard to break confidently because we don't have good data on their usage.
I didn't mean breaking the Ruby API, Java API would have stayed as is and the extra Ruby behavior would be scripted on top org.logstash.Event.class_eval { def get(arg); getField(arg); end }
etc.
From a user's perspective the duality is confusing.
end | ||
|
||
# @return an event factory | ||
def build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
being a builder a named this build
- also wanted the end result to set the new @event_factory
for the plugin.
maybe it's nicer being explicit: self.event_factory = event_factory_builder.with_target(target).build
or having a event_factory_builder.with_target(target).set_event_factory!
instead of build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels subtly infectious and likely to be a source of surprises. I'm not sure we have found the right boundary.
A plugin developer should either be able to know that they are working with a targeted event factory at its call-site, or they should not need to care about it at all. As implemented, we end up in a weird hybrid where the behaviour of event_factory
is dependent on code paths that aren't immediately discoverable.
I think there are two directions we could go with this.
-
Absorb
with_target
intoEventFactory
, making its use explicit all the way throughCan we reasonably make
with_target
a part of theEventFactory
API?Something like the following would give a lot of flexibility in use while not being secretly magical.
class EventFactory ## base "abstract" class to be subclassed? def with_target(target_field_reference) return self if target_field_reference.nil? TargetedEventFactory.new(self, target_field_reference) end end
def register @targeted_event_factory = event_factory.with_target(@target) end def decode(data) events = LogStash::Event.fromJson(data, @targeted_event_factory) end
-
Make
EventFactorySupport#event_factory
always automatically respect a provided@target
. This feels even more magical and not quite fully thought through, but perhaps closer to a useful and maintainable boundary.module EventFactorySupport def event_factory @event_factory ||= begin # TODO: threadsafety? ef = default_event_factory ef = TargetedEventFactory.new(ef, @target) unless @target.nil? ef end end end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the way I was trying to approach this is as if we decided to later rely only on a Java EventFactory
API,
- having
EventFactory#withTarget
on the interface feels unfortunate, since the interface would be tight to a "targeted" factory implementation - a
EventFactoryBuilder
of some kind might be necessary (later) if we decide on the explicit approach - instead of the@event_factory
magically resolving itself based ontarget
andecs_compatibility
fields. the explicitness should give us more control for whatever we decide to do later (at a cost of having slightly more code in the plugins)
My current candidate(s) would be to break the builder from magically setting the event_factory
:
def register
self.event_factory = event_factory_builder.with_target(target).event_factory
# or being very explicit:
self.event_factory = EventFactoryBuilder.new(default_event_factory).with_target(target).event_factory
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Early review:
Certainly on track. Modifying the Event#from_json and friends to optionally take an event factory is a really nice bridge to make the transition simple.
I have concerns with the ruby implementation's EventFactory builder, and how its use infects the plugin. I think we will need to workshop a couple of iterations on this, as my suggestions also aren't fully thought through. Since this is an API we will likely need to support for years to come, getting the boundary right will give us the freedom to swap out internals later if we find a need to do so.
I've left some comments in-line and am glad to pair with you on this if you want to drop an hour on my calendar.
end | ||
|
||
# @return an event factory | ||
def build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels subtly infectious and likely to be a source of surprises. I'm not sure we have found the right boundary.
A plugin developer should either be able to know that they are working with a targeted event factory at its call-site, or they should not need to care about it at all. As implemented, we end up in a weird hybrid where the behaviour of event_factory
is dependent on code paths that aren't immediately discoverable.
I think there are two directions we could go with this.
-
Absorb
with_target
intoEventFactory
, making its use explicit all the way throughCan we reasonably make
with_target
a part of theEventFactory
API?Something like the following would give a lot of flexibility in use while not being secretly magical.
class EventFactory ## base "abstract" class to be subclassed? def with_target(target_field_reference) return self if target_field_reference.nil? TargetedEventFactory.new(self, target_field_reference) end end
def register @targeted_event_factory = event_factory.with_target(@target) end def decode(data) events = LogStash::Event.fromJson(data, @targeted_event_factory) end
-
Make
EventFactorySupport#event_factory
always automatically respect a provided@target
. This feels even more magical and not quite fully thought through, but perhaps closer to a useful and maintainable boundary.module EventFactorySupport def event_factory @event_factory ||= begin # TODO: threadsafety? ef = default_event_factory ef = TargetedEventFactory.new(ef, @target) unless @target.nil? ef end end end
* @throws IOException when (JSON) parsing fails | ||
*/ | ||
public static Event[] fromJson(final String json) throws IOException { | ||
return fromJson(json, EventFactory.DEFAULT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 API continuity
} | ||
} | ||
|
||
// @param value [String] the json string. A json object/map will newFromRubyArray to an array containing a single Event. | ||
// and a json array will newFromRubyArray each element into individual Event | ||
// @return Array<Event> array of events | ||
@JRubyMethod(name = "from_json", required = 1, meta = true) | ||
public static IRubyObject ruby_from_json(ThreadContext context, IRubyObject recv, RubyString value) | ||
{ | ||
public static IRubyObject ruby_from_json(ThreadContext context, IRubyObject recv, RubyString value, final Block block) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a lot of back-and-forth on this, but IIRC we explicitly avoided having any jruby-related infection of the Java Event API so that consumers of the API would have a much smaller surface area. The Ruby APIs are also really hard to break confidently because we don't have good data on their usage.
expect{LogStash::Event.from_json(nil)}.to raise_error | ||
expect{LogStash::Event.new(LogStash::Json.load(nil))}.to raise_error | ||
end | ||
expect{LogStash::Event.from_json(nil)}.to raise_error # TypeError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of just commenting the error type, let's assert it. If we don't like what is being raised in practice, we should take this as an opportunity to refine the implementation. BWC-wise, we only promise to raise here so raising something more specific and specifying that here is not a breaking change.
} | ||
} | ||
|
||
// @param value [String] the json string. A json object/map will newFromRubyArray to an array containing a single Event. | ||
// and a json array will newFromRubyArray each element into individual Event | ||
// @return Array<Event> array of events | ||
@JRubyMethod(name = "from_json", required = 1, meta = true) | ||
public static IRubyObject ruby_from_json(ThreadContext context, IRubyObject recv, RubyString value) | ||
{ | ||
public static IRubyObject ruby_from_json(ThreadContext context, IRubyObject recv, RubyString value, final Block block) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usage of the Block and FunctionalInterface for the EventFactory is clever, and in a really nice way.
We had a call today, and I'd like to summarize what we discussed and the decisions that came out of it. We decided to remove the idea of an We decided that the scope of this PR is to introduce the
Upon writing this out, I think that:
|
to 'maintain' Java API level compatibility
d6afb30
to
81d6b71
Compare
fca920e
to
84f781c
Compare
superseded by #13017 |
What does this PR do?
This PR is motivated by having a way of re-using
LogStash::Event.from_json
withtarget => [namespace]
(e.g. in the JSON codec). We benchmarked the genericLogStash::Json
parsing (which has JRuby parsing hooks) and found it noticeably slower compared to the Java nativeEvent.fromJson
.A clean way forward would be to introduce a notion of an
EventFactory
with anewEvent
(new_event
) interface.The factory is meant to be later re-used/extended with ECS compatibility (
new_event
behaving slightly differently in legacy vs ecs mode), this is out-of-scope here, still something to keep in mind.Checklist
[ ] I have commented my code, particularly in hard-to-understand areas[ ] I have made corresponding changes to the documentation[ ] I have made corresponding change to the default configuration files (and/or docker env variables)How to test this PR locally
Related
follow-up adapter mixin: logstash-plugins/logstash-mixin-event_support#2