-
Notifications
You must be signed in to change notification settings - Fork 17
Update the shipper to work with the elastic-agent #185
Update the shipper to work with the elastic-agent #185
Conversation
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
This pull request is now in conflicts. Could you fix it? 🙏
|
Here's the draft PR for the same changes in Fleet: elastic/kibana#145755 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass, still need to test this myself.
"github.com/elastic/elastic-agent-shipper/config" | ||
) | ||
|
||
// I am not net sure how this should work or what it should do, but we need to read from that error channel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should log the error, it's a way for the agent to report failures making gRPC calls to somewhere that can log them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also reading this I realized that Beats doesn't do this elastic/beats#34381
Okay, so, there's two ongoing issues related to this PR:
|
SetStrictMode is a development only option. We can consider removing it if it is adding complexity, or perhaps only support it when the config is loaded from a file if that is possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Next round of comments, many of them about testing.
controller/run.go
Outdated
@@ -31,15 +43,27 @@ func LoadAndRun() error { | |||
} | |||
|
|||
// RunUnmanaged runs the shipper out of a local config file without using the control protocol. | |||
func RunUnmanaged(cfg config.ShipperConfig) error { | |||
func RunUnmanaged(cfg config.ShipperRootConfig) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any tests for running in unmanaged mode? What is enforcing that it keeps working?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yah, I've been deprioritizing it a bit, since it took me so long to just get the shipper talking to other components. Will see if I can add a test or two, at least...
@@ -408,3 +405,7 @@ func (p *publisherMock) TryPublish(event *messages.Event) (queue.EntryID, error) | |||
func (p *publisherMock) PersistedIndex() (queue.EntryID, error) { | |||
return p.persistedIndex, nil | |||
} | |||
|
|||
func (p *publisherMock) IsInitialized() bool { | |||
return true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should probably be a test of the case where this is false
.
Okay, updating this as I go, in case more people want to look at it between now and tomorrow. Biggest thing that still needs to be done is expanding the tests. |
Okay, added a bunch of tests. @cmacknz going through and verifying agent behavior, then writing tests, is proving to be a bit of a laborious process, if we're worried about blocking the performance testing, we might want to decide what cases we want to be handled here, and what we want to add in a followup PR. |
// figure out if we need to initialize the gRPC endpoint | ||
// TODO: this is another thing that will change with https://github.com/elastic/elastic-agent-shipper/issues/225, | ||
// As we'll have a dedicated unit for the input, and we won't be using random input unit updates to see if we need to update the gRPC endpoint. | ||
c.startgRPC(unit, conn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They could change over the lifetime of the shipper. If for example a CA or server certificate expires and needs to be updated. But I think in general, both the server side and output side should expect to be restarted independently.
I would rather only do the performance tests once we are convinced this implementation is correct. |
Alright, we should be covering all the scenarios we want in testing now. Going to do a few more tests against actual elastic-agent tomorrow, just to make sure I didn't miss anything. |
Alright, tested with elastic-agent and flipping various units on and off. With all the bugs in the shipper output/ES components, actual end-to-end testing is sort of a non-starter, but most of the fixes with that are in-progress. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of mostly minor comments, this is looking good.
What does this PR do?
This closes #161 and updates the shipper's V2 client code to work with the elastic-agent. There's a number of missing pieces:
Limit of mapping depth [20] has been exceeded due to object field [fields.data.system.Kind.StructValue.data.cpu.Kind.StructValue.data.idle.Kind.StructValue.data.norm.Kind.StructValue.data.pct.Kind]
Because of the above issues, this code is somewhat untested. I would like to expand the available integration tests, but for now, the code is testable, with the note that some beat events will not be sent to the shipper due to elastic/beats#34319.
How to Test
In order to test, you can run elastic-agent in standalone mode with an output config like this:
You may also need to place the
shipper
binary andshipper.spec.yml
file under thecomponents/
subdirectory.Other open Questions: