Add location fields to root of config #8620

Closed
wants to merge 3 commits into from

andrewvc
Contributor

@andrewvc andrewvc commented Oct 15, 2018

What

This WIP PR improves the experience of monitoring beats across multiple geographic and/or network boundaries by adding two new fields that let users determine where a check was made from.

Initially, I was only going to scope these changes to heartbeat, but it makes more sense from a product and a code perspective to scope these to libbeat. Every beat can benefit from these fields. The key benefit here is that a user can use this info to determine where problems may be occurring.

The new conf/event fields are:

  1. location: string: A user-defined keyword that has significance to them. Say, london, network_dmz, or datacenter_a.
  2. geo.location: geo_point: Used to tag the monitoring event with an actual geo point.

So, one could have a heartbeat config like the following:

heartbeat.monitors:
- type: http
  urls: ["http://localhost:9200"]

location: minneapolis
geo.location: "44.8897,-93.3500"

Why

Consider the following use case. A user has multiple data centers running various beats. If a network partition occurs, say because a carrier link drops, one data center may stop sending data. The information here would make that apparent, either through a terms aggregation on the location field or on a map using geo.location in Kibana.

In another scenario, a user might want to compare the relative throughput in terms of log lines from filebeats in one datacenter vs. another.

Note that we only set geo.location, not the other ECS geo fields like City. My worry there is that users would need to ensure they input consistent info such as city names. That feels like something for a subsequent iteration to solve. The coordinates are the most important part.

Why We Need An Explicit Field

Users could accomplish the same thing in an ad-hoc way using fields and tags. However, this is such a common problem that we benefit from having an explicit field. Having explicit fields means that this value is consistent across our products, and can be used in dashboards and other common tools.

TODOs (if the general concept is approved):

  • Allow map formatted geopoints
  • Make the publisher code that sets the geo point values more optimized (we might be able to create fewer objects per event)
  • Add tests
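A rough sketch of what "map formatted geopoints" could mean in practice: accept the value either as a "lat,lon" string or as a map with lat and lon keys, and normalize both to the string form. The names normalizeGeoPoint and toFloat are hypothetical and are not part of libbeat; the map keys are an assumption modeled on the Elasticsearch geo_point object form.

```go
package main

import (
	"fmt"
	"strconv"
)

// normalizeGeoPoint is a hypothetical helper (not libbeat API). It accepts a
// geo point given either as a "lat,lon" string or as a map with "lat" and
// "lon" keys, and returns the string form.
func normalizeGeoPoint(v interface{}) (string, error) {
	switch p := v.(type) {
	case string:
		// Already in string form; range validation is out of scope here.
		return p, nil
	case map[string]interface{}:
		lat, err := toFloat(p["lat"])
		if err != nil {
			return "", fmt.Errorf("lat: %v", err)
		}
		lon, err := toFloat(p["lon"])
		if err != nil {
			return "", fmt.Errorf("lon: %v", err)
		}
		return strconv.FormatFloat(lat, 'f', -1, 64) + "," +
			strconv.FormatFloat(lon, 'f', -1, 64), nil
	default:
		return "", fmt.Errorf("unsupported geo point type %T", v)
	}
}

// toFloat converts the numeric types a decoded config value is assumed to
// carry in this sketch; anything else (including a missing key) is an error.
func toFloat(v interface{}) (float64, error) {
	switch n := v.(type) {
	case float64:
		return n, nil
	case int:
		return float64(n), nil
	default:
		return 0, fmt.Errorf("expected a number, got %T", v)
	}
}

func main() {
	asMap := map[string]interface{}{"lat": 44.8897, "lon": -93.35}
	fmt.Println(normalizeGeoPoint(asMap))
	fmt.Println(normalizeGeoPoint("44.8897,-93.3500"))
	fmt.Println(normalizeGeoPoint(42))
}
```

Normalizing to a single representation early would keep the publisher code path the same regardless of how the user wrote the value.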

@andrewvc andrewvc added enhancement in progress Pull request is currently in progress. discuss Issue needs further discussion. libbeat Heartbeat labels Oct 15, 2018
@andrewvc andrewvc requested review from urso and ruflin October 15, 2018 20:58
@ruflin
Contributor

ruflin commented Oct 16, 2018

Perhaps we can group these configs under agent.* to not have too many top level entries. This also makes it obvious that it's the location of the agent. Also in the event I don't think it should go on the top level directly but potentially under agent.*. All the info we have under beat.* at the moment will end up there.

It's not clear to me yet how location will be used. Is this a description of the geo.location?

@andrewvc
Contributor Author

@ruflin the location field is intended to be somewhat vague, it is up to the user to determine what makes sense. I considered titling this vantage, because it's really about the POV of the beat, but I decided that was too obscure a word.

I tried to explain this above, but it must not have been as clear as I thought. So, some names could be:

  1. Based on city, if you have DCs in different cities, e.g. boston
  2. Based on datacenter name + loc e.g. equinix-boston
  3. Based on network location e.g. prod-subnet-infra

It's up to the user what's useful for segmenting this beat.

geo.location is an ECS standard field that would be used for things like maps.

@andrewvc
Contributor Author

Can you elaborate on what you mean by having too much stuff under the root? What are the criteria for too much, and what was the process behind choosing agent as the place to nest this?

Why would we want to nest new options under an arbitrary key like agent? If there are too many top level options wouldn't we want to create a consistent system of assigning hierarchy instead of just putting new stuff under a special key?

@ruflin
Contributor

ruflin commented Oct 16, 2018

I hope it's not arbitrary. I asked myself which location this location field describes. I think it's where the agent is running, in this case the Beat. An event itself can also contain location information, but that could be the origin or the destination of the event.

In ECS I would see the data going under agent: https://github.com/elastic/ecs#agent and would think because of this also config under agent would make sense.

@andrewvc
Contributor Author

andrewvc commented Oct 16, 2018 via email

@andrewvc
Contributor Author

@ruflin will we eventually be moving the beat.* fields to agent.* (cc @urso) ? Looking at the ECS spec that seems to be a thing perhaps?

It'll be awkward to have this field in agent, but those elsewhere. Maybe that could be a blocker for this field?

@ruflin
Contributor

ruflin commented Oct 17, 2018

The beat.* fields will be moved under agent.* for 7.0, I plan to start this work next week.

@andrewvc
Contributor Author

@ruflin given that, and the fact that there isn't a burning need to get this into 6.x, maybe I should just target 7.0 with this patch.

I think it will be awkward to have things spread across beat and agent.

@ruflin
Contributor

ruflin commented Oct 17, 2018

Agree, 7.0 should be good. If we see that we need it in 6.6 for some reason, we can still figure out the details. I still need a bit of time to think about the location field TBH.

@andrewvc
Contributor Author

@ruflin if you're having doubts about whether we should add this feature, let me table work here for now then. Please post your thoughts when you have a chance.

@andrewvc
Contributor Author

andrewvc commented Nov 8, 2018

@ruflin ping!

@andrewvc andrewvc mentioned this pull request Nov 9, 2018
7 tasks
@urso

urso commented Nov 16, 2018

I totally see value in having location information available in the UI, but I think I'd prefer a processor (we already enable a few processors by default). Long term I'd also prefer to move the tags and fields configs as they exist today into explicit processors as well.

As users put in the geolocation manually, we should provide some additional validation.
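As an illustration of the kind of validation meant here, a minimal sketch that checks a manually entered "lat,lon" string. The function name parseGeoPoint is hypothetical and not part of libbeat; the only assumptions are the "lat,lon" string order used in the config example in this PR and the usual coordinate ranges.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseGeoPoint is a hypothetical helper (not part of libbeat). It parses a
// "lat,lon" string and rejects values outside the valid coordinate ranges.
func parseGeoPoint(s string) (lat, lon float64, err error) {
	parts := strings.Split(s, ",")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("geo point %q: expected \"lat,lon\"", s)
	}
	lat, err = strconv.ParseFloat(strings.TrimSpace(parts[0]), 64)
	if err != nil {
		return 0, 0, fmt.Errorf("geo point %q: invalid latitude: %v", s, err)
	}
	lon, err = strconv.ParseFloat(strings.TrimSpace(parts[1]), 64)
	if err != nil {
		return 0, 0, fmt.Errorf("geo point %q: invalid longitude: %v", s, err)
	}
	// The negated range checks also reject NaN, which ParseFloat accepts.
	if !(lat >= -90 && lat <= 90) {
		return 0, 0, fmt.Errorf("geo point %q: latitude out of range [-90, 90]", s)
	}
	if !(lon >= -180 && lon <= 180) {
		return 0, 0, fmt.Errorf("geo point %q: longitude out of range [-180, 180]", s)
	}
	return lat, lon, nil
}

func main() {
	inputs := []string{"44.8897,-93.3500", "-93.3500,44.8897", "minneapolis"}
	for _, in := range inputs {
		if _, _, err := parseGeoPoint(in); err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Println("ok:", in)
	}
}
```

A range check like this also catches one likely manual mistake, swapping latitude and longitude, whenever the longitude falls outside [-90, 90].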

Ultimately users can use fields for this kind of constant information. These are constant settings, and we do not acquire this information from somewhere. Do we really need to introduce a new setting/processor, or should we rather provide documentation or a blog post on good practices for different kinds of heartbeat deployment/usage scenarios?

@ruflin
Contributor

ruflin commented Nov 19, 2018

I like the idea of introducing a processor like add_location_metadata. This will allow the users to use it either globally or locally as part of each monitoring / prospector / module if only some should be enriched.

@andrewvc andrewvc added the Team:obs-ds-hosted-services Label for the Observability Hosted Services team label Nov 26, 2018
@urso

urso commented Nov 27, 2018

+1 on processor, but only if this information can be acquired in an automated fashion. Otherwise it's just another add_fields processor, only with a limited set of allowed fields.

@andrewvc
Contributor Author

andrewvc commented Nov 27, 2018 via email

@ruflin
Contributor

ruflin commented Nov 29, 2018

The geo location we add is about the agent (beat). Thinking of ECS the information should be stored under agent:

agent.geo.location: "44.8897,-93.3500"
agent.geo.name: "Zurich"

Writing this I realised we don't have a name field which can be user configured in geo yet but it would fit well with other name fields we introduced recently.

I think we should introduce a processor for it to ensure the fields always follow ECS, rather than leaving it to the user to define where the fields go. Also, in the future I see potential to automatically set a basic location based on time zone or some server information. But I would start with manual configuration.
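To make the idea concrete, a minimal sketch of what such a processor could do with manually configured values. It uses a plain nested map in place of a real Beats event, and the names event, geoConfig, and addAgentGeo are hypothetical; this is not the libbeat processor interface, only an illustration of always writing to the agent.geo.* paths proposed above.

```go
package main

import "fmt"

// event stands in for a Beats event in this sketch; libbeat has its own types.
type event map[string]interface{}

// geoConfig holds the manually configured values (hypothetical field names).
type geoConfig struct {
	Name     string // user-defined name, e.g. "Zurich"
	Location string // "lat,lon" string
}

// addAgentGeo writes the configured values under agent.geo, creating the
// nested maps when they are missing. Empty values are skipped. A non-map
// value already stored at "agent" or "agent.geo" is replaced.
func addAgentGeo(e event, cfg geoConfig) {
	agent, ok := e["agent"].(map[string]interface{})
	if !ok {
		agent = map[string]interface{}{}
		e["agent"] = agent
	}
	geo, ok := agent["geo"].(map[string]interface{})
	if !ok {
		geo = map[string]interface{}{}
		agent["geo"] = geo
	}
	if cfg.Name != "" {
		geo["name"] = cfg.Name
	}
	if cfg.Location != "" {
		geo["location"] = cfg.Location
	}
}

func main() {
	e := event{"agent": map[string]interface{}{"type": "heartbeat"}}
	addAgentGeo(e, geoConfig{Name: "Zurich", Location: "44.8897,-93.3500"})
	fmt.Println(e)
}
```

Because the target paths are fixed in code, the user only supplies values and cannot place them under a non-ECS key, which is the difference from a generic add_fields setup.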

@ruflin
Contributor

ruflin commented Nov 29, 2018

@andrewvc If we agree geo.name is an interesting field, we should open an issue for it in ECS. @webmat FYI

@webmat
Contributor

webmat commented Nov 29, 2018

@ruflin Not sure what should go in geo.name; it sounds too generic. We're definitely missing some fields under geo, but what's the gist of geo.name?

@ruflin
Contributor

ruflin commented Nov 29, 2018

@webmat As with host.name or device.name, it's a field for user-defined content / a user-defined name.

@webmat
Contributor

webmat commented Nov 29, 2018

Ok, that's what I thought. So kind of their preferred granularity. If they care about city level, it would be cities, if they care about country level it would be country names.

@urso

urso commented Nov 29, 2018

Other possible locations: building, floor, room, rack, slot ;)

@andrewvc
Contributor Author

andrewvc commented Nov 29, 2018

I love geo.name. I agree with @urso that it could be anything: us-east-1a, headquarters-dc-chicago, under the lazy Susan at the deli next door, etc.

+1 to making it an ECS field. I'll modify this PR to use that and add a processor.

@ruflin ruflin added Team:obs-ds-hosted-services Label for the Observability Hosted Services team and removed Team:obs-ds-hosted-services Label for the Observability Hosted Services team labels Dec 3, 2018
@elasticmachine
Collaborator

Pinging @elastic/uptime

@ruflin
Contributor

ruflin commented Dec 4, 2018

I also had in mind fields like @urso mentioned.

@andrewvc Could you also open an issue / PR in ECS?

@andrewvc
Contributor Author

andrewvc commented Dec 5, 2018

I've opened a new PR with the processor. Closing this PR in favor of #9392 . Let's continue the discussion there.
