Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spandex_phoenix does not support several phoenix 1.5 sub-projects in the umbrella project #36

Open
ayrat555 opened this issue Aug 6, 2020 · 10 comments
Labels
good first issue Good for newcomers

Comments

@ayrat555
Copy link

ayrat555 commented Aug 6, 2020

I have two phoenix 1.5 sub-projects in the umbrella application. I added Plug.Telemetry into both of Endpoints.

defmodule App1.Endpoint do
  use Phoenix.Endpoint, otp_app: :app1

  plug(Plug.Telemetry, event_prefix: [:app1, :endpoint])
  ...
end
defmodule App2.Endpoint do
  use Phoenix.Endpoint, otp_app: :app2

  plug(Plug.Telemetry, event_prefix: [:app2, :endpoint])
...
end

Then I added SpandexPhoenix.Telemetry.install to both of these applications

    :ok =
      SpandexPhoenix.Telemetry.install(
        endpoint_telemetry_prefix: [:app1, :endpoint],
        tracer: App1.Tracer
      )
    :ok =
      SpandexPhoenix.Telemetry.install(
        endpoint_telemetry_prefix: [:app2, :endpoint],
        tracer: App2.Tracer
      )

One of them always fails with match error {:error, :already_exists} . It's an error from telemetry - https://hexdocs.pm/telemetry/telemetry.html which means the handler_id is already taken.

I suggest appending endpoint prefix to both spandex-router-telemetry and spandex-endpoint-telemetry handler ids - https://github.com/spandex-project/spandex_phoenix/blob/master/lib/spandex_phoenix/telemetry.ex#L95.

@zachdaniel zachdaniel added the good first issue Good for newcomers label Sep 11, 2020
@zachdaniel
Copy link
Member

This change would ultimately end up breaking any existing setups, so we probably want to make it a configuration that defaults to false. Anyone who wants to tackle this is welcome.

@novaugust
Copy link
Contributor

novaugust commented Sep 11, 2020

@zachdaniel it's actually a safe, non breaking change to make the telemetry listeners dynamic. but it doesn't solve the problem that all phx installs broadcast on the same keys, and so doing two installs on those would be bad. (in other words, fixing the endpoint listener to be dynamic would just kick the can to the next line, which is the router listeners)

we'd have to change the api for installing prefixes to have a separate, optional installation for adding extra endpoints listeners (without also adding more phx listeners) :/ suboptimal, but it's just the fallout of allowing users to have dynamic telemetry event names at the Plug.Telemetry level (problematic design leads to problems, sort of thing)

@zachdaniel
Copy link
Member

zachdaniel commented Sep 11, 2020

Hmm... so I get what you're saying about making the telemetry listeners dynamic.

As for the router, you're saying there isn't a way to add an event prefix to the router events?

@novaugust
Copy link
Contributor

so the problem is:

  • endpoints can each have unique event prefixes (it's just Plug.Telemetry, prefix: [:my_app, :my_endpoint])
  • routers do not have configurable prefixes (this is a good design for applications. configurable prefixes is problematic)
  • when we install our telemetry listeners, we install them for
    • endpoint (varies)
    • router (always the same)

thus if someone installs twice, once for two different endpoints, the start/stop traces will behave normally (that's done in endpoint events), but spans will get doubled up - we'll start a span and immediately start the same span, because we installed 2 listeners on the same router events. (spans are handled in the router).

@zachdaniel
Copy link
Member

So this sounds essentially like phoenix doesn't support telemetry consumers for multiple routers then...

@novaugust
Copy link
Contributor

So this sounds essentially like phoenix doesn't support telemetry consumers for multiple routers then...

Yeah, ideally endpoints and routers always emit the exact same events wrt name, and identify themselves via a metadata field.
I think we can get the router by introspecting on the conn though, so it should be doable.

FYI this is something on my to-do list, as we similarly have 2 endpoints running on a monolith that therefore use the same spandex server. Hopefully have a proof of concept by EOM

@novaugust
Copy link
Contributor

Some brainstorming, I think this'll warrant a new function, SpandexPhoenix.Telemetry.install_many/1, where you can essentially pass options you'd pass to install() multiple times, once for each endpoint you want traced.

I don't think making it a new option in install/1 is viable because people with multiple endpoints will doubtless want to make each its own service or give some a different tracer than others. That requires bundling together not only the endpoint's prefix but also all the other options...

The only unknown i have on installing different tracers for different endpoints is tracking down which endpoint each phx router dispatch event is being called for, and using the corresponding arguments. Hopefully there's something in the event's metadata i can do introspection on to match things up. If not, I'll be a bit flummoxed.

@mruoss
Copy link

mruoss commented Feb 11, 2021

I came across the same issue today. I run multiple endpoints in the same app and would like to tag the span with a different service for each of them. My current thinking to work around the issue is to create a plug which assigns the desired service name to the Plug.Conn.private field and then use the :customize_metadata callback to read that value and set the service accordingly. Any thoughts?

@GregMefford
Copy link
Member

So catching up on this thread, it looks like there are two problems:

  1. You can't actually call SpandexPhoenix.Telemetry.install/1 more than once, even with different options because this line and this line are
    registering their handlers using static values for the handler IDs. This should be relatively easy to solve because we could just use the event_prefix as part of the handler ID so that they don't collide.

  2. The remaining problem if you do step 1 is that then we'd end up registering the same [:phoenix, :router_dispatch, *] events to both handlers, so they would both receive all of the events. I just spun up a test project and confirmed that Phoenix does insert the Endpoint in the conn.private.phoenix_endpoint, in both the Plug.Telemetry and :router_dispatch events, so it might be possible to combine that somehow with config options to determine which one should go with which. I can sort of see a path forward here, but to be honest, I don't think I personally have the energy to pursue it, because at work we don't even use spandex_phoenix anymore, and we just have some internal Telemetry handlers that accomplish the same thing, giving us full control over how we want to handle each event and call Spandex ourselves. I realize that this is not helpful to the community and I'd like to work on an open-source solution that will be more helpful, but that's my current situation right now. 😅

I think what you're describing @mruoss could work as long as you work around issue 1 above. I think you could do that without any changes in Spandex if it does solve your problem. It probably wouldn't be too hard, it's just more work than you should need to do if Spandex could do it somehow for you. That will get you a solution more quickly, but if someone has the time and energy to come up with a PR that enables this, I will be happy to help test it and get it released for others to use.

@mruoss
Copy link

mruoss commented Feb 22, 2021

Hi @GregMefford

I have implemented my approach and it works. Issue 1 above is not really an issue in my case as I'm calling SpandexPhoenix.Telemetry.install/1 only once as follows:

    SpandexPhoenix.Telemetry.install(
      customize_metadata: fn conn ->
        service = Map.get(conn.private, :service, :elixir)

        conn
        |> SpandexPhoenix.default_metadata()
        |> Keyword.put(:service, service)
      end
    )

And then as described above, use a plug in the endpoint declaration to set the service using Plug.Conn.put_private/3.

So this is a working workaround, but as you say, I guess the library should handle it out of the box somehow. But I don't see that solution either at this moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

5 participants