-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track Task callers in process dictionary #7995
Comments
@fishcakez I would love your opinion on this one. |
Wouldn't there be only a single caller for any process? In that case |
I can add our use case in logging that I feel can be improved.. In order for this to work for us we would need to implement Other way how we use process dictionary is for concurrent testing with mocs. Our HTTP clients read endpoint URL which is replaced with Both of them can be solved if we would start propagating some process dictionary keys automatically to spawned processes, but I'm not sure this is a good idea because of overhead and somewhat ambiguous behavior (you still can spawn process by calling Erlang function). |
@lackac excellent question. We may have a process that starts a task that starts another task. We would like to track this all the way, hence |
I have created this example to illustrate what would (I think) be the content of the self()
#=(1)> #PID<0.2.0>
Process.get(:"$ancestors")
#=(2)> [#PID<0.1.0>]
Process.get(:"$callers")
#=(3)> [#PID<0.1.0>]
Task.async(fn ->
self()
#=(4)> #PID<0.3.0>
Process.get(:"$ancestors")
#=(5)> [#PID<0.2.0>, PID<0.1.0>]
Process.get(:"$callers")
#=(6)> [#PID<0.2.0>, PID<0.1.0>]
Task.Supervisor.async(MySup, fn ->
self()
#=(7)> #PID<0.4.0>
Process.get(:"$callers")
#=(8)> [#PID<0.3.0>, #PID<0.2.0>, PID<0.1.0>]
Task.async(fn ->
self()
#=(9)> #PID<0.5.0>
Process.get(:"$callers")
#=(10)> [#PID<0.4.0>, #PID<0.3.0>, #PID<0.2.0>, PID<0.1.0>]
Task.Supervisor.async(MySup, fn ->
self()
#=(11)> #PID<0.6.0>
Process.get(:"$callers")
#=(12)> [#PID<0.5.0>, #PID<0.4.0>, #PID<0.3.0>, #PID<0.2.0>, PID<0.1.0>]
end)
end)
end)
end) I have some doubts regarding (3), because I'm not sure if the |
(3) should work but at the top level one likely won't UNLESS it was started as Task. |
I like this idea. I think the question is - should this integrate with logger metadata? |
@michalmuskala for now I would say no, or at least discuss it elsewhere, but we at least know this makes it possible. |
Is it possible to have a memory leak with Note that $ atoms in OTP use the $ sign to indicate that the atom itself and any associated term is for internal use only. |
@fishcakez very good point. I see two options:
I would note that proc_lib has the same memory leak, so probably not an issue? |
I think this approach won't work as well because we need to traverse up the callers stack to the first pid matching. It's possible that first or last caller may not be the caller we want to associate with. For example if an async task performs the ownership checkout in Ecto's SQL sandbox, we don't want to get the sandbox for its parent.
So lets go for this one. |
Is someone working on this? I would like to take a jab at it if you can point me in the right direction! |
@sheharyarn no, please go ahead! |
@josevalim Great! Any pointers/guidance on how and where I should get started? |
@sheharyarn Without having looked at the code, I believe the way this would work is that you'd:
|
I am working on this! |
This allows us to track which process invoked the Task besides its ancestors in the supervision tree. Closes #7995
Closing in favor of #7995. |
This allows us to track which process invoked the Task besides its ancestors in the supervision tree. Closes #7995
Elixir tasks makes it really straight-forward for developers to execute chunks of code concurrently. This is usually done with
Task.async/1
:However, in practice, developers should prefer to use
Task.Supervisor.async
, as the supervision tree gives more visibility over spawned tasks:Unfortunately, moving to
Task.Supervisor
has one big downside: we lose the relationship between the process who started the task and the task itself. That's because in the first example, the parent of the task is the caller process, but in the second example, the parent is the supervisor.This means instrumentation/monitoring/logging tools and even testing tools like Mox and the Ecto's SQL Sandbox are incapable of detecting the relationship between them.
I propose that we add a new field to the Task process dictionary, called
$callers
, that tracks exactly the caller of the task. This will allow tools to properly build the relationship between processes and allow testing tools that rely on ownership mechanisms to propagate those effectively without requiring tests to run synchronously.The text was updated successfully, but these errors were encountered: