Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track Task callers in process dictionary #7995

Closed
josevalim opened this issue Jul 26, 2018 · 17 comments
Closed

Track Task callers in process dictionary #7995

josevalim opened this issue Jul 26, 2018 · 17 comments

Comments

@josevalim
Copy link
Member

Elixir tasks makes it really straight-forward for developers to execute chunks of code concurrently. This is usually done with Task.async/1:

Task.async(fn ->
  ...
end) |> Task.await()

However, in practice, developers should prefer to use Task.Supervisor.async, as the supervision tree gives more visibility over spawned tasks:

Task.Supervisor.async(MySup, fn ->
  ...
end) |> Task.await()

Unfortunately, moving to Task.Supervisor has one big downside: we lose the relationship between the process who started the task and the task itself. That's because in the first example, the parent of the task is the caller process, but in the second example, the parent is the supervisor.

This means instrumentation/monitoring/logging tools and even testing tools like Mox and the Ecto's SQL Sandbox are incapable of detecting the relationship between them.

I propose that we add a new field to the Task process dictionary, called $callers, that tracks exactly the caller of the task. This will allow tools to properly build the relationship between processes and allow testing tools that rely on ownership mechanisms to propagate those effectively without requiring tests to run synchronously.

@josevalim
Copy link
Member Author

@fishcakez I would love your opinion on this one.

@lackac
Copy link
Contributor

lackac commented Jul 26, 2018

Wouldn't there be only a single caller for any process? In that case $caller might be a better dictionary key. Apologies if I'm missing something or if this is bike-shedding territory. :)

@AndrewDryga
Copy link
Contributor

AndrewDryga commented Jul 26, 2018

I can add our use case in logging that I feel can be improved.. In order for this to work for us we would need to implement Process.get(process, key, default). The use case is following - we would like all tasks spawned from a single HTTP request to include request_id in their logs. For doing so we capture metadata (via Logger.metadata/0) and set it inside task (via Logger.metadata(metadata). We might cheat and take $callers from spawned task to figure out what caller request_id is, but currently, that's not possible.

Other way how we use process dictionary is for concurrent testing with mocs. Our HTTP clients read endpoint URL which is replaced with bypass endpoints in some tests, doing so we don't rely on global application state (eg. Application environmnet) and can run those tests concurrently.

Both of them can be solved if we would start propagating some process dictionary keys automatically to spawned processes, but I'm not sure this is a good idea because of overhead and somewhat ambiguous behavior (you still can spawn process by calling Erlang function).

@josevalim
Copy link
Member Author

@lackac excellent question. We may have a process that starts a task that starts another task. We would like to track this all the way, hence $callers. The dollar sign is to mirror the current $ancestors. Note that we cannot use $caller to get the direct caller and access the direct caller to get its direct caller because in some cases, such as in Task.start and in Task.async_nolink, the direct caller may be dead, which means this information would be lost.

@fertapric
Copy link
Member

fertapric commented Jul 27, 2018

I have created this example to illustrate what would (I think) be the content of the $callers key (hope it helps):

self() 
#=(1)> #PID<0.2.0>

Process.get(:"$ancestors")  
#=(2)> [#PID<0.1.0>]

Process.get(:"$callers")  
#=(3)> [#PID<0.1.0>]

Task.async(fn ->
  self() 
  #=(4)> #PID<0.3.0>

  Process.get(:"$ancestors")  
  #=(5)> [#PID<0.2.0>, PID<0.1.0>]

  Process.get(:"$callers")  
  #=(6)> [#PID<0.2.0>, PID<0.1.0>]

  Task.Supervisor.async(MySup, fn ->
    self() 
    #=(7)> #PID<0.4.0>

    Process.get(:"$callers")  
    #=(8)> [#PID<0.3.0>, #PID<0.2.0>, PID<0.1.0>]

    Task.async(fn ->
      self() 
      #=(9)> #PID<0.5.0>

      Process.get(:"$callers")  
      #=(10)> [#PID<0.4.0>, #PID<0.3.0>, #PID<0.2.0>, PID<0.1.0>]

      Task.Supervisor.async(MySup, fn ->
        self() 
        #=(11)> #PID<0.6.0>

        Process.get(:"$callers")  
        #=(12)> [#PID<0.5.0>, #PID<0.4.0>, #PID<0.3.0>, #PID<0.2.0>, PID<0.1.0>]
      end)
    end)
  end)
end)

I have some doubts regarding (3), because I'm not sure if the $callers key will be populated at that level. Maybe the first $callers with results would be (6)?

@josevalim
Copy link
Member Author

(3) should work but at the top level one likely won't UNLESS it was started as Task.

@michalmuskala
Copy link
Member

I like this idea. I think the question is - should this integrate with logger metadata?

@josevalim
Copy link
Member Author

@michalmuskala for now I would say no, or at least discuss it elsewhere, but we at least know this makes it possible.

@fishcakez
Copy link
Member

Is it possible to have a memory leak with $callers? It has a subtle difference with $ancestors because if an ancestor exits then the descendants must exit according to OTP. However these are tasks so Task.async would leak $ancestors in the same case right now, as a Task is not aware of its parent / doesn't have a receive loop.

Note that $ atoms in OTP use the $ sign to indicate that the atom itself and any associated term is for internal use only.

@josevalim
Copy link
Member Author

@fishcakez very good point. I see two options:

  1. Go ahead anyway since this issue exists for ancestors too

  2. Only track $caller and make sure we reset ancestors to the most immediate parent process unless running inside a supervisor

I would note that proc_lib has the same memory leak, so probably not an issue?

@fishcakez
Copy link
Member

Only track $caller and make sure we reset ancestors to the most immediate parent process unless running inside a supervisor

I think this approach won't work as well because we need to traverse up the callers stack to the first pid matching. It's possible that first or last caller may not be the caller we want to associate with.

For example if an async task performs the ownership checkout in Ecto's SQL sandbox, we don't want to get the sandbox for its parent.

Go ahead anyway since this issue exists for ancestors too

So lets go for this one.

@josevalim josevalim added this to the v1.8.0 milestone Oct 3, 2018
@sheharyarn
Copy link

Is someone working on this? I would like to take a jab at it if you can point me in the right direction!

@josevalim
Copy link
Member Author

@sheharyarn no, please go ahead!

@sheharyarn
Copy link

@josevalim Great! Any pointers/guidance on how and where I should get started?

@GregMefford
Copy link
Contributor

@sheharyarn Without having looked at the code, I believe the way this would work is that you'd:

  1. find the place in the code where Task.async and Task.Supervisor.Async get to the point of spawning a new process.
  2. Right before that happens, grab the calling process's $callers variable from its process dictionary using Process.get
  3. Prepend self() to the list
  4. In the new spawned Task process, use Process.set to set the $callers variable in the child process.

@josevalim
Copy link
Member Author

I am working on this!

josevalim pushed a commit that referenced this issue Dec 11, 2018
This allows us to track which process invoked the Task
besides its ancestors in the supervision tree.

Closes #7995
@josevalim
Copy link
Member Author

Closing in favor of #7995.

josevalim added a commit that referenced this issue Dec 11, 2018
This allows us to track which process invoked the Task
besides its ancestors in the supervision tree.

Closes #7995
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

8 participants