Reading stats for all running processes is inefficient #413

conorbranagan · 2017-08-05T19:59:34Z

Hi,

We are using gopsutil as part of a tool that collects information about all running processes. With the existing public APIs in process_*, you often end up doing the same work several times to get different data. For example, calling p.Status() and p.UIDs() reads through /proc/$pid/status twice.

Right now we've forked and added an AllProcesses function (https://github.com/DataDog/gopsutil/blob/dd/process/process_linux.go#L794) that performs each proc file read once and then populates a FilledProcess. It's certainly not perfect, and ideally we'd like to stay in sync with upstream and make this work well in the normal case.

So the questions are: have you given any thought to this use case? Could AllProcesses be accepted upstream, or could we change the existing API so that it's possible to get all the fields we want with the fewest syscalls?

Thanks!


shirou · 2017-08-06T00:33:58Z

If a user wants to read just NumFD, getting the other information is wasted work, so I kept the calls separate.

However, I know this is not efficient for users like you who want all the information about a process. So I agree with your AllProcesses and will accept it if it is implemented on all of the supported platforms.

And how about introducing a cache mechanism for AllProcesses?

conorbranagan · 2017-08-06T00:50:32Z

Yes, you are definitely right that in most cases you're just getting a couple of bits of information and pulling everything does not make sense.

That's great news that you might accept AllProcesses upstream. With that in mind I'll work on cleaning up what we have and getting it to support other platforms (so far we just have Linux, Mac and FreeBSD) and see what you think. Thanks for the quick response!

shirou · 2017-08-07T12:50:12Z

I think a cache could attract other users to AllProcesses. Perhaps #286 might be helpful.

conorbranagan mentioned this issue Sep 20, 2017

Import docker util DataDog/datadog-agent#585

Merged

Lomanic added os:linux package:process labels Dec 20, 2017

Lomanic mentioned this issue Apr 28, 2018

add snapshot feature for collecting process info at a point in time #517

Closed

Lomanic mentioned this issue Mar 3, 2020

Processes() is extremely CPU intensive #842

Open

1 task

Lomanic added the performance label Oct 23, 2022
