Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add the ability to query for windows processes by Name and by cmdline #468

Closed
wants to merge 2 commits into from

Conversation

vrecan
Copy link
Contributor

@vrecan vrecan commented Dec 10, 2017

I need the ability to query for processes on windows through WMI by command or by process name. getting all Processes and then filtering takes over 6 seconds on my i7 7700k 32gb system. When I change it to filter through the WMI query I get the processes back in 0.04s.

@vrecan
Copy link
Contributor Author

vrecan commented Dec 10, 2017

this is needed to add windows support to telegraf for procstats. influxdata/telegraf/issues/3558

Let me know if there is anything else you would like me to do to get this merged!

@vrecan
Copy link
Contributor Author

vrecan commented Dec 12, 2017

Would you be ok if I added a test file for windows for process stats? I have this working through telegraf with tests inside it but I would prefer to have tests in here as well.

@shirou
Copy link
Owner

shirou commented Dec 14, 2017

I'm sorry but this is only for Windows. gopsutil aims to be multiplatform, this can not accepted. Or, if you implement for other platform especially linux, freebsd, and darwin, I will merge.

anyway, thank you for your contribution!

@vrecan
Copy link
Contributor Author

vrecan commented Dec 14, 2017

That doesn't make much sense to me. The function that is already there GetWin32Proc() doesn't exist in linux either but it's in gopsutil? This is just a convenience function that massively improves performance when querying for procs by name or cmdLine. Using this function it takes around 0.04 seconds without it, I have to emulate it using Process() and then iterate over all of the names or command lines. On my system that takes 6 Seconds!

What can I do to get something like this in? Six seconds is not reasonable for my problem.

I am open to idea's but I am stuck here... How can we improve performance for these use cases?

@shirou
Copy link
Owner

shirou commented Dec 14, 2017

GetWin32Proc() should not be exposed. I'll change it to getWin32Proc() later.

@Lomanic
Copy link
Collaborator

Lomanic commented Dec 14, 2017

I am open to idea's but I am stuck here... How can we improve performance for these use cases?

Solution is to get rid of the slow WMI calls in gopsutil in Process.Cmdline() (and other functions), for example by porting python's psutil C to Go syscalls to use Windows API. The solution to the slow WMI calls is not to add more (specialized) WMI calls. It is not wise to add X Windows-only public functions for each property accessible by WMI… where downstream you can simply use github.com/stackexchange/wmi if you want this kind of request against WMI.

process.Name() already uses windows API and is relatively fast since #456, see benchmark below.

package main

import (
	"github.com/shirou/gopsutil/process"
	"log"
	"time"
)

func main() {
	start := time.Now()
	procs, err := process.Processes()
	if err != nil {
		log.Fatalln("Could not read processes,", err)
	}
	for _, proc := range procs {
		name, err := proc.Name()
		if err != nil {
			log.Fatalln("Could not read process name,", err)
		}
		_ = name
	}
	end := time.Now()
	log.Println("It took", end.Sub(start), "to get", len(procs), "results,", end.Sub(start) / time.Duration(len(procs)), "per process")
}

2017/12/14 12:17:40 It took 858.5184ms to get 71 results, 12.091808ms per process (approx 10ms / proc).

@vrecan
Copy link
Contributor Author

vrecan commented Dec 15, 2017

Thanks for the responses! I can't use cgo because the telegraf maintainer says they don't build it with cgo enabled. Maybe I can do this with just the WMI api that you have without making any changes to gopsutil.

Thanks for your help!

@danielnelson
Copy link
Contributor

Correct me if I'm wrong, but I don't think the example code uses Cgo. That said, 858ms is a fair chunk of time, and by way of comparison on linux I can run the same code to get:

2017/12/15 11:44:39 It took 6.941232ms to get 225 results, 30.849µs per process

@Lomanic If we wanted to optimize search by name/cmdline/user downstream in Telegraf, would you recommend then recommend the specialized WMI approach?

@vrecan
Copy link
Contributor Author

vrecan commented Dec 18, 2017

If it's not using CGO then I would prefer this approach as well. If it's fast enough to get the info for all procs then I can do the filtering without the WMI call. I would prefer to not use WMI at all because I have found it extremely flaky, especially when certain antivirus programs are running (carbon black). They all tend to hammer WMI causing strange issues on systems.

I do still need the flexibility to get procs by cmdLine. It's the easiest way to pull out java, node, python processes without them all just saying "java.exe"

@Lomanic
Copy link
Collaborator

Lomanic commented Dec 18, 2017

@vrecan @danielnelson, sorry for the misunderstanding, I'm not talking about using cgo in gopsutil that @shirou wants to keep pure go (with some OS-specific exceptions) with which I agree, but looking at how python psutil does things to port them to windows syscalls (like some functions that do not rely on WMI currently, such as process.Name()) in gopsutil.

Let me clarify my intention, I disagree about adding these two functions to gopsutil for the following reasons:

  1. they are Windows-only where gopsutil aims to be be platform-agnostic
  2. they don't solve the root problem of performance in process.Cmdline() in Windows (the slow individual WMI calls)
  3. this new public interface is another liability where I think we should remove the parts where WMI is used to get informations on individual processes on Windows which are really slow.

@danielnelson, you are comparing two completely different codebases here (Windows: syscalls to Windows system DLLs and Linux that mostly reads in /proc) on two completely different machines, but I agree that Linux by its simple approach of reading processes stats from files does not suffer from the overheads of the Windows API calls. I don't think there is much to do in gopsutil on Windows unfortunately (using undocumented and subject-to-change-in-the-future PEB is sub-optimal for the very reason that it is undocumented and subject-to-change-in-the-future).

Indeed, to optimize downstream in Telegraf, as (as I understand it) you already have Windows-specific code, I think you should filter processes by commandline/name (a benchmark might be needed to compare against a loop like my previous example or a single WMI query to filter by process name) via WMI, directly using StackExchange pkg, which is what this PR does, instead of adding these helpers in gopsutil for the reasons stated in this message.

@danielnelson
Copy link
Contributor

@Lomanic Thanks for the advice, and we have a good point to add this functionality to Telegraf itself so it's no problem. @vrecan In light of this, I guess we can close this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants