inputs.exec is not working,but test is working #4465

Closed
beyondht2003 opened this issue Jul 25, 2018 · 11 comments
Labels
discussion Topics for discussion

Comments

@beyondht2003

beyondht2003 commented Jul 25, 2018

Relevant telegraf.conf:

[[inputs.exec]]
commands = [
"/data/telegraf/top2influxdb.sh"
]
[Include Telegraf version, operating system name, and other relevant details]

Steps to reproduce:

[centos telegraf]$ telegraf --config-directory=/etc/telegraf --test --input-filter=exec
2018/07/25 14:48:58 I! Using config file: /etc/telegraf/telegraf.conf

  • Plugin: inputs.exec, Collection 1

multi_proc_mycollector,tag=tagvalue,host=centos pmbank_sc_cpu%=6,pmbank_sc_mem%=10.1 1532501339000000000

  • Plugin: inputs.exec, Collection 1

multi_proc_mycollector,tag=tagvalue,host=centos pmbank_sc_cpu%=0,pmbank_sc_mem%=10.1 1532501339000000000

2018-07-25T06:50:58Z E! Error in plugin [inputs.exec]: metric parsing error, reason: [missing field value], buffer: [multi_proc,tag=tagvalue pm_cpu%!=(MISSING),pm_mem%!=(MISSING)], index: [38]

write to influxdb failed!

Expected behavior:

write to influxdb

Actual behavior:

write to influxdb failed!

the script top2influxdb.sh:

pm_pid=`ps -ef|grep "pm "|grep -v grep|awk '{print $2}'`
top -n 1 -d 1 -b|grep java > total_top
pm_cpu_percent=`awk '$1 == "'$pm_pid'" { print $9}' total_top`
pm_mem_percent=`awk '$1 == "'$pm_pid'" { print $10}' total_top`

echo "multi_proc,tag=tagvalue pm_cpu%=$pm_cpu_percent,pm_mem%=$pm_mem_percent"

[centos telegraf]$ sh top2influxdb.sh
multi_proc,tag=tagvalue pm_cpu%=0.0,pm_mem%=10.1

@glinton
Contributor

glinton commented Jul 25, 2018

Why not use the procstat input plugin?

[[inputs.procstat]]
  ## executable name (ie, pgrep <exe>)
  # exe = "pm"
  ## pattern as argument for pgrep (ie, pgrep -f <pattern>)
  pattern = "pm "

I believe you can use a processor to selectively ignore fields/tags that you don't feel like you want.

@danielnelson
Contributor

Advice to try procstat is good, but read on if you want to debug why the script is failing.

It looks like it works when you run with --test but not when Telegraf is running as a service. This usually means the problem comes from differences in the execution environment between the two modes: the service runs as the telegraf user, with different environment variables and a different working directory.

You should be able to more closely replicate the environment like so:

sudo -H -u telegraf -s
cd $HOME
telegraf --config-directory=/etc/telegraf --test --input-filter=exec
/data/telegraf/top2influxdb.sh

I'm going to guess the issue is that the telegraf user can't write to the current directory, which this line of the script needs to do: top -n 1 -d 1 -b|grep java > total_top
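One way to check that guess is to test whether the current user has write permission on the directory the script runs in. The following is a minimal sketch, not part of Telegraf; the function name check_writable is made up for illustration. It is meant to be run from the shell opened with sudo -H -u telegraf -s.

```shell
#!/bin/sh
# Hypothetical helper (not part of Telegraf): report whether the current
# user has write permission on a directory. Defaults to the working directory.
check_writable() {
    dir=${1:-.}
    if [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "not writable: $dir"
    fi
}
```

Calling check_writable with no argument right after cd $HOME tests the directory that a relative redirect such as > total_top would write to.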

glinton added the discussion label Jul 25, 2018
@beyondht2003
Author

@glinton Thank you, but can the procstat input plugin monitor only one process? We want to monitor multiple processes.

@beyondht2003
Author

@danielnelson Thank you~
Yes, after starting inputs.exec, we could not find the file "total_top" in the directory /data/telegraf. Maybe inputs.exec doesn't support writing files from a script?
We want to monitor multiple processes; it seems that procstat supports only one?

I also tried this, but the test succeeded.
[tao@centos telegraf]$ sudo -H -u telegraf -s
bash-4.1$ cd $HOME
bash-4.1$ telegraf --config-directory=/etc/telegraf --test --input-filter=exec /data/telegraf/top2influxdb.sh
2018/07/26 16:58:47 I! Using config file: /etc/telegraf/telegraf.conf

  • Plugin: inputs.exec, Collection 1

multi_proc_mycolfile,tag=tagvalue,host=centos pm_sc_mem%=10,pm_cp_cpu%=2,pm_cp_mem%=10.7,pm_sc_cpu%=4 1532595528000000000

  • Plugin: inputs.exec, Collection 1

multi_proc_mycolfile,tag=tagvalue,host=centos pm_sc_cpu%=21.8,pm_sc_mem%=10,pm_cp_cpu%=2,pm_cp_mem%=10.7 1532595528000000000
bash-4.1$

@glinton
Contributor

glinton commented Jul 26, 2018

By defining pattern, the procstat plugin will monitor any process that matches the pattern. Regarding the file writing, was there a file located in $HOME after you ran that last test? If so, I wonder what the current working directory is for Telegraf's service.
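The working directory of a running process can be read directly on Linux. This is a small sketch under that assumption; proc_cwd is a made-up name, and reading the entry for a process owned by another user normally requires root.

```shell
#!/bin/sh
# Hypothetical helper: print the working directory of a running process.
# Assumes Linux, where /proc/<pid>/cwd is a symlink to that directory.
proc_cwd() {
    readlink "/proc/$1/cwd"
}
```

Given the PID of the Telegraf service, proc_cwd <pid> prints the directory that relative paths in the script, such as total_top, resolve against.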

@danielnelson
Contributor

The home directory for the telegraf user is normally /etc/telegraf, and the telegraf user shouldn't be able to write to this directory.

useradd -r -M telegraf -s /bin/false -d /etc/telegraf -g telegraf

What is the output of these commands?

sudo -H -u telegraf -s
cd $HOME
pwd
ls -ld /etc/telegraf

@beyondht2003
Author

@danielnelson
[app@centos telegraf]$ sudo -H -u telegraf -s
bash-4.1$ cd $HOME
bash-4.1$ pwd
/etc/telegraf
bash-4.1$ ls -ld /etc/telegraf
drwxr-xr-x 3 root root 4096 Jul 26 17:48 /etc/telegraf
bash-4.1$

@danielnelson
Contributor

Hmm, well I'm not sure why you don't get something like this:

/home/dbn/tmp: line 2: total_top: Permission denied

But the fix should be simple: use the tmp directory for scratch space and don't echo if the values are not found:

pm_pid=`ps -ef|grep "pm "|grep -v grep|awk '{print $2}'`
top -n 1 -d 1 -b|grep java > /tmp/total_top
pm_cpu_percent=`awk '$1 == "'$pm_pid'" { print $9}' /tmp/total_top`
pm_mem_percent=`awk '$1 == "'$pm_pid'" { print $10}' /tmp/total_top`

if [ "$pm_cpu_percent" = "" ] || [ "$pm_mem_percent" = "" ]
then
	exit 0
fi

echo "multi_proc,tag=tagvalue pm_cpu%=$pm_cpu_percent,pm_mem%=$pm_mem_percent"
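A variant that avoids the scratch file altogether is to parse the top output from a pipe. The sketch below is an illustration rather than a drop-in replacement: the function names extract_fields and emit_metric are hypothetical, and it assumes the same column layout the script above relies on (PID in column 1, %CPU in column 9, %MEM in column 10).

```shell
#!/bin/sh
# Sketch only: function names are made up for illustration.

# Read batch-mode top output on stdin and print "<cpu> <mem>" for the
# row whose first column equals the PID given as $1.
# Assumes %CPU is column 9 and %MEM is column 10, as in the script above.
extract_fields() {
    awk -v pid="$1" '$1 == pid { print $9, $10; exit }'
}

# Print one line-protocol point, or nothing if either value is missing,
# so Telegraf never sees a field without a value.
emit_metric() {
    [ -n "$1" ] && [ -n "$2" ] || return 0
    printf 'multi_proc,tag=tagvalue pm_cpu%%=%s,pm_mem%%=%s\n' "$1" "$2"
}
```

It could be wired up as set -- $(top -n 1 -b | extract_fields "$pm_pid") followed by emit_metric "$1" "$2"; nothing is written to disk, so the working directory and its permissions no longer matter.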

You should probably try using procstat too like @glinton suggested.

@beyondht2003
Author

@danielnelson Thank you !
It works after using the tmp directory !

I will try using procstat to monitor multiple processes.

@beyondht2003
Author

We find that procstat writes data for only one process per second to InfluxDB, but we want to monitor the processes "java -Dmumble.app.name=pm-cp" and "java -Dmumble.app.name=pm-sc", and so on.

The data for each process is not continuous: the more processes there are, the longer the gap between points for any one of them.

select pid,process_name from procstat order by time desc limit 10
name: procstat
time                pid   process_name
1532681396000000000 24279 java
1532681395000000000 20058 java
1532681394000000000 24279 java
1532681393000000000 24279 java
1532681392000000000 20058 java
1532681391000000000 24279 java
1532681390000000000 24279 java
1532681389000000000 24279 java
1532681388000000000 20058 java
1532681387000000000 24279 java

@danielnelson
Contributor

Set pid_tag = true in the procstat config, and then you can group by pid. Unfortunately it is still difficult to differentiate which process is which. If you always have a fixed, finite set of processes, you should be able to use two procstat plugins like:

[[inputs.procstat]]
  pattern = "pm-cp"

[[inputs.procstat]]
  pattern = "pm-sc"

If the number of processes is completely dynamic then we don't have a good solution for telling them apart right now.
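For the dynamic case, one workaround that stays with inputs.exec is to have the script derive a tag from each command line. The following is only a sketch of that idea, not a Telegraf feature: the function name tag_by_app, the tag names app and pid, and the field names cpu_percent and mem_percent are all made up for illustration, and it assumes every process of interest carries a -Dmumble.app.name=<name> argument like the two mentioned above.

```shell
#!/bin/sh
# Sketch only: names of the function, tags, and fields are hypothetical.

# Read "PID %CPU %MEM COMMAND ARGS..." lines on stdin and print one
# line-protocol point per process that has a -Dmumble.app.name= argument.
# Lines without that argument are skipped.
tag_by_app() {
    awk 'BEGIN { key = "-Dmumble.app.name=" }
    {
        app = ""
        for (i = 4; i <= NF; i++)
            if (index($i, key) == 1)
                app = substr($i, length(key) + 1)
        if (app != "")
            printf "multi_proc,app=%s,pid=%s cpu_percent=%s,mem_percent=%s\n", app, $1, $2, $3
    }'
}
```

Input in that shape can come from ps -e -o pid= -o pcpu= -o pmem= -o args= piped into tag_by_app. Note that the CPU figure ps reports is, as far as I know, averaged over the lifetime of the process, so it will not match the per-interval value that top shows.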
