Telegraf crashes with the following error: panic: runtime error: invalid memory address or nil pointer dereference #2061

Closed
shangeo opened this issue Nov 21, 2016 · 4 comments · Fixed by #2151

shangeo commented Nov 21, 2016

I was running telegraf version 1.0.0 on CentOS release 6.5 when it crashed. Here are the last few lines from /var/log/telegraf/telegraf.log:

2016/11/18 10:18:30 Output [influxdb] wrote batch of 1000 metrics in 23.381351ms
2016/11/18 10:18:30 Output [influxdb] wrote batch of 1000 metrics in 23.187129ms
2016/11/18 10:18:30 Output [influxdb] wrote batch of 1000 metrics in 22.511862ms
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x40 pc=0x590035]

goroutine 5344346 [running]:
panic(0x12cefe0, 0xc82000e0c0)
	/usr/local/go/src/runtime/panic.go:481 +0x3e6
github.com/influxdata/telegraf/plugins/inputs/filestat.(*FileStat).Gather(0xc820134f30, 0x7fb15bb3b710, 0xc823395770, 0x0, 0x0)
	/home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/plugins/inputs/filestat/filestat.go:82 +0x675
github.com/influxdata/telegraf/agent.gatherWithTimeout.func1(0xc8202ace40, 0xc820134f90, 0xc823395770)
	/home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/agent/agent.go:163 +0x73
created by github.com/influxdata/telegraf/agent.gatherWithTimeout
	/home/ubuntu/telegraf-build/src/github.com/influxdata/telegraf/agent/agent.go:164 +0xe0

Is this fixed in version 1.1.1? If not, are there any workarounds?

Let me know if you require any other info.

Contributor

sparrc commented Nov 21, 2016

Please provide your config file.

sparrc added this to the 1.2.0 milestone on Nov 21, 2016

Author

shangeo commented Nov 21, 2016

Here you go:

[global_tags]

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = false
  hostname = ""
  omit_hostname = false

[[outputs.influxdb]]
  urls = ["http://influxdb-hostname:8086"] # required
  database = "telegrafdb" # required
  retention_policy = ""
  write_consistency = "any"
  timeout = "5s"

[[inputs.cpu]]
  percpu = true
  totalcpu = true
  fielddrop = ["time_*"]

[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs"]

[[inputs.diskio]]
  skip_serial_number = false

[[inputs.kernel]]

[[inputs.mem]]

[[inputs.processes]]

[[inputs.swap]]

[[inputs.system]]

[[inputs.filestat]]
  files = ["/mnt/iSCSI/**.file"]

[[inputs.net]]

[[inputs.netstat]]

[[inputs.nstat]]


coofercat commented Dec 13, 2016

I have the same issue. It appears to happen when telegraf can't access the file in question because of permissions. That is, if you're running telegraf as an ordinary user and the file exists but is inaccessible to that user, telegraf crashes on startup (with a very cryptic message: no mention of 'permission denied' :-()

I note that a non-existent file is fine; the crash only happens for a file that exists but that you can't access.

The problem can be 'resolved' in one of three ways: run telegraf as root, limit the list of files to ones that the telegraf process user can access, or add the telegraf user to Unix groups that can access the files via group permissions (although you're flat out of luck if the file's permissions mode is 0600 or similar).
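
To illustrate the kind of failure described above, here is a minimal Go sketch. It is not Telegraf's code: `statAll` and `collectSizes` are hypothetical helpers, and it assumes (without confirmation from the trace) that the panic comes from calling a method on a nil `os.FileInfo` after a failed stat. The bogus path only stands in for an inaccessible file, since a missing path is the portable way to make `os.Stat` fail.

```go
package main

import (
	"fmt"
	"os"
)

// statAll is a hypothetical stand-in for a glob matcher: it maps each path
// to its FileInfo, storing nil when the file could not be stat'ed.
func statAll(paths []string) map[string]os.FileInfo {
	out := make(map[string]os.FileInfo, len(paths))
	for _, p := range paths {
		info, err := os.Stat(p)
		if err != nil {
			out[p] = nil // keep the entry, but with no usable info
			continue
		}
		out[p] = info
	}
	return out
}

// collectSizes is a hypothetical helper. It returns sizes for usable entries
// and one error per nil entry instead of dereferencing it.
func collectSizes(files map[string]os.FileInfo) (map[string]int64, []error) {
	sizes := make(map[string]int64)
	var errs []error
	for name, info := range files {
		if info == nil {
			// Calling info.Size() here would panic with
			// "invalid memory address or nil pointer dereference".
			errs = append(errs, fmt.Errorf("cannot stat %q, skipping", name))
			continue
		}
		sizes[name] = info.Size()
	}
	return sizes, errs
}

func main() {
	// "." should always be stat-able; the second path is made up and is
	// expected to fail, standing in for an inaccessible file.
	files := statAll([]string{".", "/nonexistent-dir-for-example/x.file"})
	sizes, errs := collectSizes(files)
	fmt.Println(len(sizes), "file(s) reported")
	for _, e := range errs {
		fmt.Println("E!", e)
	}
}
```

With a guard like this, an unreadable file would show up as a logged error rather than taking the whole process down.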


ashikaumanga commented Oct 6, 2017

@shangeo

I am having this issue when receiving metrics from Flink. I am using Telegraf v1.2.0.
Any tips?

2017-10-06T06:06:41Z E! Error: parsing value to float64: someserver101.taskmanager.53c74f35b7540a32557f494d6b5bcb21.Kafka_To_HDFS_Application.Source- Custom Source.1.latency:{}|g
2017-10-06T06:06:41Z E! Error: parsing value to float64: someserver101.taskmanager.53c74f35b7540a32557f494d6b5bcb21.Kafka_To_HDFS_Application.Sink- Unnamed.1.latency:{LatencySourceDescriptor{vertexID=1, subtaskIndex=1}={p99=3979.3999999999696, p50=0.0, min=0.0, max=5383.0, p95=1.0, mean=46.484375}}|g
2017-10-06T06:06:41Z E! Error: parsing value to float64: someserver101.taskmanager.e3567fa0e5e7378637c481364c6dc36d.Kafka_To_HDFS_Application.Source- Custom Source.0.latency:{}|g
2017-10-06T06:06:41Z E! Error: parsing value to float64: someserver101.taskmanager.e3567fa0e5e7378637c481364c6dc36d.Kafka_To_HDFS_Application.Sink- Unnamed.0.latency:{LatencySourceDescriptor{vertexID=1, subtaskIndex=0}={p99=3843.5199999999663, p50=1.0, min=0.0, max=5413.0, p95=1.0, mean=43.109375}}|g
2017-10-06T06:06:41Z E! Error: parsing value to float64: someserver101.taskmanager.7763dc21543c31eda367f4e9f229481b.Kafka_To_HDFS_Application.Source- Custom Source.2.latency:{}|g
2017-10-06T06:06:41Z E! Error: parsing value to float64: someserver101.taskmanager.7763dc21543c31eda367f4e9f229481b.Kafka_To_HDFS_Application.Sink- Unnamed.2.latency:{LatencySourceDescriptor{vertexID=1, subtaskIndex=2}={p99=0.0, p50=0.0, min=0.0, max=0.0, p95=0.0, mean=0.0}}|g
2017-10-06T06:06:47Z E! Error: splitting '|', Unable to parse metric: 127.0.0.1.jobmanager.Kafka_To_HDFS_Application.lastCheckpointExternalPath:file:/usr/local/flink/repository/dss_ns1/checkpoint_metadata/checkpoint_metadata-8c4356e99488|g
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x654edb]
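
The log above shows gauge lines whose values are not numbers (`{}`, a `LatencySourceDescriptor{...}` map, a `file:` path), followed by a panic. The trace is cut off, so where the panic actually occurs is unknown. The Go sketch below only illustrates the general defensive pattern of returning an error for a malformed `name:value|g` line and never touching the result when the error is set. `gauge` and `parseGauge` are hypothetical names, not Telegraf's statsd parser, and the sample lines in `main` are made up.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// gauge is a hypothetical parsed result, not a Telegraf type.
type gauge struct {
	Name  string
	Value float64
}

// parseGauge parses one "name:value|g" line. On any problem it returns
// (nil, err), so callers must check err before using the result.
func parseGauge(line string) (*gauge, error) {
	// The type suffix is whatever follows the last '|'.
	pipe := strings.LastIndex(line, "|")
	if pipe < 0 || line[pipe+1:] != "g" {
		return nil, fmt.Errorf("not a gauge line: %q", line)
	}
	body := line[:pipe]

	// Split on the first ':' only, so a value such as "file:/path"
	// is kept whole and rejected by the float parse below.
	colon := strings.Index(body, ":")
	if colon <= 0 {
		return nil, fmt.Errorf("missing name or value: %q", line)
	}
	name, raw := body[:colon], body[colon+1:]

	v, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return nil, fmt.Errorf("parsing value %q to float64: %v", raw, err)
	}
	return &gauge{Name: name, Value: v}, nil
}

func main() {
	// Made-up sample lines modelled on the shapes seen in the log.
	lines := []string{
		"cpu.load:1.5|g",
		"app.latency:{}|g",
		"app.path:file:/tmp/x|g",
	}
	for _, l := range lines {
		g, err := parseGauge(l)
		if err != nil {
			fmt.Println("E!", err)
			continue // g is nil here; dereferencing it would panic
		}
		fmt.Printf("%s = %v\n", g.Name, g.Value)
	}
}
```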
