Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffer: undefined method `+' for nil:NilClass #1999

Closed
johanneswuerbach opened this issue May 21, 2018 · 3 comments · Fixed by #2016
Closed

Buffer: undefined method `+' for nil:NilClass #1999

johanneswuerbach opened this issue May 21, 2018 · 3 comments · Fixed by #2016

Comments

@johanneswuerbach
Copy link

johanneswuerbach commented May 21, 2018

From time to time fluentd randomly crashes with a "undefined method `+' for nil:NilClass" for us.

We use

fluentd 1.2.0
ruby 2.3.3p222 (2016-11-21) [x86_64-linux-gnu]
splunk plugin https://github.com/brycied00d/fluent-plugin-splunk-http-eventcollector/commit/d8691830a5592db473bd870fc5b8b0d13fbf3dbc (latest master)

with the following config

<match splunk_event.**>
  @id out_splunk
  @type splunk-http-eventcollector
  server "#{ENV["SPLUNK_HOST"]}"
  token "#{ENV["SPLUNK_TOKEN"]}"
  index "#{ENV["SPLUNK_ENVIRONMENT"]}"
  source ${tag_parts[1]}
  sourcetype ${tag_parts[2]}
  all_items true
  batch_size_limit 26214400 # Write up 25mb batches
  <buffer>
    @type file
    path /var/log/fluentd-buffers/forward.buffer
    flush_interval 1s
    flush_thread_count 8
    overflow_action block
    chunk_limit_size 20mb
  </buffer>
</match>

and the full logs are:

2018-05-21 02:55:50 +0000 [error]: [out_splunk] error on output thread error_class=NoMethodError error="undefined method `+' for nil:NilClass"
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/buffer.rb:478:in `block in takeback_chunk'
  2018-05-21 02:55:50 +0000 [error]: /usr/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/buffer.rb:473:in `takeback_chunk'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/output.rb:1144:in `rescue in try_flush'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/output.rb:1079:in `try_flush'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/output.rb:1378:in `flush_thread_run'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/output.rb:440:in `block (2 levels) in start'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2018-05-21 02:55:50 +0000 [warn]: [out_splunk] thread exited by unexpected error plugin=Fluent::SplunkHTTPEventcollectorOutput title=:flush_thread_5 error_class=NoMethodError error="undefined method `+' for nil:NilClass"
2018-05-21 02:55:50 +0000 [error]: unexpected error error_class=NoMethodError error="undefined method `+' for nil:NilClass"
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/buffer.rb:478:in `block in takeback_chunk'
  2018-05-21 02:55:50 +0000 [error]: /usr/lib/ruby/2.3.0/monitor.rb:214:in `mon_synchronize'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/buffer.rb:473:in `takeback_chunk'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/output.rb:1144:in `rescue in try_flush'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/output.rb:1079:in `try_flush'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/output.rb:1378:in `flush_thread_run'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin/output.rb:440:in `block (2 levels) in start'
  2018-05-21 02:55:50 +0000 [error]: /opt/fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.2.0/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2018-05-21 02:55:50 +0000 [error]: unexpected error error_class=NoMethodError error="undefined method `+' for nil:NilClass"

The issue looks similar to #1187, but the splunk plugin is still using the 0.12 interface if I'm not mistaken.

Happy to provide any additional information, but I'm not really sure how to debug this further.

@abicky
Copy link
Contributor

abicky commented Jun 10, 2018

I encountered the same issue.

Steps to reproduce

1. Put the following files

fluent.conf

<source>
  @type dummy
  tag dummy
  rate 2
</source>

<match dummy>
  @type dummy
  <buffer time>
    timekey 2s
    chunk_limit_size 64
    flush_thread_count 16
    flush_interval 1
    flush_mode interval
    retry_max_interval 1
  </buffer>
</match>

lib/fluent/plugin/out_dummy.rb

require 'fluent/plugin/output'

module Fluent::Plugin
  class DummyOutput < Output
    Fluent::Plugin.register_output('dummy', self)

    def write(chunk)
      raise "error" if rand(2).zero?
    end
  end
end

2. Launch fluentd

% git rev-parse HEAD
aa848dedbc8fc5248f2a005a3ac2a33d70f696d9
% bundle exec ./bin/fluentd -c ./fluent.conf
2018-06-11 03:07:34 +0900 [info]: parsing config file is succeeded path="./fluent.conf"
2018-06-11 03:07:34 +0900 [warn]: both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2018-06-11 03:07:34 +0900 [warn]: both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2018-06-11 03:07:34 +0900 [info]: using configuration file: <ROOT>
  <source>
    @type dummy
    tag "dummy"
    rate 2
  </source>
  <match dummy>
    @type dummy
    <buffer time>
      timekey 2s
      chunk_limit_size 64
      flush_thread_count 16
      flush_interval 1
      flush_mode interval
      retry_max_interval 1
    </buffer>
  </match>
</ROOT>
2018-06-11 03:07:34 +0900 [info]: starting fluentd-1.2.1 pid=8350 ruby="2.4.2"
2018-06-11 03:07:34 +0900 [info]: spawn command to main:  cmdline=["/Users/arabiki/.anyenv/envs/rbenv/versions/2.4.2/bin/ruby", "-Eascii-8bit:ascii-8bit", "-rbundler/setup", "./bin/fluentd", "-c", "./fluent.conf", "--under-supervisor"]
2018-06-11 03:07:35 +0900 [info]: gem 'fluentd' version '1.2.1'
2018-06-11 03:07:35 +0900 [info]: adding match pattern="dummy" type="dummy"
2018-06-11 03:07:35 +0900 [info]: adding source type="dummy"
2018-06-11 03:07:35 +0900 [warn]: #0 both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2018-06-11 03:07:35 +0900 [warn]: #0 both of Plugin @id and path for <storage> are not specified. Using on-memory store.
2018-06-11 03:07:35 +0900 [info]: #0 starting fluentd worker pid=8379 ppid=8350 worker=0
2018-06-11 03:07:35 +0900 [info]: #0 fluentd worker is now running worker=0
2018-06-11 03:07:37 +0900 [warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2018-06-11 03:07:38 +0900 chunk="56e4d81fe0c95446807871d7f704a136" error_class=RuntimeError error="error"
  2018-06-11 03:07:37 +0900 [warn]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/out_dummy.rb:8:in `write'
  2018-06-11 03:07:37 +0900 [warn]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:1099:in `try_flush'
  2018-06-11 03:07:37 +0900 [warn]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:1378:in `flush_thread_run'
  2018-06-11 03:07:37 +0900 [warn]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:440:in `block (2 levels) in start'
  2018-06-11 03:07:37 +0900 [warn]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2018-06-11 03:07:38 +0900 [warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2018-06-11 03:07:38 +0900 chunk="56e4d81fe0c95446807871d7f704a136" error_class=RuntimeError error="error"
  2018-06-11 03:07:38 +0900 [warn]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/out_dummy.rb:8:in `write'
  2018-06-11 03:07:38 +0900 [warn]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:1099:in `try_flush'
  2018-06-11 03:07:38 +0900 [warn]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:1378:in `flush_thread_run'
  2018-06-11 03:07:38 +0900 [warn]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:440:in `block (2 levels) in start'
  2018-06-11 03:07:38 +0900 [warn]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2018-06-11 03:07:38 +0900 [warn]: #0 retry succeeded. chunk_id="56e4d81fe0c95446807871d7f704a136"
(snip)
2018-06-11 03:08:23 +0900 [error]: #0 error on output thread error_class=NoMethodError error="undefined method `+' for nil:NilClass"
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/buffer.rb:478:in `block in takeback_chunk'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/.anyenv/envs/rbenv/versions/2.4.2/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/buffer.rb:473:in `takeback_chunk'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:1144:in `rescue in try_flush'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:1079:in `try_flush'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:1378:in `flush_thread_run'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:440:in `block (2 levels) in start'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2018-06-11 03:08:23 +0900 [warn]: #0 thread exited by unexpected error plugin=Fluent::Plugin::DummyOutput title=:flush_thread_0 error_class=NoMethodError error="undefined method `+' for nil:NilClass"
2018-06-11 03:08:23 +0900 [error]: #0 unexpected error error_class=NoMethodError error="undefined method `+' for nil:NilClass"
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/buffer.rb:478:in `block in takeback_chunk'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/.anyenv/envs/rbenv/versions/2.4.2/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/buffer.rb:473:in `takeback_chunk'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:1144:in `rescue in try_flush'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:1079:in `try_flush'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:1378:in `flush_thread_run'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin/output.rb:440:in `block (2 levels) in start'
  2018-06-11 03:08:23 +0900 [error]: #0 /Users/arabiki/ghq/src/github.com/fluent/fluentd/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2018-06-11 03:08:23 +0900 [error]: #0 unexpected error error_class=NoMethodError error="undefined method `+' for nil:NilClass"
  2018-06-11 03:08:23 +0900 [error]: #0 suppressed same stacktrace
2018-06-11 03:08:23 +0900 [info]: Worker 0 finished unexpectedly with status 1

@abicky
Copy link
Contributor

abicky commented Jun 10, 2018

This is a regression of #1919 and occurs when Buffer#takeback_chunk is called after Buffer#purge_chunk deletes @queued_num[metadata] like below:

  1. flush_thtead_1: try_flush & dequeue_chunk
    • @queued_num[metadata] #=> 1
  2. flush_thread_2: try_flush & dequeue_chunk
    • @queued_num[metadata] #=> 0
  3. flush_thread_1: commit_write & purge_chunk
    • @queued_num.delete(metadata)
  4. flush_thread_2: takeback_chunk

abicky added a commit to abicky/fluentd that referenced this issue Jun 10, 2018
abicky added a commit to abicky/fluentd that referenced this issue Jun 10, 2018
@abicky
Copy link
Contributor

abicky commented Jun 10, 2018

I've created a PR #2016.

abicky added a commit to abicky/fluentd that referenced this issue Jun 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants