Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sporadic :badarg error #52

Closed
jvoegele opened this issue Jun 12, 2019 · 18 comments
Closed

Sporadic :badarg error #52

jvoegele opened this issue Jun 12, 2019 · 18 comments
Labels

Comments

@jvoegele
Copy link

jvoegele commented Jun 12, 2019

I'm experimenting with Nebulex multi-level caching with the following structure:

defmodule Hodor.Cache do
  use Nebulex.Cache,
    otp_app: :hodor,
    adapter: Nebulex.Adapters.Multilevel

  defmodule L1 do
    use Nebulex.Cache,
      otp_app: :hodor,
      adapter: Nebulex.Adapters.Local
  end

  defmodule L2 do
    use Nebulex.Cache,
      otp_app: :hodor,
      adapter: NebulexRedisAdapter
  end
end

and the following configuration:

config :hodor, Hodor.Cache,
  levels: [
    Hodor.Cache.L1,
    Hodor.Cache.L2
  ]

config :hodor, Hodor.Cache.L1,
  gc_interval: 5 * 60

config :hodor, Hodor.Cache.L2,
  mode: :standalone,
  conn_opts: [
    # Redis connection parameters are injected here by Hodor.Application.start
  ]

This configuration seems to work fine for the most part, but I get occasional error messages in my console that look like this:

iex(Hodor@Jasons-MacBook-Pro)25> 16:39:05.551 pid=<0.660.0> [error] GenServer Hodor.Cache.L1.Generation terminating
** (ArgumentError) argument error: {:badarg, [{:ets, :new, [:"Elixir.Hodor.Cache.L1.0", [:set, :named_table, :public, {:read_concurrency, true}]], []}, {:shards_owner_sup, :init, 1, [file: '/Users/jvoegele/work/hodor/deps/shards/src/shards_owner_sup.erl', line: 60]}, {:supervisor, :init, 1, [file: 'supervisor.erl', line: 295]}, {:gen_server, :init_it, 2, [file: 'gen_server.erl', line: 374]}, {:gen_server, :init_it, 6, [file: 'gen_server.erl', line: 342]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}
    (shards) /Users/jvoegele/work/hodor/deps/shards/src/shards_local.erl:897: :shards_local.do_new/3
    (nebulex) lib/nebulex/adapters/local/generation.ex:130: Nebulex.Adapters.Local.Generation.new_gen/2
    (nebulex) lib/nebulex/adapters/local/generation.ex:105: Nebulex.Adapters.Local.Generation.handle_info/2
    (stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:711: :gen_server.handle_msg/6
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: :timeout
State: %{cache: Hodor.Cache.L1, gc_interval: 300, gen_index: 0, time_ref: {-576459550140147, #Reference<0.4266848188.3638034439.1982>}}

Any idea what the issue might be?

@cabol
Copy link
Owner

cabol commented Jun 12, 2019

Let me try to reproduce the error to see what could be happening.

@cabol
Copy link
Owner

cabol commented Jun 13, 2019

I've been trying to reproduce the error but nothing, perhaps you should give me more details about how are you testing it, did you run the app with the cache and then executing some commands via console?

The error itself is because the gc_collector is trying to create a new generation with a name that already exists, which should have been removed, it is very weird. Is there a chance you can provide a link with the example you are using to reproduce the error?

@jvoegele
Copy link
Author

Thanks for looking into this. Unfortunately I cannot share a link to the codebase in which I'm experiencing this issue, as it is a private application that relies on several other private applications and libraries to do its work. However, through some further experimentation, I have found a simple way to reproduce the issue consistently. All I need do is, in the console, call the flush() function on the cache, either on the level 1 cache (Hodor.Cache.L1.flush()) or on the multi-level cache (Hodor.Cache.flush()). Then, on the next gc_interval, I will get the error message shown in the original issue description above.

Can you try to reproduce by setting up a multi-level cache as described above, then calling flush() on the L1 cache?

Thanks!

@cabol
Copy link
Owner

cabol commented Jun 13, 2019

Yes, absolutely, that's enough info, I'll try, thanks :) !!

@cabol
Copy link
Owner

cabol commented Jun 13, 2019

I found the issue, just pushed the fix to master. Please try it out and let me know if it works. Stay tuned!

@jvoegele
Copy link
Author

Awesome! Thank you very much.

@jvoegele
Copy link
Author

I tested the fix on master and all is good! Thanks again.

@cabol
Copy link
Owner

cabol commented Jun 13, 2019

Awesome! I will close the issue then! Thanks!

@cabol cabol closed this as completed Jun 13, 2019
@hugomatinho
Copy link

We are getting the same behaviour in the latest version 1.1.1

with the same way of reproducing

call the flush() function on the cache, either on the level 1 cache (Cache.L1.flush()) or on the multi-level cache (Cache.flush()).

consistently outputs the error

Note: it does clear the cache but still outputs the error

15:01:28.184 [error] ** Task #PID<0.908.0> terminating ** Started from Amphora.Cache.Olympus.Generation ** When function == #Function<17.19219966/0 in :shards_local.p_mapred/4> ** arguments == [] ** Reason for termination == ** {:badarg, [{:ets, :internal_delete_all, [:"Elixir.Amphora.Cache.Olympus.0.2", :undefined], []}, {:ets, :delete_all_objects, 1, [file: 'ets.erl', line: 122]}, {:shards_task, :do_apply, 2, [file: '/Users/rodribastos/code/rodrigoruas/bboxsports/amphora/deps/shards/src/shards_task.erl', line: 251]}, {:shards_task, :reply, 3, [file: '/Users/rodribastos/code/rodrigoruas/bboxsports/amphora/deps/shards/src/shards_task.erl', line: 233]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]} ** (exit) exited in: GenServer.call(Amphora.Cache.Olympus.Generation, :flush, 5000) ** (EXIT) an exception was raised: ** (ArgumentError) argument error (stdlib) :ets.internal_delete_all(:"Elixir.Amphora.Cache.Olympus.0.2", :undefined) (stdlib) ets.erl:122: :ets.delete_all_objects/1 (shards) /Users/rodribastos/code/rodrigoruas/bboxsports/amphora/deps/shards/src/shards_task.erl:251: :shards_task.do_apply/2 (shards) /Users/rodribastos/code/rodrigoruas/bboxsports/amphora/deps/shards/src/shards_task.erl:233: :shards_task.reply/3 (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3 (elixir) lib/gen_server.ex:1009: GenServer.call/3

@cabol
Copy link
Owner

cabol commented Feb 12, 2020

I'll take a look at it, thanks!

@cabol cabol reopened this Feb 12, 2020
@cabol
Copy link
Owner

cabol commented Feb 12, 2020

@hugomatinho trying to reproduce the error using near_cache example which uses the multi-level adapter but it works (calling NearCache.flush(), NearCache.L1.flush(), etc., work). Could you try to reproduce the error with the example or maybe if you could provide some code in order to reproduce or it would be helpful.

@hugomatinho
Copy link

hugomatinho commented Feb 13, 2020

Still have error even though config looks fine according to example
config.exs

config :amphora, Amphora.Cache,
    levels: [
      Amphora.Cache.Olympus,
      Amphora.Cache.Hestia,
    ],
    cache_model: :exclusive

  #L1  
  config :amphora, Amphora.Cache.Olympus,
    n_shards: 2,
    gc_interval: 3600,
    local:  Amphora.Cache.Local.Olympus

  config :amphora, Amphora.Cache.Hestia,
    n_shards: 2,
    gc_interval: 3600,
    local:  Amphora.Cache.Local.Hestia
  
  #Local
  config :amphora, Amphora.Cache.Local.Olympus,
    n_shards: 2,
    gc_interval: 3600

  config :amphora, Amphora.Cache.Local.Hestia,
    n_shards: 2,
    gc_interval: 3600`

Amphora.Cache module Cache definition

defmodule Amphora.Cache do
  @moduledoc false
  use Nebulex.Cache,
    otp_app: :amphora,
    adapter: Nebulex.Adapters.Multilevel

    use Amphora.Cache.Hooks

    defmodule Olympus do
      use Nebulex.Cache,
      otp_app: :amphora,
      adapter: Nebulex.Adapters.Local
    end

    defmodule Hestia do
      use Nebulex.Cache,
      otp_app: :amphora,
      adapter: Nebulex.Adapters.Local
    end

  end

Application.ex

defmodule Amphora.Application do
  @moduledoc false

  use Application

  def start(_type, _args) do
    children = [
      # Starts a worker by calling: Amphora.Worker.start_link(arg)
      # {Amphora.Worker, arg}
      Amphora.Cache.Local.Olympus,
      Amphora.Cache.Local.Hestia,
      Amphora.Cache.Olympus,
      Amphora.Cache.Hestia,
      Amphora.Cache,
      Amphora.Consumers.Olympus,
      {Plug.Cowboy,
       scheme: :http, plug: Services.Main, options: [port: 7000], restart: :transient}
    ]

    opts = [strategy: :one_for_one, name: Amphora.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

Cache population

defmodule Amphora.Consumers.Olympus do
  @moduledoc false
  require Logger
  use Broadway
  alias Broadway.Message

  def start_link(_opts  \\ []) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        module: {BroadwaySQS.Producer, queue_url: "https://<amazon_sqs_url_here>}
      ],
      processors: [
        default: []
      ],
      batchers: [
        default: [
          batch_size: 1,
          batch_timeout: 2000
        ]
      ]
    )
  end

  @impl true
  def handle_message(_, %Message{data: _data} = message, _) do
    persist(message)
  end

  @impl true
  def handle_batch(_, messages, _, _) do
    messages
  end

  defp persist(message) do

    content = message.data |> Jason.decode!
    data =  Jason.decode!(content["Message"])
    map = %{
      id: message.metadata.message_id,
      subject: content["Subject"],
      data: data
    }
    decoded_msg = map.data |> Utils.Helpers.convert_to_atom_map
    response = Amphora.Cache.Olympus.set(map.subject <> ":" <> map.id, decoded_msg, return: :object, on_conflict: :override)
    Logger.debug("response: #{inspect(response, pretty: true, limit: :infinity)}")
    message
  end
end

terminal output with the filled cache
iex(1)> Amphora.Cache.all ["league:afd67adf-eb58-482d-ab4c-dd10c48ea535", "league:ab168293-f765-4030-a90c-5dda4dc38d11", "league:8bc623b6-75ac-486e-832b-09281d51d7a2", "league:fb93c513-3e0a-44da-aba6-9d4fe756324a", "league:320bf6e6-c2e6-450c-8fc9-5074983fbe2d", "league:cfb8f8b6-99df-4f77-a58b-922317b770ad", "league:7c7aecec-0254-4170-a528-823698f658b6", "league:ae546340-e35c-43f2-8e68-364b4336b99d", "league:f78971e7-5c01-40e6-9c0c-0841eb4d1937", "league:1bf1b9b8-082e-424a-8185-e8455a9fc259", "league:cd55d435-6967-41f6-985a-22ac296d14a7", "league:0af3ef5a-d9c4-4e04-b525-c525746ae1de", "league:aff2e770-79c6-428d-8211-939643c1ce42", "league:273362b8-84cc-4b5c-beaa-7f0b18c40571", "league:ab75a15a-e7db-4b75-998e-52d31116321d", "league:95e0cb86-3543-41e8-bf2b-dfcb8afb9e75", "league:d5ef72a3-6f80-48b3-8a49-43c6c11dc392", "league:fca10ba6-6975-4fb3-83c2-05ada1b4ed3c", "league:4b68f082-46de-4bcc-864d-1249129c80a2", "league:6fb39d89-4008-4770-a711-b39a438dd44e", "league:d0e6072b-f43e-45ab-b929-f8da37818e30", "league:48b15e84-d7d4-4923-bc13-9a490a5b24f3", "league:048eb35a-5918-464b-a5ec-3b99e9ba2ed5", "league:363c8ca6-7b0a-4f31-bdd7-865567537a37", "league:998f1894-d8cd-4583-9e90-9278ecc6b7d4", "league:77fda692-4fe7-40a0-9481-d12b53bf9e35", "league:fb2841da-49ac-4f37-9473-a75c2cab4185", "league:e5aee78f-255a-409f-954f-effa7cf97727", "league:566b99b8-eda1-49f2-a160-a19a4e3eb71f", "league:75b322ec-0527-485f-b35b-22709780611d", "league:8ecdf655-4767-47c6-81ea-24a1c6dd0920", "league:024b35a0-dd04-43c8-b42c-94641be050f0", "league:3ed4f4c0-ad47-4602-b8a2-aad998fd730c", "league:e34a0415-2a69-4da0-ade0-faddf69837af", "league:5de32b40-2559-4a7b-832e-cb323112a797", "league:2f67b7d0-8cd1-421b-b69f-73da3fa21994", "league:b8c3876a-a7bc-4145-96b9-795afc8f024f", "league:2a477060-1986-4bad-89dd-bea025075df6", "league:cf655a76-bec0-4fa2-a4d3-f4dd0193b43e", "league:d653379d-2567-488f-88e0-2becf463ed7a", "league:c3053aa3-0733-4b5a-9197-47358b654c59", "league:5a59eefb-471f-41f9-bdaa-f6dfef878d37", "league:c28de36e-4e7b-4041-adef-debfad038d2d", "league:4a14564c-e889-4c09-a124-dbe9605707f4", "league:164887d8-7676-4c41-9d32-6111ec22bf7f"]

terminal output on error

iex(2)> Amphora.Cache.flush

12:11:39.968 [error] ** Task #PID<0.786.0> terminating
** Started from Amphora.Cache.Olympus.Generation
** When function  == #Function<17.19219966/0 in :shards_local.p_mapred/4>
**      arguments == []
** Reason for termination == 
** {:badarg, [{:ets, :internal_delete_all, [:"Elixir.Amphora.Cache.Olympus.0.5", :undefined], []}, {:ets, :delete_all_objects, 1, [file: 'ets.erl', line: 122]}, {:shards_task, :do_apply, 2, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 251]}, {:shards_task, :reply, 3, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 233]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}

 
12:11:39.969 [error] ** Task #PID<0.785.0> terminating
** Started from Amphora.Cache.Olympus.Generation
** When function  == #Function<17.19219966/0 in :shards_local.p_mapred/4>
**      arguments == []
** Reason for termination == 
** {:badarg, [{:ets, :internal_delete_all, [:"Elixir.Amphora.Cache.Olympus.0.4", :undefined], []}, {:ets, :delete_all_objects, 1, [file: 'ets.erl', line: 122]}, {:shards_task, :do_apply, 2, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 251]}, {:shards_task, :reply, 3, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 233]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}


12:11:39.969 [error] ** Task #PID<0.783.0> terminating
** Started from Amphora.Cache.Olympus.Generation
** When function  == #Function<17.19219966/0 in :shards_local.p_mapred/4>
**      arguments == []
** Reason for termination == 
** {:badarg, [{:ets, :internal_delete_all, [:"Elixir.Amphora.Cache.Olympus.0.2", :undefined], []}, {:ets, :delete_all_objects, 1, [file: 'ets.erl', line: 122]}, {:shards_task, :do_apply, 2, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 251]}, {:shards_task, :reply, 3, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 233]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}

 
12:11:39.969 [error] ** Task #PID<0.784.0> terminating
** Started from Amphora.Cache.Olympus.Generation
** When function  == #Function<17.19219966/0 in :shards_local.p_mapred/4>
**      arguments == []
** Reason for termination == 
** {:badarg, [{:ets, :internal_delete_all, [:"Elixir.Amphora.Cache.Olympus.0.3", :undefined], []}, {:ets, :delete_all_objects, 1, [file: 'ets.erl', line: 122]}, {:shards_task, :do_apply, 2, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 251]}, {:shards_task, :reply, 3, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 233]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}


12:11:39.969 [error] ** Task #PID<0.787.0> terminating
** Started from Amphora.Cache.Olympus.Generation
** When function  == #Function<17.19219966/0 in :shards_local.p_mapred/4>
**      arguments == []
** Reason for termination == 
** {:badarg, [{:ets, :internal_delete_all, [:"Elixir.Amphora.Cache.Olympus.0.6", :undefined], []}, {:ets, :delete_all_objects, 1, [file: 'ets.erl', line: 122]}, {:shards_task, :do_apply, 2, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 251]}, {:shards_task, :reply, 3, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 233]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}


12:11:39.969 [error] ** Task #PID<0.788.0> terminating
** Started from Amphora.Cache.Olympus.Generation
** When function  == #Function<17.19219966/0 in :shards_local.p_mapred/4>
**      arguments == []
** Reason for termination == 
** {:badarg, [{:ets, :internal_delete_all, [:"Elixir.Amphora.Cache.Olympus.0.7", :undefined], []}, {:ets, :delete_all_objects, 1, [file: 'ets.erl', line: 122]}, {:shards_task, :do_apply, 2, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 251]}, {:shards_task, :reply, 3, [file: '/Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl', line: 233]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}

** (exit) exited in: GenServer.call(Amphora.Cache.Olympus.Generation, :flush, 5000)
    ** (EXIT) an exception was raised:
        ** (ArgumentError) argument error
            (stdlib 3.11.2) :ets.internal_delete_all(:"Elixir.Amphora.Cache.Olympus.0.3", :undefined)
            (stdlib 3.11.2) ets.erl:122: :ets.delete_all_objects/1
            (shards 0.6.1) /Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl:251: :shards_task.do_apply/2
            (shards 0.6.1) /Users/hugo/Documents/Work/amphora/deps/shards/src/shards_task.erl:233: :shards_task.reply/3
            (stdlib 3.11.2) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
    (elixir 1.10.1) lib/gen_server.ex:1023: GenServer.call/3
    (elixir 1.10.1) lib/enum.ex:783: Enum."-each/2-lists^foreach/1-0-"/2
    (elixir 1.10.1) lib/enum.ex:783: Enum.each/2
    (amphora 0.1.0) lib/amphora/cache/amphora.ex:3: Amphora.Cache.with_hooks/3

@hugomatinho
Copy link

@cabol any idea regarding the above ?

@cabol
Copy link
Owner

cabol commented Feb 18, 2020

Hey @hugomatinho , sorry, I haven't had time to look into it. I will try to check it out as soon as I can, hopefully during this week but for sure this weekend. Thanks and I'll get back to you soon!

@cabol
Copy link
Owner

cabol commented Feb 23, 2020

Hey, @hugomatinho good news, I found the bug, thank you so much for reporting it, I will push a fix today, I'll let you know when it is ready!

cabol added a commit that referenced this issue Feb 23, 2020
Overall enhancements
cabol added a commit that referenced this issue Feb 23, 2020
Overall enhancements
@cabol
Copy link
Owner

cabol commented Feb 23, 2020

@hugomatinho I just pushed the fix on the master branch, in the meantime, you'll need to point there – I'm aiming to release v1.2.0 very soon.

On the other hand, checking out your config, I don't understand why or how you are using this:

#Local
config :amphora, Amphora.Cache.Local.Olympus,
  n_shards: 2,
  gc_interval: 3600

config :amphora, Amphora.Cache.Local.Hestia,
  n_shards: 2,
  gc_interval: 3600`

For the 2-level cache you want, you just need:

config :amphora, Amphora.Cache,
  levels: [
    Amphora.Cache.Olympus,
    Amphora.Cache.Hestia,
  ],
  cache_model: :exclusive

#L1  
config :amphora, Amphora.Cache.Olympus,
  n_shards: 2,
  gc_interval: 3600

config :amphora, Amphora.Cache.Hestia,
  n_shards: 2,
  gc_interval: 3600

You don't need to add the local: option there, it is needed if you use the partitioned adapter in order to specify which will be your local cache or backend, but since you are using only the local adapter, it is not necessary.

Let me know if now it works for you, stay tuned!

Thanks 👍

@cabol
Copy link
Owner

cabol commented Feb 25, 2020

Hey, any update on this? did it work for you?

@hugomatinho
Copy link

@cabol just tested it, it worked! awesome! thank you so much :) i'll have a look at the fix so i can learn as well

@cabol cabol closed this as completed Feb 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants