Kernel exception: usage_request #1091

Open
marius311 opened this issue Aug 16, 2023 · 8 comments

@marius311
Contributor

On one particular cluster's JupyterLab, I keep getting these messages periodically in random cells:

KERNEL EXCEPTION
KeyError: key "usage_request" not found

Stacktrace:
 [1] getindex(h::Dict{String, Function}, key::String)
   @ Base ./dict.jl:484
 [2] eventloop(socket::ZMQ.Socket)
   @ IJulia ~/.julia/packages/IJulia/6TIq1/src/eventloop.jl:8
 [3] (::IJulia.var"#14#17")()
   @ IJulia ./task.jl:514

Julia installed from binaries, IJulia v1.24.0, Jupyterlab 3.6.5

Julia Version 1.9.2
Commit e4ee485e909 (2023-07-05 09:39 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 128 × AMD EPYC 7763 64-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, znver3)
  Threads: 129 on 128 virtual cores
Environment:
  LD_LIBRARY_PATH = /global/u1/m/marius/.julia/artifacts/fac7e6d8fc4c5775bf5118ab494120d2a0db4d64/lib:/global/u1/m/marius/.julia/artifacts/7661e5a9aa217ce3c468389d834a4fb43b0911e8/lib:/global/u1/m/marius/.julia/artifacts/d00220164876dea2cb19993200662745eed5e2db/lib:/global/u1/m/marius/.julia/juliaup/julia-1.9.2+0.x64.linux.gnu/bin/../lib/julia:/global/u1/m/marius/.julia/artifacts/49d9387d0153ebcfb578e03cd2c58ddff2ef980b/lib:/global/u1/m/marius/.julia/artifacts/416d108e827d01dce771c4cbee18f8dcff37a3b3/lib:/global/u1/m/marius/.julia/artifacts/51cb236ffdb7e1ed1b9d44f14c81f2b84bc46520/lib:/global/u1/m/marius/.julia/juliaup/julia-1.9.2+0.x64.linux.gnu/bin/../lib/julia:/global/u1/m/marius/.julia/juliaup/julia-1.9.2+0.x64.linux.gnu/bin/../lib:/opt/cray/pe/python/3.9.13.1/lib:/opt/cray/pe/gcc-libs:/global/u1/m/marius/lib:/opt/cray/pe/papi/7.0.0.1/lib64:/opt/cray/pe/gcc/11.2.0/snos/lib64:/opt/cray/libfabric/1.15.2.0/lib64:/opt/cray/libfabric/1.15.2.0/lib64
  JULIA_MPI_TRANSPORT = TCP
  JULIA_REVISE_POLL = 1
  JULIA_MPI_PATH = 
  JULIA_PROJECT = @.
  JULIA_MPIEXEC = srun
  JULIA_NO_VERIFY_HOSTS = github.com
  JULIA_NUM_THREADS = 1
  JULIA_PYTHONCALL_EXE = /global/u1/m/marius/.cache/pypoetry/virtualenvs/muse-3g-lO5pQ0cA-py3.8/bin/python
  JULIA_CONDAPKG_BACKEND = Null
@stevengj
Member

stevengj commented Aug 16, 2023

Hmm, I don't see usage_request in the Jupyter protocol docs. @minrk, is this something new?

There was a PR to add a usage_request message, but it was apparently rejected: jupyterlab/jupyterlab#11285

Looks like it might be a kernel extension? https://github.com/jupyter-server/jupyter-resource-usage

I guess we should simply ignore messages we don't understand?

@stevengj
Member

For example, we could replace

invokelatest(handlers[msg.header["msg_type"]], socket, msg)

with

invokelatest(get(handlers, msg.header["msg_type"], default_handler), socket, msg)

where default_handler does nothing.
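A minimal, self-contained sketch of that fallback lookup. `FakeMsg`, the one-entry `handlers` table, and the `dispatch` wrapper are hypothetical stand-ins for illustration, not IJulia's actual types or event loop:

```julia
# Stand-in for an incoming Jupyter message; IJulia's real message type has more fields.
struct FakeMsg
    header::Dict{String,Any}
end

# Hypothetical handler table, keyed by msg_type like the Dict in the stack trace.
const handlers = Dict{String,Function}(
    "execute_request" => (socket, msg) -> :executed,
)

# Fallback for message types we do not recognize: do nothing.
default_handler(socket, msg) = nothing

# Look up the handler with get(), so an unknown msg_type falls back to
# default_handler instead of throwing a KeyError.
function dispatch(socket, msg)
    handler = get(handlers, msg.header["msg_type"], default_handler)
    return Base.invokelatest(handler, socket, msg)
end

known = dispatch(nothing, FakeMsg(Dict{String,Any}("msg_type" => "execute_request")))
unknown = dispatch(nothing, FakeMsg(Dict{String,Any}("msg_type" => "usage_request")))
```

Here `known` comes from the registered handler, while the `usage_request` message is silently dropped and `unknown` is `nothing`.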

@stevengj
Member

stevengj commented Aug 16, 2023

We could additionally add support for the https://github.com/jupyter-server/jupyter-resource-usage extension, I guess, similar to how it is handled in ipykernel: https://github.com/ipython/ipykernel/blob/6bf3fe9e44f1caf4bc371f700ac0c2e9c9c3bd84/ipykernel/kernelbase.py#L996-L1027

Though that would seem to require a Julia equivalent of the Python psutil package, and I'm not sure that exists?

@marius311
Contributor Author

marius311 commented Aug 16, 2023

I don't know how these things work, but as a first step maybe just output it to the Jupyter log instead of the cell stderr? (Or, of course, your suggestion of doing nothing.) I don't actually use that extension, although it looks like the cluster does come with it preinstalled, so that seems like a good theory.
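A rough sketch of the "log it instead" variant, using only the `Logging` standard library. `logging_default_handler` is a hypothetical name, and the `IOBuffer` only stands in for wherever the kernel's log would actually go; how IJulia would route this to the Jupyter server log is not shown here:

```julia
using Logging

# Hypothetical fallback handler: record the unknown message type in a log
# rather than raising an exception into the notebook cell.
function logging_default_handler(socket, msg_type::AbstractString)
    @warn "ignoring unknown Jupyter message" msg_type
    return nothing
end

# For illustration, send log records to a buffer; a real kernel would
# presumably use its own log sink here.
buf = IOBuffer()
with_logger(SimpleLogger(buf)) do
    logging_default_handler(nothing, "usage_request")
end
log_text = String(take!(buf))
```

The warning ends up in `log_text` rather than on the cell's stderr, which is the behaviour being asked for.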

@JBlaschke

Some more context: we see this on jupyter.nersc.gov -- it's intermittent, and seems to vary over time. Due to this, I thought it might be related to other issues we're having at NERSC. However, based on this new information, I am going to revise my theory:

It's possible that this kernel extension was added/changed recently. It makes sense to have something like this on shared nodes. I therefore suggest that:

  • For the short term we go with @stevengj's suggestion and ignore messages we don't understand (or pop them into the logs)
  • I would really like Julia to be a good citizen, so for the long term, I encourage adding the capability to the IJulia kernel to return usage data. I am happy to help with that and test on jupyter.nersc.gov

Ping'ing @rcthomas

@stevengj
Member

#1092 should work around the immediate issue by ignoring unknown requests.

Responding to usage_request messages properly is much more difficult since we don't have the analogue of psutil, and pulling out the relevant information/code seems to be quite complicated to do cross-platform (giampaolo/psutil#2296).
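To give a sense of the gap, here is a sketch of what is available from Julia's Base alone, without a psutil equivalent. The function name `basic_usage_snapshot` and the dictionary keys are illustrative assumptions, not the extension's confirmed reply schema:

```julia
# Gather the subset of usage data that Julia's Base exposes without extra packages.
# The key names below are assumptions chosen for illustration.
function basic_usage_snapshot()
    return Dict{String,Any}(
        "hostname" => gethostname(),
        "pid" => getpid(),
        "cpu_count" => Sys.CPU_THREADS,
        "host_total_memory" => Int(Sys.total_memory()),  # bytes
        "host_free_memory" => Int(Sys.free_memory()),    # bytes
        "kernel_peak_rss" => Int(Sys.maxrss()),          # peak resident set size, bytes
    )
end

snapshot = basic_usage_snapshot()
```

This covers host memory and the kernel's peak memory only. Per-process CPU percentage, current (rather than peak) memory, and child-process accounting are the pieces that would need psutil-like, platform-specific code.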

@ytdHuang

May I ask whether there will be a patch release soon to fix this KERNEL EXCEPTION?
The error messages keep popping up in JupyterLab, which is a bit annoying lol.

@etejedor

Is #1092 ready to be merged? It would be nice to have a patch release that at least silences these messages. Many thanks!

5 participants