"KeyError: key ([#],[#]) not found" #260

Closed
slamander opened this issue Sep 1, 2020 · 18 comments

Comments

@slamander

Hello, CS community!

As a continuation of issue #258, I've encountered another CS error that I do not know how to address. See the error message pasted below:

The relevant files (nodes & resistance layers, init file, and included pair list) are here.

Error: Error happens in Julia.
On worker 3:
KeyError: key (127, 121) not found
getindex at .\dict.jl:467
f at C:\Users\jbaecher\.julia\packages\Circuitscape\XdJRQ\src\core.jl:159
#89 at C:\Users\jbaecher\.julia\packages\Circuitscape\XdJRQ\src\core.jl:223
#106 at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\process_messages.jl:294
run_work_thunk at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\process_messages.jl:79
macro expansion at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\process_messages.jl:294 [inlined]
#105 at .\task.jl:356
Stacktrace:
 [1] (::Base.var"#770#772")(::Task) at .\asyncmap.jl:178
 [2] foreach(::Base.var"#770#772", ::Array{Any,1}) at .\abstractarray.jl:2009
 [3] maptwice(::Function, ::Channel{Any}, ::Array{Any,1}, ::UnitRange{Int64}) at .\asyncmap.jl:178
 [4] wrap_n_exec_twice(::Channel{Any}, ::Array{Any,1}, ::Distributed.var"#206#209"{Distributed.Worker

Thanks in advance for any help or explanation.

Best,

-Alex.

@vlandau
Member

vlandau commented Sep 1, 2020

I'm able to reproduce this. This seems to be an issue with parallel processing plus include_pairs. It did run properly in serial, but I needed to restart Julia to get it to work. For some reason, if you try to run in parallel and get this error, running in serial doesn't work until you restart Julia. Very strange (concerning) behavior.

Summary of observations:

  • When not using include_pairs, in serial or parallel, there are no errors.
  • When using include_pairs but running in serial, no errors (unless you just tried and failed to run in parallel with include_pairs).
  • When using include_pairs and parallel processing, I get the error in the original post above.

@slamander for now, because your resistance grid is so small, I would suggest running in serial (it is extremely fast) by setting parallelize = false in your .ini. Once you update your .ini file, restart Julia and reload Circuitscape. Also, I would recommend using solver = cholmod in your .ini for problems of this size (fewer than 2 million pixels), as it is much faster than cg+amg (about 10x faster -- it took around 2 seconds to run with CHOLMOD).
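
As a rough sketch of that workaround, the two lines to change in the .ini would look something like the fragment below. The option names and values are the ones mentioned above. Section headers are left out on purpose, so put the lines wherever these options already appear in your own file, and leave everything else as it is.

```ini
; Workaround sketch: only these two options change.
; Run in serial to avoid the error seen with parallel processing plus include_pairs.
parallelize = false
; Direct solver, suggested above for grids under roughly 2 million pixels.
solver = cholmod
```

After saving the file, start a fresh Julia session before rerunning, since a failed parallel run appears to leave the old session in a bad state.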

@ranjanan is probably the person to look at this (I think he's dealt with some similar issues in the past). I'm just not familiar with the portions of the code that handle parallel processing or include_pairs. I know @ranjanan is very busy these days, so he may not have time to look at this for a while.

@slamander
Author

Thanks again, @vlandau.

These are all very helpful notes, and--as you mentioned--since this is a very small job (although it is one of several hundred grids I'm applying this to), I'm more than satisfied with this solution.

Stay well out there,

-Alex.

@vlandau vlandau reopened this Sep 2, 2020
@vlandau
Member

vlandau commented Sep 2, 2020

I'm leaving this open just so we can properly debug, but glad it's workable for you for now at least!

@ViralBShah
Member

@vlandau We may want to automatically pick cholmod for small problem sizes, and switch to cg+amg on larger ones. What do you think?
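
To make the idea concrete, here is a minimal sketch of such a heuristic. It is not existing Circuitscape code: `pick_solver` and its `threshold` keyword are hypothetical names, the 2-million-pixel cutoff is just the rule of thumb mentioned earlier in this thread, and the grid dimensions are made up.

```julia
# Hypothetical helper, not part of Circuitscape: choose a solver name from grid size.
# The default cutoff of 2 million pixels is the rule of thumb from this thread.
function pick_solver(n_pixels::Integer; threshold::Integer = 2_000_000)
    # Small problems go to the direct solver, larger ones to the iterative solver.
    n_pixels < threshold ? "cholmod" : "cg+amg"
end

nrows, ncols = 500, 400              # made-up grid dimensions for illustration
solver = pick_solver(nrows * ncols)  # 200_000 pixels, below the cutoff
println("solver = ", solver)         # prints "solver = cholmod"
```

The right cutoff would probably need benchmarking on real problems before anything like this became a default.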

@vlandau
Member

vlandau commented Sep 2, 2020

@ViralBShah I like that idea. I'm planning to do something similar for Omniscape -- once CHOLMOD gets implemented for advanced mode :) (or once the PARDISO solver gets released I might use that if it's comparable in performance)

@ViralBShah
Member

Pardiso wasn't giving enough of a performance improvement, from what I could tell in the open PR. Unless you find it does on certain problems, I would suggest sticking with cholmod.

@ranjanan
Member

ranjanan commented Sep 4, 2020

@slamander the include pairs file is missing from your link (CMR_CAF.txt). Could you please add that to your Google Drive link?

@slamander
Author

Hi, @ranjanan. Thanks for the help! Sorry, I made the mistake of altering the files in this folder while troubleshooting upstream aspects of my workflow... If my alterations produce the same error, I'll be sure to post again. If this same error doesn't throw in my updated workflow, I'll try to work backward to recreate it.

In the future, I'll be sure to preserve the code and files for recreating issues when posted here.

@ranjanan
Member

ranjanan commented Sep 4, 2020

Alright, no problem. Would you like to reopen this issue if you encounter it again?

@slamander
Author

Hey, @ranjanan & @ViralBShah. Sorry, I must not have hit 'comment' when I tried to respond earlier. That sounds like a good idea. I'll let you both know what happens after I complete my workflow.

@vlandau
Member

vlandau commented Sep 4, 2020

I believe I still have the files to reproduce. I can post them and reopen if it's okay with @slamander .

@slamander
Author

@vlandau Fine by me!

@vlandau
Member

vlandau commented Sep 4, 2020

AZE_ARM_CS_inputs.zip
Here are the files @ranjanan.

@vlandau vlandau reopened this Sep 4, 2020
@ranjanan
Member

ranjanan commented Sep 6, 2020

@vlandau did you miss uploading the resistance surface? ERROR: the file "resistance_AZE_ARM.asc" does not exist

@vlandau
Member

vlandau commented Sep 8, 2020

Ah shoot, sorry. I uploaded the .out file instead of the resistance surface. This archive should contain everything.
AZE_AR_CS_inputs_v2.zip

@ranjanan
Member

I don't see this error anymore here, and I can see an output. @vlandau if you get a minute, could you try CS on the latest version of these files to see if the output is correct?

@slamander
Author

Not sure if this is relevant, but I was running Julia through R and needed to reconfigure my interfacing library to alleviate this issue. I can try to provide more details if necessary.

@ranjanan
Member

@slamander nice to hear this issue is alleviated. Can you elaborate on the changes? Did you update R, Julia, and JuliaCall?
