Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poor internet connection breaks wdpa_fetch #39

Closed
PhDyellow opened this issue Oct 27, 2021 · 9 comments
Closed

Poor internet connection breaks wdpa_fetch #39

PhDyellow opened this issue Oct 27, 2021 · 9 comments

Comments

@PhDyellow
Copy link

I am using version 1.3.1.3.

When I called aus_mpa <- wdpa_fetch("Australia", wait = TRUE), I got the following error:

Error: 	 Summary: NoSuchElement
 	 Detail: An element could not be located on the page using the given search parameters.
 	 class: org.openqa.selenium.NoSuchElementException
	 Further Details: run errorDetails method

I am aware of #35, so I made sure to use a recent github version of wdpar. The issue went away when I switched to a faster, non-mobile, connection.

Reading around, it seems like PhantomJS doesn't have a good way to know if the page has loaded yet, would it make sense to parameterize the sleep timer in wdpa_url instead?

@jeffreyhanson
Copy link
Collaborator

jeffreyhanson commented Oct 27, 2021

Thanks for reporting this and verifying that this is issue is a problem with the latest GitHub version. Yeah I think that would work. Just to check I understand correctly, you're suggesting we add a new parameter (e.g. called sleep_duration, maybe you can think of something better? I'm horrible at naming things...) to wdpa_fetch and wdpa_url so that users can control how long the web driver waits for pages to load? So, in your case, you could set the new parameter to something like 5 seconds so it has more time to load the protected planet web pages over a slow internet connection?

@PhDyellow
Copy link
Author

That's right, the sleep_duration you described is what I'm suggesting, and I can't think of a better name either : P

page_load_wait_time?

It might be worth warning the user in the documentation that the delay occurs ~4 times per country code, so a 5 second sleep time means 20 seconds per country.

Overall, I'm impressed with the package, it saves me a lot of time getting MPA polygons into R!

@jeffreyhanson
Copy link
Collaborator

That's right, the sleep_duration you described is what I'm suggesting, and I can't think of a better name either : P

Ok great! Yeah, I feel like page_load_wait_time is more descriptive, but a bit long. What about just wait_time?

page_load_wait_time?

Yeah, I feel like page_load_wait_time is more descriptive, but a bit long. What about just wait_time?

It might be worth warning the user in the documentation that the delay occurs ~4 times per country code, so a 5 second sleep time means 20 seconds per country.

Good point - I'll add that to the documentation

Overall, I'm impressed with the package, it saves me a lot of time getting MPA polygons into R!

Thanks! How urgently do you need this? I've got other stuff I need to focus on this week - but I can try and get this into the GitHub version over the weekend? If you need it urgently ASAP, I suggest forking the repo and manually changing the sleep durations.

@jeffreyhanson
Copy link
Collaborator

PS. there's currently a bug in importing the full global dataset. So if you need that, try the fix-global branch.

@PhDyellow
Copy link
Author

PhDyellow commented Oct 28, 2021

How urgently do you need this?

I think I can work around the bug manually by copying the pre-downloaded data to the right folder, so I can wait until you get the fix into the repo. I won't need the global data for now, but if any of my colleagues ask about it, I'll know where to point them.

What about just wait_time?

wait_time sounds good, as long as it doesn't get confused with the wait parameter that's already in there.

EDIT: fix comment formatting

@jeffreyhanson
Copy link
Collaborator

Ok,great - thanks!

Ah - that's an excellent point. Yeah, I totally forgot about about the wait parameter. How about page_wait_time?

jeffreyhanson added a commit that referenced this issue Oct 29, 2021
@PhDyellow
Copy link
Author

I suggest page_wait: not too long, and starting with page gives some contrast to the wait parameter.

In the end though, I think any of the ideas so far are fine. Thanks for putting together the fix quickly!

@jeffreyhanson
Copy link
Collaborator

jeffreyhanson commented Oct 30, 2021

Ok - sounds good! I'll try and get that merged into the main branch tonight this weekend (I've added it to PR #38 which I'm working on to get the cleaning process working for the global dataset)

@jeffreyhanson
Copy link
Collaborator

I've just merged the PR - so if you install from the main branch, there's the new page_wait parameter to control sleep times. Let me know if you run into any issues with it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants