Problems with DNS resolution #3919
Comments
This is a fairly generic networking error with a lot of routine reasons it can happen. I suggest asking on stackoverflow where you will get a broader set of eyes. It's unlikely it is specifically an OkHttp error. Feel free to drop your Stackoverflow question here as a reply. |
I'm not sure if this is a generic networking error. I always saw this a lot on fabric but I wasn't able to debug, but now something weird just happened with a user. How's that possible? |
Did you ever get anywhere with this issue @renatopeterman? I have the same issue, it's one of our biggest issues coming in via Fabric and we are unable to reproduce it. |
Hello all, has anybody found a solution to this issue? I'm in the same situation; this is one of our biggest issues too. I would appreciate it if someone could let me know how they solved it. Regards |
You can try using an implementation of the DNS interface that falls back to Google DNS. |
Hi everyone, was anyone able to figure out a solution for this? We are facing same issues with our app |
Have you checked that Wi-Fi is turned on on the device? |
@jonbulica99 - did you ever figure out what was going on? We're getting the same issue and it is basically bricking our app for some android users 😰 , mostly samsung |
Do you have first hand experience with it happening when it definitely shouldn't? This looks like the behaviour you get on newer Android versions when backgrounded and doze mode is on etc. Could these errors be logged by your app while backgrounded and not allowed access to the network? But then testing with a browser (by definition in foreground) the request will succeed? |
@yschimke - I haven't been able to repro on my own. We're just getting reports of some users failing to access our app, and we have a retry screen. Hitting retry (which re-submits the requests) and just restarting the app both don't resurrect their session immediately. One user went and changed his notification settings, and then started the app. For some reason issue went away permanently for him, and then changing the notification settings back to their old status didn't recreate the issue, so it's unclear if that actually did anything. |
Thanks, that's useful. But I don't have a good answer for you at the moment. |
Appreciate the suggestion re: doze mode, I'm going to test this on some Samsung devices and see if I can repro. |
Airplane mode fails this way also, but then would affect all apps. If you have a crash reporter app, it would be interesting to see what else correlates with the errors.
|
See #3974 (comment) |
FWIW adb enhanced is useful for testing similar OS modes https://github.com/ashishb/adb-enhanced |
Thanks for the reference, will research further. We know from talking to them that users experiencing this issue aren't on airplane mode. After surveying a few of the users, it appears the issue ONLY happens on wifi. Our app works fine on cellular data. More details on our issue:
~80% of our app is in React Native, on v0.62.2. We got this issue after upgrading from 0.61.5 to 0.62.2, but that might be unrelated. We also have a native part of our app where we directly use retrofit and okhttp. We use retrofit 2.8.0 so that sets our okhttp to v3.14.7. In an attempt to fix, we have pinned retrofit to Finally, the only adjustment we really make to okhttp in our app:
This is because we were seeing that requests to multipart uploads were queuing up for retry automatically on failure. Later we would sometimes see these uploads get retried, and it would overwhelm our services sometimes with hundreds of uploads with the same image. Here are a few of our issues we have seen coming in through our monitoring.
|
For our app, issues are happening primarily on Samsung S9 and Note 10 with Android 10, only on Wifi. For what it's worth, I'm starting to believe it may be the same as these reports, might not be specific to Okhttp |
Hello. @wildseansy |
Do you have any contextual information
|
I never figured it out completely. From what I have seen, when a user doesn't have a connection, the DNS record seems to get stuck as unreachable, mostly on Samsung devices.
|
If you are particularly motivated you could experiment (A/B test) with Dns over HTTPs (okhttp-doh) as a fall back? Maybe try system dns first, and then follow up with dns over https if it fails, see whether you get results in that case. But there isn't a good fix from OkHttp as by default we rely on |
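A minimal sketch of the "system DNS first, then a fallback" idea suggested above. To stay self-contained it declares a local `Dns` interface with the same single method as `okhttp3.Dns`. The class name `FallbackDns` and the counters are hypothetical; the counters are only there so an A/B experiment could report how often the fallback actually rescued a lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Local stand-in with the same single method as okhttp3.Dns. */
interface Dns {
    List<InetAddress> lookup(String hostname) throws UnknownHostException;
}

/** Tries the primary resolver first and consults the fallback only when it fails. */
final class FallbackDns implements Dns {
    private final Dns primary;
    private final Dns fallback;

    // Hypothetical counters, useful for reporting A/B results.
    final AtomicInteger primaryHits = new AtomicInteger();
    final AtomicInteger fallbackHits = new AtomicInteger();
    final AtomicInteger totalFailures = new AtomicInteger();

    FallbackDns(Dns primary, Dns fallback) {
        this.primary = primary;
        this.fallback = fallback;
    }

    @Override
    public List<InetAddress> lookup(String hostname) throws UnknownHostException {
        try {
            List<InetAddress> result = primary.lookup(hostname);
            primaryHits.incrementAndGet();
            return result;
        } catch (UnknownHostException primaryError) {
            try {
                List<InetAddress> result = fallback.lookup(hostname);
                fallbackHits.incrementAndGet();
                return result;
            } catch (UnknownHostException fallbackError) {
                totalFailures.incrementAndGet();
                // Keep the first failure attached for crash reports.
                if (fallbackError != primaryError) {
                    fallbackError.addSuppressed(primaryError);
                }
                throw fallbackError;
            }
        }
    }
}
```

In an app, the class would presumably implement `okhttp3.Dns` directly, with `Dns.SYSTEM` as the primary and a `DnsOverHttps` instance from the OkHttp DNS-over-HTTPS module as the fallback, installed via `OkHttpClient.Builder.dns(...)`. That wiring is an assumption and is not shown here.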
I have the same error on my personal Huawei P30 Pro VOG-L29, Android 10.1.0, Build 10.1.0.161(c431e23r2p5), both on WiFi and on mobile data. Turning the phone completely off and on didn't fix it.
It is strange that it tries to resolve that address over dns with ipv6 implementation honestly. The related nameservers are cloudflare ones. The base domain
Could it be a subdomain resolving issue? I suppose Android native resolution is "as stupid as windows nslookup" when resolving the host from a specific version onward? Maybe the browser implementation somehow gets around this and uses the IP address of the top domain if the subdomain fails? |
Hey there. This seems to be a good approach! After two hours I got this:
Now the flow is:
Now the call would be transformed to: However this throws an SSL exception for obvious reasons (we just replaced the host, and the certificate doesn't match a floating IP!). Now I am sure this was the problem in my case. Since removing SSL checks is not recommended, do you guys have another idea how to make this work? |
Update: It could be that Cloudflare simply blocks me??? Now I can't open bot.whatismyipaddress.com in the browser either... |
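One way to avoid the certificate mismatch without disabling SSL checks is to leave the hostname in the URL and override only the address lookup. A client that resolves addresses through a pluggable `Dns`-style interface, as OkHttp does, still verifies the certificate against the hostname from the URL. The sketch below uses a local stand-in for `okhttp3.Dns`; `StaticOverrideDns` is a hypothetical name, and any pinned addresses would be assumptions that can go stale when the host changes IPs.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;
import java.util.Map;

/** Local stand-in with the same single method as okhttp3.Dns. */
interface Dns {
    List<InetAddress> lookup(String hostname) throws UnknownHostException;
}

/** Returns pinned addresses for known hostnames and delegates everything else. */
final class StaticOverrideDns implements Dns {
    private final Map<String, List<InetAddress>> overrides;
    private final Dns delegate;

    StaticOverrideDns(Map<String, List<InetAddress>> overrides, Dns delegate) {
        this.overrides = Map.copyOf(overrides);
        this.delegate = delegate;
    }

    @Override
    public List<InetAddress> lookup(String hostname) throws UnknownHostException {
        // The hostname itself is untouched; only the addresses are supplied here.
        List<InetAddress> pinned = overrides.get(hostname);
        return pinned != null ? pinned : delegate.lookup(hostname);
    }
}
```

The addresses can be built with `InetAddress.getByAddress(hostname, bytes)`, which performs no lookup of its own. Pinning addresses for a CDN-hosted domain is fragile, so this is better suited to diagnosing the problem than to shipping.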
Update2: I changed the DNS resolver to system default again and tried with different URLs:
The same issue occurs with checkip.dyndns.org, which is also hosted on Cloudflare. However, ipinfo.io works and is hosted on Google Domains according to a Whois lookup! I suppose Cloudflare is simply blocking native calls from Android devices, i.e. the admin of the website is too strict. Since there is also an issue resolving these Cloudflare domains via DNS, it could be some kind of active DDoS-protection mechanism filtering calls that look "too basic"? I think this is more of a web server admin issue we need to work around... |
For our users, this is happening quite frequently, and we can't ask them to reconnect their Wifi or resort to other hacks. For instance,
This issue is with Firebase API but the same also happens with our own domains. |
Why is this issue closed? |
We don't have a clean repro, and there seems to be some (circumstantial) evidence in #3919 (comment) that it appears outside OkHttp. Do you have a simple repro we can investigate? |
It's not just an issue with OkHttp, I just ripped out OkHttp networking and replaced it with the The good news is that this library is much much better at resolving the hostname from the DNS. The bad news is that the DNS resolution still fails sometimes. My speculation is that some low level network state is borked and the Update: It's not. Dns still corrupts on Http Url as well. |
It's possible that stack is better at resolving DNS for the current network as that changes. But that's a guess. |
Android has this problem but iOS works always fine. |
Is anyone here able to test with a build variant with an alternative OkHttp DNS impl? |
Yes I switched it up to the native http library and got the same DNS corruption issues. |
Even though we can't really reproduce, is there at least a way to prevent these "crashes" from showing up on Firebase? |
I have received some stack traces from our customers, and in our case it looks like it happens when the app is in the background and trying to make its very first request to the server on cold start. (It happens when we receive a push from Firebase and try to get some additional info from the backend to handle the push.) |
As seen in #4789 (comment), Android caches failed (negative) results for 2 seconds, so if your app made a request in doze mode or similar, or while the connection was down, lookups will keep failing until the Android cache is cleared after 2 seconds. |
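If the 2-second negative cache described above is the cause, one possible mitigation is to retry a failed lookup once after waiting slightly longer than that window. This is a sketch under that assumption, not a confirmed fix. `DelayedRetryDns`, `Sleeper`, and the delay value are hypothetical, and `Dns` is a local stand-in for `okhttp3.Dns`.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;

/** Local stand-in with the same single method as okhttp3.Dns. */
interface Dns {
    List<InetAddress> lookup(String hostname) throws UnknownHostException;
}

/** Injectable sleep so the delay can be replaced in tests. */
interface Sleeper {
    void sleep(long millis) throws InterruptedException;
}

/** Retries a failed lookup once after a delay meant to outlast a negative cache entry. */
final class DelayedRetryDns implements Dns {
    private final Dns delegate;
    private final long retryDelayMillis;
    private final Sleeper sleeper;

    DelayedRetryDns(Dns delegate, long retryDelayMillis, Sleeper sleeper) {
        this.delegate = delegate;
        this.retryDelayMillis = retryDelayMillis;
        this.sleeper = sleeper;
    }

    @Override
    public List<InetAddress> lookup(String hostname) throws UnknownHostException {
        try {
            return delegate.lookup(hostname);
        } catch (UnknownHostException firstFailure) {
            try {
                // Blocks the calling thread; acceptable only off the main thread.
                sleeper.sleep(retryDelayMillis);
            } catch (InterruptedException interrupted) {
                Thread.currentThread().interrupt();
                throw firstFailure;
            }
            // Second and final attempt; a failure here propagates to the caller.
            return delegate.lookup(hostname);
        }
    }
}
```

In production the sleeper could be `Thread::sleep` with a delay such as 2500 ms (an assumed value chosen to exceed the 2-second cache). The trade-off is that every genuinely offline lookup then takes that much longer to fail.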
Are you implying that we should check for the current network state and the current doze state before using OkHttp to make sure we don't waste 2 seconds? |
No, maybe. Android has some edge cases that will cause these types of persistent failures. Short of handling them specifically, the best you can do is be aware of these android issues. |
Android fix for this https://cs.android.com/android/_/android/platform/frameworks/base/+/0fa9120b8f72916951b2d070afd6c3dfd3c13f77. I'm not sure if this goes into mainline modules, but I'm assuming not. |
Wait... so on network restore I can simply delay the network stack activation to get around this bug??? |
I've had this happen to a good 10% of my devices, affecting ~1000 users. However, I cannot seem to reproduce this issue. It only happens when connected over Wifi, and multiple users have confirmed to me that they weren't behind any captive portals or anything. They could access the URL normally when typing it into a browser. There isn't any good indicator that this is a vendor-specific issue either. My statistics show a majority of Samsung devices,
![grafik](https://user-images.githubusercontent.com/10694485/37275922-bcdc1ac0-25e0-11e8-88eb-085e6a68bb12.png)
but then again, it works flawlessly on my samsung test device. I don't know what to make of it. Here's the stack trace:
If you need more information, don't hesitate to ask.