Closed
Bug 991923
Opened 11 years ago
Closed 10 years ago
Server not found when switching networks on Android 3-5
Categories
(Firefox for Android Graveyard :: General, defect)
Tracking
(firefox34 wontfix, firefox35 wontfix, firefox36 verified, firefox37 verified, relnote-firefox 36+, fennec34+)
VERIFIED
FIXED
Firefox 37
People
(Reporter: microrffr, Assigned: esawin)
References
Details
Attachments
(3 files, 2 obsolete files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
sworkman
:
review+
Sylvestre
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
After I leave a WiFi access point and my phone switches to LTE, trying to open a page shows the "server not found" error. It does this until I restart Nightly (I swipe it from the recent apps and launch it again). Meanwhile, other apps can communicate fine. I'm able to use a terminal emulator to run nslookup commands, and I get results back.
Comment 1•11 years ago
|
||
I've seen this also when the cell signal is switching between 3G and 4G.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 2•11 years ago
|
||
I've seen similar behavior before, though I've never switched apps to verify the phone is able to still access the network. However, the network indicators are still active and seem to indicate the phone has data access.
This also occurs when switching between wifi and LTE for me.
Updated•11 years ago
|
tracking-fennec: --- → ?
Comment 3•11 years ago
|
||
I've seen this, but usually just refreshing the page makes stuff work again. I don't think I've had to restart ever.
Comment 4•11 years ago
|
||
Patrick, any thoughts here? Apparently we're not handling switching networks as well as other apps on the system.
tracking-fennec: ? → +
Flags: needinfo?(mcmanus)
Comment 5•11 years ago
|
||
this is in scope for 939318 which :bagder is working on RIGHT NOW! :)
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(mcmanus)
Resolution: --- → DUPLICATE
This still happens with bug 939318 fixed.
Right now, all I have to do is:
1. launch Nightly with wifi on
2. load a page
3. turn off wifi
4. try to refresh the page
It immediately shows the server not found error page.
Meanwhile, search suggestions work.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Assignee | ||
Comment 8•10 years ago
|
||
The previous comment's device is a Galaxy S4, I'm reproducing this on Nexus 5 (Android 5) on Fennec 34 and up.
Updated•10 years ago
|
tracking-fennec: + → ?
Assignee | ||
Comment 9•10 years ago
|
||
Bug 1024614 introduced DNS cache flushing on link status change, which seems to cause some kind of stale state in the resolver in this case.
Assignee | ||
Comment 10•10 years ago
|
||
Restarting the resolver after flushing fixes it, so there is an issue with the resolver?
Flags: needinfo?(snorp)
Comment 11•10 years ago
|
||
Great find. Just a quick note that shutting down and re-initing the resolver causes other problems (basically there's a short time period where resolves don't work) so it isn't the correct fix.
The question is then - I guess - what in the FlushCache() method is missing?
Comment 12•10 years ago
|
||
It is probably the call to res_ninit() from within nsHostResolver::Init() that actually fixes the problem ? Although the logic in ThreadFunc should already call res_ninit() on name resolve failures so I'm not sure.
Comment 13•10 years ago
|
||
It looks like the configure check for res_ninit is failing, so we don't call it on failure (or ever). Output in config.log is (on my mac, at least):
configure:13193: checking for res_ninit()
configure:13216: /tools/android-ndk-r10b/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-gcc -o conftest -mandroid -fno-short-enums -fno-exceptions -fdiagnostics-color -march=armv7-a -mthumb -mfpu=vfp -mfloat-abi=softfp -mno-unaligned-access -std=gnu99 -fgnu89-inline -fno-strict-aliasing -fno-math-errno -idirafter /tools/android-ndk-r10b/platforms/android-9/arch-arm/usr/include -fdiagnostics-color -mandroid -L/tools/android-ndk-r10b/platforms/android-9/arch-arm/usr/lib -Wl,-rpath-link=/tools/android-ndk-r10b/platforms/android-9/arch-arm/usr/lib --sysroot=/tools/android-ndk-r10b/platforms/android-9/arch-arm -llog -Wl,--allow-shlib-undefined -fdiagnostics-color -mthumb -Wl,-z,noexecstack -Wl,-z,text -Wl,--build-id conftest.c 1>&5
^[[01m^[[Kconfigure:^[[m^[[K In function '^[[01m^[[Kmain^[[m^[[K':
^[[01m^[[Kconfigure:13212:1:^[[m^[[K ^[[01;35m^[[Kwarning: ^[[m^[[Kimplicit declaration of function '^[[01m^[[Kres_ninit^[[m^[[K' [-Wimplicit-function-declaration]
In file included from ^[[01m^[[Kconfigure:13209:0^[[m^[[K:
^[[01m^[[Kconfigure:13212:22:^[[m^[[K ^[[01;31m^[[Kerror: ^[[m^[[Kdereferencing pointer to incomplete type
configure: failed program was:
#line 13201 "configure"
#include "confdefs.h"
#ifdef linux
#define _BSD_SOURCE 1
#endif
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
int main() {
int foo = res_ninit(&_res);
; return 0; }
I can't explain why restarting nsHostResolver would do anything in light of this.
Flags: needinfo?(snorp)
Comment 14•10 years ago
|
||
(In reply to Eugen Sawin [:esawin] from comment #9)
> Bug 1024614 introduced DNS cache flushing on link status change, which seems
> to cause some kind of stale state in the resolver in this case.
It seems likely that resolving has been busted the whole time, but prior to the flushing we just used the cached values.
Comment 15•10 years ago
|
||
That configure check looks wrong to me. '_res' is supposedly an available global variable used by some res_* functions but res_ninit() itself was made partly to get away from that global struct and instead we pass in our own struct pointer as argument! So, assuming _res together with res_ninit() seems wrong.
I don't have an android build setup here, but I would imagine a simple configure adjustment like this:
--- a/configure.in
+++ b/configure.in
@@ -3079,11 +3079,12 @@ AC_CACHE_CHECK(
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
],
- [int foo = res_ninit(&_res);],
+ [ res_state rs;
+ int foo = res_ninit(&rs);],
[ac_cv_func_res_ninit=yes],
[ac_cv_func_res_ninit=no])
fi
])
Comment 16•10 years ago
|
||
Here's the same patch I just mentioned in a comment as a separate clean patch
Comment 17•10 years ago
|
||
(In reply to Daniel Stenberg [:bagder] from comment #15)
But we invoke it that same way in nsHostResolver. So if the test is wrong, so is the actual code where we use it.
AFAICT, there is no res_ninit() definition in the NDK for android-9 (which is what we build against), so I don't see how the test can ever work. I just noticed, however, we have a bunch of resolver stuff in other-licenses/android. Presumably that's the getaddrinfo() impl we end up using. There is no public res_ninit() there either (it's in the resolv_private.h), but maybe some changes are necessary? :sworkman added it so maybe he has an idea?
Flags: needinfo?(sworkman)
Comment 18•10 years ago
|
||
Yes, if the test works after my patch then the code in nsHostResolver.cpp will still fail on Android but that would then be easy to fix too in the same way.
I found res_ninit() for Android on the link below which made me assume it is present:
https://code.google.com/p/android-source-browsing/source/browse/libc/netbsd/resolv/res_init.c?repo=platform--bionic&r=7f84da69f86ed9daf610c8d1129392ba3f7c4405#166
Comment 19•10 years ago
|
||
The bionic res_compat.c file has this comment about _res:
"Binary Compatibility; this symbol does not appear in a header file"
Assignee | ||
Comment 20•10 years ago
|
||
Without res_ninit support, we end up with resolver threads unable to successfully process getaddrinfo requests. A possible fallback could be to simply unwind and restart the resolver threads in this case.
I'm not overly comfortable with having the procedure within FlushCache, but that can be refactored out before landing.
This is a complementary patch to Daniel's, but even with the res_init fallback on Android, we end up in the same situation (I can look into that, too).
Is this something we want?
Attachment #8532592 -
Attachment is obsolete: true
Attachment #8534359 -
Flags: feedback?(daniel)
Assignee | ||
Comment 21•10 years ago
|
||
Comment on attachment 8534359 [details] [diff] [review]
dns-restart-link-change
I'm looking into an alternative way to fix this without thread restarting.
Attachment #8534359 -
Attachment is obsolete: true
Attachment #8534359 -
Flags: feedback?(daniel)
Comment 22•10 years ago
|
||
It looks like the Android resolver actually already tries to detect when a DNS configuration change occurred, and reinit appropriately[0]. My guess is that this isn't working on newer versions of Android for whatever reason. Eugen's looking into making that machinery work again, I believe.
[0] http://dxr.mozilla.org/mozilla-central/source/other-licenses/android/res_data.c?from=res_data.c&case=true#1
Comment 23•10 years ago
|
||
More specific link than the one above: http://dxr.mozilla.org/mozilla-central/source/other-licenses/android/res_data.c?from=res_data.c&case=true#99
Updated•10 years ago
|
tracking-fennec: ? → 34+
status-firefox34:
--- → affected
status-firefox35:
--- → affected
status-firefox36:
--- → affected
status-firefox37:
--- → affected
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → esawin
Comment 25•10 years ago
|
||
Sorry it's taken me so long to comment here.
My guess is that Comment #22 in combination with bug 1110529 may be the problem here. So, we weren't calling out to the system DNS daemon, which re-init's itself on DNS changes (e.g. see ConnectivityService.java, bumpDns()), and I don't think our forked resolver for pre-honeycomb was re-init'ing at all - I'm not 100% certain of this one, so it's worth verifying.
Seems like you're working off of that basis anyway, so good job!
Flags: needinfo?(sworkman)
Assignee | ||
Comment 26•10 years ago
|
||
With bug 694325 introduced, we are supposed to handle DNS requests on pre-Honeycomb Android versions using the in-tree DNS resolver and fallback to the system resolver for Android 3 and up.
However, the fallback has been broken, so have been using the in-tree resolver on all versions.
Bug 1109940 and bug 1110529 fix the symbol wrapping for the in-tree resolver. To fix DNS resolution for Android 3 and up, we only need to fix the Android SDK version detection (will be handled in this bug).
To fix DNS resolution for Android 2.x, we need to make sure that res_(n)init is called on link status changes and that res_get_dns_changed is appropriately set in this case (this will also be handled in this bug).
Assignee | ||
Comment 27•10 years ago
|
||
Set Android SDK version, required for Android 3.x+ fallback to system DNS resolver.
Attachment #8536682 -
Flags: review?(sworkman)
Comment 28•10 years ago
|
||
You think you can release an apk once you feel confident for testing. I am really eager to test.
Comment 29•10 years ago
|
||
(In reply to lostinignorance from comment #28)
> You think you can release an apk once you feel confident for testing. I am
> really eager to test.
It will be in Nightly once it's checked in, which should happen shortly after reviews have concluded.
Assignee | ||
Comment 30•10 years ago
|
||
Try for dns-resolver-sdk-version: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=016e033d542d
(In reply to lostinignorance from comment #28)
> You think you can release an apk once you feel confident for testing. I am
> really eager to test.
This should work for Android 3-5:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/esawin@mozilla.com-016e033d542d/try-android-api-11/fennec-37.0a1.en-US.android-arm.apk
Comment 31•10 years ago
|
||
With the patch posted, I am still unable to get it to resolve any DNS names. I am able to pull pages over the VPN using IP, but not using any DNS. Is there anything I can post to help?
Comment 32•10 years ago
|
||
The build posted works for me on my Nexus 4 (Android 5.0.1, simple OpenVPN client/server setup).
Comment 33•10 years ago
|
||
(In reply to ryan from comment #32)
> The build posted works for me on my Nexus 4 (Android 5.0.1, simple OpenVPN
> client/server setup).
\o/
Comment 34•10 years ago
|
||
Using Private Internet Access on Android 5.0.1 with a Nexus 5 I can report that the Nightly APK provided is working with the PIA VPN app. Immediately upon activating my VPN the latest stable Firefox stoppped connecting and the Nightly continued working.
I will continue playing with the Nightly and my VPN. If the issue appears again I will report back, but SK far so good for the Nightly.
Comment 35•10 years ago
|
||
Chiming in to confirm that the posted build works on Nexus 4, Android 5.0, using OpenVPN.
Comment 36•10 years ago
|
||
The DNS issue works fine now. There is an issue with NTLM challenge, but the dns issue works with Cisco anyconnect VPN.
Comment 37•10 years ago
|
||
forget to mention my device, works using Cisco AnyConnect with my 4.4.4 samsung Tab S
Comment 38•10 years ago
|
||
I have reloaded my phone (as 5.0.1 came out for the Nexus 5) and I can confirm it is now working on the nightly build referenced above.
Comment 39•10 years ago
|
||
Comment on attachment 8536682 [details] [diff] [review]
dns-resolver-sdk-version
Review of attachment 8536682 [details] [diff] [review]:
-----------------------------------------------------------------
Cool - just take a look at strtol. Otherwise, r=me.
::: other-licenses/android/getaddrinfo.c
@@ +428,5 @@
> +#endif
> + return len;
> + }
> +
> + return (int)strtol(version_str, NULL, 10);
strtol(version_str, version_str+len, 10);
no?
Attachment #8536682 -
Flags: review?(sworkman) → review+
Comment 40•10 years ago
|
||
(In reply to Steve Workman [:sworkman] (please use needinfo) from comment #39)
> strtol(version_str, version_str+len, 10);
>
> no?
Nope:
long int strtol(const char *nptr, char **endptr, int base);
and 'endptr' is allowed to be NULL, I'd say the patch's use if it looks fine.
Assignee | ||
Comment 41•10 years ago
|
||
I've tested on Android 2.3.6 (Nexus S) and switching between networks didn't cause any issues with our in-tree resolver. I also don't see any reports on issues with older Android version, so I'm adjusting the bug title.
Assignee | ||
Updated•10 years ago
|
Summary: server not found when switching networks → Server not found when switching networks on Android 3-5
Assignee | ||
Comment 42•10 years ago
|
||
Assignee | ||
Comment 43•10 years ago
|
||
Correct try push: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=016e033d542d
Assignee | ||
Updated•10 years ago
|
Status: REOPENED → ASSIGNED
Comment 44•10 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 37
Assignee | ||
Comment 45•10 years ago
|
||
Comment on attachment 8536682 [details] [diff] [review]
dns-resolver-sdk-version
If approved, it should land after bug 1110529.
Approval Request Comment
[Feature/regressing bug #]: bug 694325
[User impact if declined]: major connectivity issues when switching networks or using VPNs
[Describe test coverage new/current, TBPL]: has been on nightly for some days
[Risks and why]: moderate, fallback to the Android DNS resolver hasn't worked for the past 3 years so this could have side effects, but the benefit is high in terms of increased stability
[String/UUID change made/needed]: none
Attachment #8536682 -
Flags: approval-mozilla-aurora?
Comment 47•10 years ago
|
||
Since we have this bug for a while, I guess it is a wontfix for 35.
Updated•10 years ago
|
Attachment #8536682 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 48•10 years ago
|
||
Comment 49•10 years ago
|
||
I have been using Aurora for several days now as a successful work around for this bug.
I'm not a developer, but I'm happy to provide any info that will help expedite a solution for Firefox. I am using using PIA VPN on a Samsung Galaxy S5 with T-mobile and wifi, with APN protocol set to IPv4. Firefox could not load pages through a VPN, but Aurora is on fire.
Comment 50•10 years ago
|
||
This fix will be rolled out with Firefox 36 in Februrary.
Comment 52•10 years ago
|
||
Thanks for fixing it! Does that mean users with Play Store Firefox will be exposed to dns leaks (for certain people a big privacy issue) for another two months?
Comment 54•10 years ago
|
||
Verified as fixed in builds:
- 36 Beta 1;
- 37.0a1 (2015-01-11);
Device: Nexus 5 (Android 5.0.1).
Assignee | ||
Comment 55•10 years ago
|
||
Release Note Request (optional, but appreciated)
[Why is this notable]: Broken VPN support and general connectivity issues have been an issue for a long time, a lot of users require VPN to connect to/from company networks or for security reasons
[Suggested wording]: Fixed DNS resolution when using VPNs or switching networks
[Links (documentation, blog post, etc)]:
relnote-firefox:
--- → ?
Comment 56•10 years ago
|
||
relnoted in 36 as "Fixed: DNS resolution when using VPNs or switching networks"
Updated•4 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•