Closed
Bug 968200
Opened 11 years ago
Closed 11 years ago
Intermittent uriloader/exthandler/tests/mochitest/test_unsafeBidiChars.xhtml | application timed out after 330 seconds with no output because we run out of disk space
Categories
(Core :: Networking, defect)
Tracking
()
RESOLVED
FIXED
mozilla30
Tracking | Status | |
---|---|---|
firefox28 | --- | fixed |
firefox29 | --- | fixed |
firefox30 | --- | fixed |
firefox-esr24 | --- | unaffected |
People
(Reporter: cbook, Assigned: ehsan.akhgari)
References
(Depends on 1 open bug, )
Details
(Keywords: intermittent-failure)
Attachments
(2 files)
Android 2.2 Tegra mozilla-inbound opt test mochitest-8 on 2014-02-05 04:32:21 PST for push 537c83f1c323
slave: tegra-249
https://tbpl.mozilla.org/php/getParsedLog.php?id=34130392&tree=Mozilla-Inbound
TEST-UNEXPECTED-FAIL | /tests/uriloader/exthandler/tests/mochitest/test_unsafeBidiChars.xhtml | application timed out after 330 seconds with no output
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 18•11 years ago
|
||
Ehsan, can you please take a look at this? I think this has been an ongoing failure that was just obscured by the 2400s timeouts in bug 663657 until the recent harness change that better exposed the underlying test issues.
Component: General → File Handling
Flags: needinfo?(ehsan)
Product: Firefox for Android → Core
Updated•11 years ago
|
Summary: Intermittent TEST-UNEXPECTED-FAIL | /tests/uriloader/exthandler/tests/mochitest/test_unsafeBidiChars.xhtml | application timed out after 330 seconds with no output → Intermittent uriloader/exthandler/tests/mochitest/test_unsafeBidiChars.xhtml | application timed out after 330 seconds with no output
Assignee | ||
Comment 19•11 years ago
|
||
Do you know if the output of window.dump() goes anywhere in these tests?
Flags: needinfo?(ehsan) → needinfo?(ryanvm)
Comment 20•11 years ago
|
||
I somehow doubt it. ISTR that Android doesn't get a lot of useful stuff in the logs due to how the harness works. I would ask in #ateam.
Flags: needinfo?(ryanvm)
Assignee | ||
Comment 21•11 years ago
|
||
OK, pushed a debugging patch to use info() instead for a whole bunch of things. This should give us an idea on where the failure is roughly the next time this test fails:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3a10c4605795
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee: nobody → ehsan
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Assignee | ||
Updated•11 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 28•11 years ago
|
||
Whiteboard: [leave open]
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 30•11 years ago
|
||
Brad, is there a good way of getting NSPR log output on Android on the try server? I suspect that there may be an actual bug here. We just don't seem to call into the external app handler here at all, so I no longer think this is a test bug.
Flags: needinfo?(blassey.bugs)
Comment 31•11 years ago
|
||
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 35•11 years ago
|
||
Ehsan, the best answer I have is to patch GeckoLoader.setupGeckoEnvironment() to add an env var to turn on NSPR logging.
NI'ing gbrown to see if he has a better idea.
https://mxr.mozilla.org/mozilla-central/source/mobile/android/base/mozglue/GeckoLoader.java.in#122
Flags: needinfo?(blassey.bugs) → needinfo?(gbrown)
Assignee | ||
Comment 36•11 years ago
|
||
Thanks! Will that cause the log to go to the test output? If yes, that should work well for me.
Comment 37•11 years ago
|
||
There is a fairly new option here, for enabling NSPR logging in mochitests: http://hg.mozilla.org/mozilla-central/annotate/d8d8fa98ee7d/testing/mochitest/runtests.py#l56. I have not tried using that on Android, but it looks promising.
Flags: needinfo?(gbrown)
Assignee | ||
Comment 38•11 years ago
|
||
Thanks a lot! Here's a try push which should hopefully give us the NSPR logs, and help us understand what's going on.
Assignee | ||
Comment 39•11 years ago
|
||
Assignee | ||
Comment 40•11 years ago
|
||
Hmm, I don't see any links to download the NSPR logs from <https://tbpl.mozilla.org/php/getParsedLog.php?id=34478849&full=1&branch=try>. What am I doing wrong?
Flags: needinfo?(gbrown)
Comment 41•11 years ago
|
||
Darn. I suspect you didn't do anything wrong, but this is not working the way it should on Android.
I think there are 2 problems:
1. On Android 4.0 Opt, there is a failure:
19:00:25 INFO - Mochi-Remote ERROR | Automation Error: Exception caught while running tests
19:00:25 INFO - Traceback (most recent call last):
19:00:25 INFO - File "/builds/panda-0097/test/build/tests/mochitest/runtestsremote.py", line 731, in main
19:00:25 INFO - retVal = mochitest.runTests(options)
19:00:25 INFO - File "/builds/panda-0097/test/build/tests/mochitest/runtests.py", line 1069, in runTests
19:00:25 INFO - with zipfile.ZipFile("%s/nsprlog.zip" % browserEnv["MOZ_UPLOAD_DIR"], "w", zipfile.ZIP_DEFLATED) as logzip:
19:00:25 INFO - KeyError: 'MOZ_UPLOAD_DIR'
2. On Android 2.2 Opt, blobber uploads are not working -- probably because those tests do not use mozharness.
I'll open a bug for (1) and ask around about (2) -- will try to find a solution for you today.
Flags: needinfo?(gbrown)
Assignee | ||
Comment 42•11 years ago
|
||
(In reply to comment #41)
> Darn. I suspect you didn't do anything wrong, but this is not working the way
> it should on Android.
>
> I think there are 2 problems:
>
> 1. On Android 4.0 Opt, there is a failure:
> 19:00:25 INFO - Mochi-Remote ERROR | Automation Error: Exception caught
> while running tests
> 19:00:25 INFO - Traceback (most recent call last):
> 19:00:25 INFO - File
> "/builds/panda-0097/test/build/tests/mochitest/runtestsremote.py", line 731, in
> main
> 19:00:25 INFO - retVal = mochitest.runTests(options)
> 19:00:25 INFO - File
> "/builds/panda-0097/test/build/tests/mochitest/runtests.py", line 1069, in
> runTests
> 19:00:25 INFO - with zipfile.ZipFile("%s/nsprlog.zip" %
> browserEnv["MOZ_UPLOAD_DIR"], "w", zipfile.ZIP_DEFLATED) as logzip:
> 19:00:25 INFO - KeyError: 'MOZ_UPLOAD_DIR'
Hmm, I'm not sure what this means...
> 2. On Android 2.2 Opt, blobber uploads are not working -- probably because
> those tests do not use mozharness.
>
> I'll open a bug for (1) and ask around about (2) -- will try to find a solution
> for you today.
Thanks! FWIW if there's any other way to get the NSPR logs, it will be fine with me!
Comment 43•11 years ago
|
||
The "normal" way to obtain NSPR logs in tbpl is to copy them to the "blobber" aws server. On Android 4.0, blobber is supported, but the NSPR -> blobber code is broken -- I'll fix that.
But you probably really want an Android 2.2 log, and blobber is not supported there. I asked and there are conflicting opinions about how difficult it would be to support blobber on Android 2.2. Rather than investigate that, let's just cat the nspr logs and see if that works:
https://tbpl.mozilla.org/?tree=Try&rev=5cf4e3f38015
Assignee | ||
Comment 44•11 years ago
|
||
(In reply to comment #43)
> The "normal" way to obtain NSPR logs in tbpl is to copy them to the "blobber"
> aws server. On Android 4.0, blobber is supported, but the NSPR -> blobber code
> is broken -- I'll fix that.
Thanks!
> But you probably really want an Android 2.2 log,
Correct.
> and blobber is not supported
> there. I asked and there are conflicting opinions about how difficult it would
> be to support blobber on Android 2.2. Rather than investigate that, let's just
> cat the nspr logs and see if that works:
>
> https://tbpl.mozilla.org/?tree=Try&rev=5cf4e3f38015
Thanks for your help! :-)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 49•11 years ago
|
||
Another try: https://tbpl.mozilla.org/?tree=Try&rev=5078af5501df
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 51•11 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #49)
> Another try: https://tbpl.mozilla.org/?tree=Try&rev=5078af5501df
It looks like nothing went wrong in that run, but I don't see any NSPR output. I will debug locally...
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 55•11 years ago
|
||
This one gives NSPR output: https://tbpl.mozilla.org/?tree=Try&rev=22a22e7f14b1
Ehsan -- hope this gives you what you need.
Updated•11 years ago
|
Flags: needinfo?(ehsan)
Assignee | ||
Comment 56•11 years ago
|
||
Yes it does, thanks a lot! Now, all we need is to get this test to fail. :-)
Flags: needinfo?(ehsan)
Comment 57•11 years ago
|
||
I'm trying! :P
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 68•11 years ago
|
||
(In reply to comment #57)
> I'm trying! :P
It worked. I'll try to look at the failure log as soon as I get some free time...
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 74•11 years ago
|
||
So there is this line in the log:
OnStopRequest[60976830](http://mochi.test:8888/tests/uriloader/exthandler/tests/mochitest/unsafeBidiFileName.sjs?name=%C3%A2%C2%80%C2%AA.test) status=804b0004 mIsLoadingDocument=true, 0 active URLs
which is basically where the load stops. 0x804b0004 is NS_NET_STATUS_CONNECTED_TO, which is only set from here: http://dxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsSocketTransport2.cpp#1572
Time to turn on the socket transport logging: https://tbpl.mozilla.org/?tree=Try&rev=b8f2f9cc414b
And moving the bug to Core::Networking because it seems like we're hitting a bug there.
Component: File Handling → Networking
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 78•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=f357ba820f5f
Hmm, this log <https://tbpl.mozilla.org/php/getParsedLog.php?id=34746774&tree=Try&full=1#error0> doesn't include any socket transport logs. Not sure what I'm doing wrong... :/
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 115•11 years ago
|
||
ehsan, re: comment did you include "nsSocketTransport:5" in the list of NSPR modules to log?:
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Flags: needinfo?(ehsan)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 117•11 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #115)
> ehsan, re: comment did you include "nsSocketTransport:5" in the list of
> NSPR modules to log?:
>
> https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Yes, any idea why that wouldn't work on Android?
Flags: needinfo?(ehsan) → needinfo?(jduell.mcbugs)
Comment 118•11 years ago
|
||
ehsan: are you getting any NSPR log at all, or just missing the nsSocketTransport msgs?
Flags: needinfo?(jduell.mcbugs) → needinfo?(ehsan)
Comment 119•11 years ago
|
||
In the logs of Comment 78, I see DocLoader NSPR messages, but nothing else. I wonder if we are losing part of the environment variable somewhere. It might be worth trying with just nsSocketTransport:5, or changing the order of modules so that nsSocketTransport is first: "nsSocketTransport:5,DocLoader:5,..."
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 124•11 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #118)
> ehsan: are you getting any NSPR log at all, or just missing the
> nsSocketTransport msgs?
As discussed on irc, I was getting NSPR logs, just not any nsSocketTransport messages. I'll try Geoff's suggestion in comment 119 now.
Flags: needinfo?(ehsan)
Assignee | ||
Comment 125•11 years ago
|
||
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 133•11 years ago
|
||
OK, we have a log now finally! <https://tbpl.mozilla.org/php/getParsedLog.php?id=35298635&tree=Try&full=1#error0>
But this only gave us socket transport logs, not the rest. Seems like nsSocketTransport logs are mutually exclusive with anything else?!
I can't make much sense out of this log. Jason, would you mind lending a hand please? Please see comment 74 for my previous attempt to explain what's happening until we got to the networking code.
Thanks!
Flags: needinfo?(jduell.mcbugs)
Comment 134•11 years ago
|
||
> Seems like nsSocketTransport logs are mutually exclusive with anything else?
No, I think we probably have some sort of Android bug with environment variables getting truncated.
I'll look at this soon--too swamped with other things tonight.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 144•11 years ago
|
||
Pushed to try with: NSPR_LOG_MODULES = "all:5"
https://tbpl.mozilla.org/?tree=Try&rev=f0f27a53af2c
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 148•11 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness, emailapocalypse) from comment #144)
> Pushed to try with: NSPR_LOG_MODULES = "all:5"
>
> https://tbpl.mozilla.org/?tree=Try&rev=f0f27a53af2c
"Output exceeded 52428800 bytes, remaining output has been truncated (output was 52471609 bytes)"
sigh... :(
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 180•11 years ago
|
||
So I guess we're kind of stuck here until the logging issues are sorted out? This is still a top-10 orange :(
Assignee | ||
Comment 181•11 years ago
|
||
(In reply to comment #180)
> So I guess we're kind of stuck here until the logging issues are sorted out?
Yes, unfortunately. I tried anything I could think of to work around that... Suggestion welcome.
> This is still a top-10 orange :(
I really doubt that this is a test problem, and would therefore really hate for this test to be disabled and then people forgetting about this bug, if that makes sense.
Assignee | ||
Comment 182•11 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness, emailapocalypse) from comment #181)
> (In reply to comment #180)
> > So I guess we're kind of stuck here until the logging issues are sorted out?
>
> Yes, unfortunately. I tried anything I could think of to work around
> that... Suggestion welcome.
Flags: needinfo?(gbrown)
Comment 183•11 years ago
|
||
There's still a chance that :jduell will be able to interpret the nsSocketTransport logs, isn't there?
I have had trouble finding time for bug 977589, but will try to push that forward soon.
Flags: needinfo?(gbrown)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 187•11 years ago
|
||
As I mentioned in email, I think the next thing to try here is :mayhemer's recipe for NSPR logging with mochitest:
http://www.janbambas.cz/get-nspr-log-from-tryserver-run/
Flags: needinfo?(jduell.mcbugs)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 190•11 years ago
|
||
(In reply to comment #187)
> As I mentioned in email, I think the next thing to try here is :mayhemer's
> recipe for NSPR logging with mochitest:
>
> http://www.janbambas.cz/get-nspr-log-from-tryserver-run/
Isn't that the thing that doesn't yet work on Android?
(Also, which email? ;-)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 192•11 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #187)
> As I mentioned in email, I think the next thing to try here is :mayhemer's
> recipe for NSPR logging with mochitest:
>
> http://www.janbambas.cz/get-nspr-log-from-tryserver-run/
That relies on blobber (aws) support, which is not available on Android 2.2 -- comment 44.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 194•11 years ago
|
||
Patrick,
Can you look at this? It's a top 10 orange. Per comment 74 it looks like we're somehow getting OnStopRequest called with status==NS_NET_STATUS_CONNECTED_TO, which seems like a weird channel result status. It's getting propagated up somehow via OnTransportStatus.
You can find the channel in question by searching for 5d2d2830 in the log.
Flags: needinfo?(mcmanus)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 198•11 years ago
|
||
I thought of a hack to work around bug 977589:
https://tbpl.mozilla.org/?tree=Try&rev=56545bdc82ed
In this try push, I replace the NSPR_LOG_MODULES delimiter with a slash, and use that in the test runner script. Let's see if this is going to work!
Assignee | ||
Comment 199•11 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness, emailapocalypse) from comment #198)
> I thought of a hack to work around bug 977589:
>
> https://tbpl.mozilla.org/?tree=Try&rev=56545bdc82ed
>
> In this try push, I replace the NSPR_LOG_MODULES delimiter with a slash, and
> use that in the test runner script. Let's see if this is going to work!
It didn't work...
Assignee | ||
Comment 200•11 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness, emailapocalypse) from comment #199)
> (In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness,
> emailapocalypse) from comment #198)
> > I thought of a hack to work around bug 977589:
> >
> > https://tbpl.mozilla.org/?tree=Try&rev=56545bdc82ed
> >
> > In this try push, I replace the NSPR_LOG_MODULES delimiter with a slash, and
> > use that in the test runner script. Let's see if this is going to work!
>
> It didn't work...
Geoff, did I do anything obviously wrong there?
Flags: needinfo?(gbrown)
Comment 201•11 years ago
|
||
When I looked at your push yesterday I was sure it would work!
But now I think you forgot to include my changes from https://hg.mozilla.org/try/rev/0f9f983844d0.
Flags: needinfo?(gbrown)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 204•11 years ago
|
||
(In reply to comment #201)
> When I looked at your push yesterday I was sure it would work!
>
> But now I think you forgot to include my changes from
> https://hg.mozilla.org/try/rev/0f9f983844d0.
Gah, you're right! Here's another attempt: https://tbpl.mozilla.org/?tree=Try&rev=f5f3fe614624
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 207•11 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness, emailapocalypse) from comment #204)
> (In reply to comment #201)
> > When I looked at your push yesterday I was sure it would work!
> >
> > But now I think you forgot to include my changes from
> > https://hg.mozilla.org/try/rev/0f9f983844d0.
>
> Gah, you're right! Here's another attempt:
> https://tbpl.mozilla.org/?tree=Try&rev=f5f3fe614624
So here is a test failure log:
https://tbpl.mozilla.org/php/getParsedLog.php?id=35674796&tree=Try&full=1
But I only see the socket transport log there and *nothing* else. :( What am I doing wrong?
Flags: needinfo?(gbrown)
Comment 208•11 years ago
|
||
http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/847c492b44d08aacadb66bf471d329cea8d0cf09f0b743f5daf0beb7bf50a4c470139549fcec38c354f5a6ea2cf5a4c378423b1f636499d388ffd3e6129f58f7 is the logcat for the Android 4.0 run and it shows:
03-05 10:52:12.234 D/GeckoLoader( 2229): env20: NSPR_LOG_MODULES=nsSocketTransport:5/DocLoader:5/URILoader:5/HelperAppService:5
I cannot verify that for Android 2.2, but I'm 99% sure it would be the same.
I don't know prlog.c well, but notice there are 2 functions that access NSPR_LOG_MODULES and both sscanf for "," -- don't you need to change both _PR_InitLog and _PR_SetLogModuleLevel?
Flags: needinfo?(gbrown)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 220•11 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #208)
> http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/
> 847c492b44d08aacadb66bf471d329cea8d0cf09f0b743f5daf0beb7bf50a4c470139549fcec3
> 8c354f5a6ea2cf5a4c378423b1f636499d388ffd3e6129f58f7 is the logcat for the
> Android 4.0 run and it shows:
>
> 03-05 10:52:12.234 D/GeckoLoader( 2229): env20:
> NSPR_LOG_MODULES=nsSocketTransport:5/DocLoader:5/URILoader:5/
> HelperAppService:5
>
> I cannot verify that for Android 2.2, but I'm 99% sure it would be the same.
>
> I don't know prlog.c well, but notice there are 2 functions that access
> NSPR_LOG_MODULES and both sscanf for "," -- don't you need to change both
> _PR_InitLog and _PR_SetLogModuleLevel?
Yes indeed. I used to be able to write code... ;-)
https://tbpl.mozilla.org/?tree=Try&rev=54177c1d3d22
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 233•11 years ago
|
||
So here’s a useful log containing all of the NSPR log modules I selected:
https://tbpl.mozilla.org/php/getParsedLog.php?id=35773619&tree=Try&full=1#error0
Based on this log, we get to this point: <http://mxr.mozilla.org/mozilla-central/source/uriloader/base/nsURILoader.cpp#260> where we have the following line in the log:
2469704[4f82c080]: After dispatch, m_targetStreamListener: 0x5d2f4500, rv: 0x00000000
After which there is:
2469704[4f82c080]: After dispatch, m_targetStreamListener: 0x5d2f4500, rv: 0x00000000
2469704[4f82c080]: STS dispatch [5fe6d800]
3008312[4f82c880]: ...returned after 1 milliseconds
2469704[4f82c080]: DocLoader:5f8f4400: OnStopRequest[60c06c30](http://mochi.test:8888/tests/uriloader/exthandler/tests/mochitest/unsafeBidiFileName.sjs?name=%C3%A2%C2%80%C2%AA.test) status=804b0004 mIsLoadingDocument=true, 0 active URLs
3008312[4f82c880]: STS poll iter [1]
3008312[4f82c880]: active [0] { handler=5d9ca9c0 condition=0 pollflags=5 }
3008312[4f82c880]: calling PR_Poll [active=1 idle=0]
3008312[4f82c880]: poll timeout: none
3008312[4f82c880]: timeout = -1 milliseconds
2469704[4f82c080]: DocLoader:5f8f4400: Is now idle...
2469704[4f82c080]: STS dispatch [60d29a80]
3008312[4f82c880]: ...returned after 0 milliseconds
2469704[4f82c080]: STS dispatch [60d29a90]
3008312[4f82c880]: STS poll iter [0]
3008312[4f82c880]: active [0] { handler=5d9ca9c0 condition=0 pollflags=5 }
3008312[4f82c880]: calling PR_Poll [active=1 idle=0]
3008312[4f82c880]: poll timeout: none
3008312[4f82c880]: timeout = 0 milliseconds
3008312[4f82c880]: ...returned after 0 milliseconds
3008312[4f82c880]: STS poll iter [1]
3008312[4f82c880]: active [0] { handler=5d9ca9c0 condition=0 pollflags=5 }
3008312[4f82c880]: calling PR_Poll [active=1 idle=0]
3008312[4f82c880]: poll timeout: none
3008312[4f82c880]: timeout = -1 milliseconds
2469704[4f82c080]: STS dispatch [5f8b3080]
3008312[4f82c880]: ...returned after 361 milliseconds
2469704[4f82c080]: Error: diskFull, type=1, listener=0x0, rv=0x8052000A
3008312[4f82c880]: STS poll iter [1]
2469704[4f82c080]: path='/mnt/sdcard/Download/PXY+h9oc.exe.part'
Note the strange thing here that “OnStartRequest returning” on <http://mxr.mozilla.org/mozilla-central/source/uriloader/base/nsURILoader.cpp#270> never appears in the log, but note the diskFull error above! That comes from here: <http://mxr.mozilla.org/mozilla-central/source/uriloader/exthandler/nsExternalHelperAppService.cpp#1818> and it seems like we just call prompter->Alert, show an alert to the user who is not watching the device, and never return back to the event loop until we’re killed, which explains why the “OnStartRequest returning” message is never shown.
(When reading the log note that bug 949991 recently landed which touches this code and changes the log messages, but that patch was not in the tree from which this log was generated.)
We can of course mock the prompt service to cause it to never show an alert (and perhaps we should do it in the test harness to get it print something useful to the log), but what’s our story about protecting against the disk getting full on Android?
Flags: needinfo?(mcmanus)
Flags: needinfo?(gbrown)
Flags: needinfo?(blassey.bugs)
Summary: Intermittent uriloader/exthandler/tests/mochitest/test_unsafeBidiChars.xhtml | application timed out after 330 seconds with no output → Intermittent uriloader/exthandler/tests/mochitest/test_unsafeBidiChars.xhtml | application timed out after 330 seconds with no output because we run out of disk space
Comment 234•11 years ago
|
||
For the test devices, we try to have lots of free space, and clean up known tmp files between test runs.
The tests show the available space before and after each run. For your latest log:
before -- 'disk': ['/data: 221642752 total, 156811264 available', '/system: 250740736 total, 150310912 available', '/mnt/sdcard: 15911780352 total, 15841263616 available']
after -- 'disk': ['/data: 221642752 total, 156770304 available', '/system: 250740736 total, 150310912 available', '/mnt/sdcard: 15911780352 total, 15841214464 available']
So 15+ GB available on /mnt/sdcard. It seems unlikely that we are filling that up during the test -- maybe the diskFull error is wrong? Related to the file names in this test? Maybe the format of the sdcard on certain devices doesn't allow the characters used in the file names??
Flags: needinfo?(gbrown)
Assignee | ||
Comment 235•11 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #234)
> For the test devices, we try to have lots of free space, and clean up known
> tmp files between test runs.
>
> The tests show the available space before and after each run. For your
> latest log:
>
> before -- 'disk': ['/data: 221642752 total, 156811264 available', '/system:
> 250740736 total, 150310912 available', '/mnt/sdcard: 15911780352 total,
> 15841263616 available']
>
> after -- 'disk': ['/data: 221642752 total, 156770304 available', '/system:
> 250740736 total, 150310912 available', '/mnt/sdcard: 15911780352 total,
> 15841214464 available']
What are the mount points on this device? Is it possible that /mnt/sdcard/Download is mounted on a different volume?
> So 15+ GB available on /mnt/sdcard. It seems unlikely that we are filling
> that up during the test -- maybe the diskFull error is wrong? Related to the
> file names in this test? Maybe the format of the sdcard on certain devices
> doesn't allow the characters used in the file names??
This is the temp file path: /mnt/sdcard/Download/PXY+h9oc.exe.part
The only character out of the ordinary there is +, but I don't know any filesystem which blocks that. What filesystem do we use on these machines?
I'm preparing another try push with more logging to see where exactly this error is coming from.
Assignee | ||
Updated•11 years ago
|
Flags: needinfo?(gbrown)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 237•11 years ago
|
||
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 242•11 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness, emailapocalypse) from comment #233)
> We can of course mock the prompt service to cause it to never show an alert
> (and perhaps we should do it in the test harness to get it print something
> useful to the log), but what’s our story about protecting against the disk
> getting full on Android?
Are you talking in general or on our test machines? In general, we don't really have a story. We'd like to move our caches to the designated cache directory such that the OS can clear them away on disk pressure. Necko can't handle that yet though.
Are there any disk full scenarios you're particularly worried about?
Geoff would know better about the test machines.
Flags: needinfo?(blassey.bugs)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 244•11 years ago
|
||
(In reply to comment #242)
> (In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness,
> emailapocalypse) from comment #233)
> > We can of course mock the prompt service to cause it to never show an alert
> > (and perhaps we should do it in the test harness to get it print something
> > useful to the log), but whatâs our story about protecting against the disk
> > getting full on Android?
>
> Are you talking in general or on our test machines? In general, we don't really
> have a story. We'd like to move our caches to the designated cache directory
> such that the OS can clear them away on disk pressure. Necko can't handle that
> yet though.
>
> Are there any disk full scenarios you're particularly worried about?
No, just the test machines. We show a prompt when this happens, so there is no problem with the user facing behavior, it's just that there is nobody to dismiss that prompt when we're running tests!
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 250•11 years ago
|
||
(In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness, emailapocalypse) from comment #235)
> What are the mount points on this device? Is it possible that
> /mnt/sdcard/Download is mounted on a different volume?
> What filesystem do we use on these machines?
Here is the output from "mount" on a tegra, at the start of mochitest-8:
rootfs / rootfs ro,relatime 0 0
tmpfs /dev tmpfs rw,relatime,mode=755 0 0
devpts /dev/pts devpts rw,relatime,mode=600 0 0
proc /proc proc rw,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
tmpfs /mnt/asec tmpfs rw,relatime,mode=755,gid=1000 0 0
/dev/block/mtdblock3 /system yaffs2 ro,relatime 0 0
/dev/block/mtdblock6 /data yaffs2 rw,nosuid,nodev,relatime 0 0
/dev/block/mtdblock4 /cache yaffs2 rw,nosuid,nodev,relatime 0 0
/dev/block/mtdblock0 /misc yaffs2 rw,nosuid,nodev,relatime 0 0
/dev/block/vold/179:9 /mnt/sdcard vfat rw,dirsync,nosuid,nodev,noexec,relatime,uid=1000,gid=1015,fmask=0702,dmask=0702,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro 0 0
/dev/block/vold/179:9 /mnt/secure/asec vfat rw,dirsync,nosuid,nodev,noexec,relatime,uid=1000,gid=1015,fmask=0702,dmask=0702,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro 0 0
tmpfs /mnt/sdcard/.android_secure tmpfs ro,relatime,size=0k,mode=000 0 0
And here is the output from "df":
/dev: 451240K total, 12K used, 451228K available (block size 4096)
/mnt/asec: 451240K total, 0K used, 451240K available (block size 4096)
/system: 244864K total, 97948K used, 146916K available (block size 4096)
/data: 216832K total, 60040K used, 156792K available (block size 4096)
/cache: 32768K total, 1156K used, 31612K available (block size 4096)
/misc: 2048K total, 1156K used, 892K available (block size 4096)
/mnt/sdcard: 15550208K total, 98848K used, 15451360K available (block size 32768)
/mnt/secure/asec: 15550208K total, 98848K used, 15451360K available (block size 32768)
So it looks like /mnt/sdcard/Download/... is on /mnt/sdcard, which is vfat, and has 15+GB available.
I am trying to get the mount info for a tegra that fails the test, just in case that is any different.
Assignee | ||
Comment 251•11 years ago
|
||
Thanks, Geoff. Much appreciated.
So, on try, so far I've confirmed that it's the nsILocalFile::CreateUnique that is failing, like I suspected. I'm running another patch with some NSPR logging on try <https://tbpl.mozilla.org/?tree=Try&rev=98571633d685> to see what open() returns here, but the chances of something non-filesystem happening here are quickly diminishing...
Can you also do an ls -la /mnt/sdcard/Download on one of these tegras, preferably one that fails this test? And can you please see if you can create a file in that directory through adb shell?
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 253•11 years ago
|
||
Here is mount/df output from a tegra that failed this test. I do not see any significant difference (this device has an 8 GB sdcard, but I think that's just a coincidence).
rootfs / rootfs ro,relatime 0 0
tmpfs /dev tmpfs rw,relatime,mode=755 0 0
devpts /dev/pts devpts rw,relatime,mode=600 0 0
proc /proc proc rw,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
tmpfs /mnt/asec tmpfs rw,relatime,mode=755,gid=1000 0 0
/dev/block/mtdblock3 /system yaffs2 ro,relatime 0 0
/dev/block/mtdblock6 /data yaffs2 rw,nosuid,nodev,relatime 0 0
/dev/block/mtdblock4 /cache yaffs2 rw,nosuid,nodev,relatime 0 0
/dev/block/mtdblock0 /misc yaffs2 rw,nosuid,nodev,relatime 0 0
/dev/block/vold/179:9 /mnt/sdcard vfat rw,dirsync,nosuid,nodev,noexec,relatime,uid=1000,gid=1015,fmask=0702,dmask=0702,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro 0 0
/dev/block/vold/179:9 /mnt/secure/asec vfat rw,dirsync,nosuid,nodev,noexec,relatime,uid=1000,gid=1015,fmask=0702,dmask=0702,allow_utime=0020,codepage=cp437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro 0 0
tmpfs /mnt/sdcard/.android_secure tmpfs ro,relatime,size=0k,mode=000 0 0
/dev: 451240K total, 12K used, 451228K available (block size 4096)
/mnt/asec: 451240K total, 0K used, 451240K available (block size 4096)
/system: 244864K total, 97948K used, 146916K available (block size 4096)
/data: 216832K total, 63208K used, 153624K available (block size 4096)
/cache: 32768K total, 1156K used, 31612K available (block size 4096)
/misc: 2048K total, 1156K used, 892K available (block size 4096)
/mnt/sdcard: 7763120K total, 38052K used, 7725068K available (block size 4096)
/mnt/secure/asec: 7763120K total, 38052K used, 7725068K available (block size 4096)
I will try ls -la and creating a file (adb is probably not an option, but I should be able to exec something like date > /mnt/sdcard/Download/testfile).
Comment 254•11 years ago
|
||
CreateUnique is the thing that works 9999 times (per slave, if you fail to actually delete the file), and then fails obscurely, isn't it?
Comment 255•11 years ago
|
||
https://secure.pub.build.mozilla.org/buildapi/recent/tegra-199?numbuilds=1000 and put mochitest-8 in the search box, that tegra was fine until the 3rd, remains fine for other tests, but has failed every M8 since the first failure. https://secure.pub.build.mozilla.org/buildapi/recent/tegra-249?numbuilds=5000 is more of a go-getter, so 1000 jobs doesn't even get you back to where (assuming I'm right) it got its filename9999.whatever and became unable to CreateUnique and unable to run this test, back in January.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 257•11 years ago
|
||
(In reply to Phil Ringnalda (:philor) from comment #254)
> CreateUnique is the thing that works 9999 times (per slave, if you fail to
> actually delete the file), and then fails obscurely, isn't it?
Oh, you're talking about this limitation right? <http://dxr.mozilla.org/mozilla-central/source/xpcom/io/nsLocalFileCommon.cpp#49>
So I looked at my try pushes so far and as far as I can see, not two of them have run on the same slave yet :( But I don't think this is what's happening here since we start off with a randomly generated file name <http://mxr.mozilla.org/mozilla-central/source/uriloader/exthandler/nsExternalHelperAppService.cpp#1399>, and the only way for us to fall into the 9999 maximum file limitation is if we start with the same base name...
Comment 258•11 years ago
|
||
We are definitely accumulating files in /mnt/sdcard/Download. This is from a successful test:
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 5Yd4knB1.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 uaUftZeJ.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 GDZl3TIY.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 xeMgoFj4.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 HqvUl4WS.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 QWZDAnX5.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 bqC+BfNB.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 wexVUpAw.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 wXpGAieP.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 46VhywsV.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 PVKgP9Jy.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 oOsGKvBI.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 ADLDnVLR.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 5Tad9nbR.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 cA2Rm20a.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 jK+30ngu.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 66VHXDQs.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 aZHcWUjt.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 VEi6Hr_X.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 SRVKVy6e.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 S+QKLF2V.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 2_ahq+Yv.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 qbtJ93SF.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 YjuBOzox.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 4IJLvj9C.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 mrvNLdfa.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 xZ9v6ukn.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 CdxS+Vcy.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 iK56utt8.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 lar8+_cs.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 Nwqe6Ria.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 4ZNBM8JU.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 MDwQL5Lw.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 WPmShr4G.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 bToMOLpA.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 C+P4QANR.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 FWxTNRSo.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 BHA4HLiH.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 dkFO_2KS.exe.part
----rwxr-x system sdcard_rw 0 2013-02-11 19:40 0vFpcceK.exe.part
----rwxr-x system sdcard_rw 0 2013-02-14 14:16 Ue3LOeEL.exe.part
----rwxr-x system sdcard_rw 0 2013-02-14 14:16 tDhsWBMo.exe.part
----rwxr-x system sdcard_rw 0 2013-02-14 14:16 iVKe3U6e.exe.part
...
----rwxr-x system sdcard_rw 0 2014-03-07 22:42 h4uXU8Em.exe.part
----rwxr-x system sdcard_rw 0 2014-03-07 22:42 TY29rXn5.exe.part
----rwxr-x system sdcard_rw 0 2014-03-07 22:42 HyPWiseU.exe.part
----rwxr-x system sdcard_rw 0 2014-03-07 22:42 QdoQu2CX.exe.part
----rwxr-x system sdcard_rw 0 2014-03-07 22:42 OThUc2Ds.exe.part
----rwxr-x system sdcard_rw 0 2014-03-07 22:42 NEwdcJvN.exe.part
----rwxr-x system sdcard_rw 0 2014-03-07 22:42 wGpF++FZ.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 74jFyXPw.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 vwT5_40h.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 iITZdTbs.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 Rj+VdvnD.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 xGAdDZ6S.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 RyPtA2sg.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 J3uxGTdJ.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 05zyaJwi.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 3WrZgm6a.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 zPeg4+dX.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 EVLP5mVw.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 PaalZzdj.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 qi_zwRlR.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 AsWlXuFi.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 LJYj7Auw.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 kmI8Ij7o.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 yls8_z2Z.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 0orZMO5y.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 E72n_cUD.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 KmY3Vzk_.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 rg+S_zJB.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 Wi9rXls0.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 dXd5tqmB.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 lFY3K3LK.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 pJUiO75W.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 NY1sq2Hw.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 xsacOR3h.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 3rlhGmDs.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 t2oEUZMr.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 mbu1hfZK.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 AYDweYRA.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 WFKQnrld.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 UczXxlgS.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 eLIuIgJR.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 6odYEH56.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 aguE+C39.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 _Vs9PYfI.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 mHstyaK7.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 SqtMFsKy.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 00:29 MhgsAmEX.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 LJWvruQy.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 ippnJzbm.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 YYdNmGGD.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 Sv8GmsMQ.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 YJhJelD4.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 aux_9Q4R.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 _9E2hHqh.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 l2aSGxF0.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 WsSSlToF.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 JNTa1LW1.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 MlO7tljw.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 zK8yzu5+.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 12uHgD4X.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 csTJ084P.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 2lRDARDk.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 2eqbfM6j.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 OXmLzTvc.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 +UzCqCng.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 pogbKJju.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 JSDboIJt.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 XEBbCZdN.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 gaRwbCNS.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 +g6wrkqj.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 h1prNs_o.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 2XA0HglN.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 a1Zaq1k0.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 +9kW5GMn.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 2vZUwfVq.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 EbCy7qSm.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 3LWUTgw4.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 9b9z+V6A.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 hlmizQwi.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 DWr9ttHo.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 zq3Tw9G4.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 s+kZeRRy.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 yLCJYex6.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 FPSwMMiH.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 OwudR2ec.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 tM34uCCA.exe.part
----rwxr-x system sdcard_rw 0 2014-03-08 05:52 mRc_rmzI.exe.part
https://tbpl.mozilla.org/php/getParsedLog.php?id=35842180&tree=Try&full=1
We should probably cleanup /mnt/sdcard/Download in http://mxr.mozilla.org/build/source/tools/sut_tools/cleanup.py.
Flags: needinfo?(gbrown)
Comment 259•11 years ago
|
||
Here's a failed test run: https://tbpl.mozilla.org/php/getParsedLog.php?id=35843259&tree=Try&full=1. On this tegra, "ls -l /mnt/sdcard/Download" failed.
Assignee | ||
Comment 260•11 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #258)
> We are definitely accumulating files in /mnt/sdcard/Download. This is from a
> successful test:
Aha! That's 16,000 files right there. IIRC FAT32 only allows for 64k files per directory, so it's not surprising if at some point a slave will end up being unable to create files in the download folder.
> We should probably cleanup /mnt/sdcard/Download in
> http://mxr.mozilla.org/build/source/tools/sut_tools/cleanup.py.
Yes, agreed, except...
(In reply to Geoff Brown [:gbrown] from comment #259)
> Here's a failed test run:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=35843259&tree=Try&full=1.
> On this tegra, "ls -l /mnt/sdcard/Download" failed.
hmm, not sure what this means. If we can't ls the folder, how can we clean it up? I don't think we can safely delete and recreate it with the correct owner/permission unless if we have root access on these devices.
Assignee | ||
Comment 261•11 years ago
|
||
Also, FWIW, I verified that attempting to create the file results in ENOSPC. This is the code in the vfat driver which raises this error: <https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/fat/dir.c#n1292> (err is set to -ENOSPC on line 1288 before the loop.)
This confirms the hypothesis that we're running out of maximum entries per directory on the FAT32 filesystem.
Assignee | ||
Comment 262•11 years ago
|
||
(Not sure how to test this patch -- also, not sure if this is going to work for all slaves if we can't list the directory contents.)
Attachment #8388101 -
Flags: review?(gbrown)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 265•11 years ago
|
||
Comment on attachment 8388101 [details] [diff] [review]
Cleanup
Review of attachment 8388101 [details] [diff] [review]:
-----------------------------------------------------------------
This looks fine to me but let's run it by release engineering.
Attachment #8388101 -
Flags: review?(gbrown)
Attachment #8388101 -
Flags: review?(bugspam.Callek)
Attachment #8388101 -
Flags: feedback+
Comment 266•11 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #259)
> Here's a failed test run:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=35843259&tree=Try&full=1.
> On this tegra, "ls -l /mnt/sdcard/Download" failed.
tegra-250 -- fwiw, this does fail for me too (as does ls `/mnt/sdcard/Download`) however `exec ls /mnt/sdcard/Download` works just fine!
(In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness, emailapocalypse) from comment #260)
> (In reply to Geoff Brown [:gbrown] from comment #259)
> > Here's a failed test run:
> > https://tbpl.mozilla.org/php/getParsedLog.php?id=35843259&tree=Try&full=1.
> > On this tegra, "ls -l /mnt/sdcard/Download" failed.
>
> hmm, not sure what this means. If we can't ls the folder, how can we clean
> it up? I don't think we can safely delete and recreate it with the correct
> owner/permission unless if we have root access on these devices.
we do have root access, `execsu` with SUTAgent (you can also call many "shell" commands with root=True) e.g. http://mxr.mozilla.org/build/source/tools/sut_tools/mozdevice/devicemanagerSUT.py#304
Comment 267•11 years ago
|
||
Comment on attachment 8388101 [details] [diff] [review]
Cleanup
Review of attachment 8388101 [details] [diff] [review]:
-----------------------------------------------------------------
I'm a little nervous about some of our tegras dropping out of the pool due to this ls bug with the download folder per my previous comment, that said this is indeed a GOOD cleanup to have, and an improvement over today, so I'm happy to have this landed, and I'll deploy to foopies after it does land.
:ehsan feel free to land to hg.m.o/build/tools whenever you're ready, I'll take care of deploying this week (and can even do so sooner if pinged in #releng)
Attachment #8388101 -
Flags: review?(bugspam.Callek) → review+
Assignee | ||
Comment 268•11 years ago
|
||
Landed: https://hg.mozilla.org/build/tools/rev/59fa26f23794
Let's see if this patch manages to clean up some of these tegras. If not, I can find a RelEng bug to pull them off all of the affected ones and clean them up manually.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 272•11 years ago
|
||
Comment on attachment 8388101 [details] [diff] [review]
Cleanup
I'm hesitant to call this bug fixed until we see the benefits of it, however, it has been deployed
Attachment #8388101 -
Flags: checkin+
Comment 274•11 years ago
|
||
I landed a followup for this, since I never tested it as advised and it broke the whole pool.
http://hg.mozilla.org/build/tools/rev/938c20dcd0dc
Also since many tegras wouldn't work with the ls issue mentioned above:
Fixing those ls issues was me doing a rmdr on /mnt/sdcard/Download manually. We might want to change the patch to do that in the future, but for this I expect to be good.
Comment 275•11 years ago
|
||
According to orangefacter this looks fixed, :gbrown concur?
Flags: needinfo?(gbrown)
Comment 276•11 years ago
|
||
There is absolutely no way that orangefactor could tell you that it's fixed: the bug was that a set of tegras with 16K files in their Download directory could not successfully run mochitest-8, and the initial patch took every single one of those affected slaves out of action all day. Orangefactor telling you that it's fixed would consist of using OF to get the list of affected slaves, using https://secure.pub.build.mozilla.org/buildapi/recent/tegra-237 to see what they've done recently, and filtering on mochitest-8 to see what they've done in the way of this suite. None of them have yet completed a job since this morning, so none of them have had a chance to say whether they are fixed.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 278•11 years ago
|
||
(In reply to TBPL Robot from comment #277)
> philor
> https://tbpl.mozilla.org/php/getParsedLog.php?id=35919213&tree=Mozilla-
> Inbound
> Android 2.2 Armv6 Tegra mozilla-inbound opt test mochitest-8 on 2014-03-10
> 19:30:12
> revision: 38ecb36a4f52
> slave: tegra-136
>
> TEST-UNEXPECTED-FAIL |
> /tests/uriloader/exthandler/tests/mochitest/test_unsafeBidiChars.xhtml |
> application timed out after 330 seconds with no output
Uh-oh, is this after the fix?
Assignee | ||
Comment 279•11 years ago
|
||
Hmm, yeah. Can someone please see why tegra-136 is still affected?
Comment 280•11 years ago
|
||
this one had no /mnt/sdcard/Download -- Might have been leftover from me attempting a fix elsewhere :/
I just created it with exec mkdir /mnt/sdcard/Download
I'll look into ways to enforce/ensure that
Comment 281•11 years ago
|
||
No failures yet today is a bit more encouraging, though :)
Assignee | ||
Comment 282•11 years ago
|
||
Reverted the debugging stuff: https://hg.mozilla.org/integration/mozilla-inbound/rev/5ef71c50623d
I think we're done here. Thanks everyone!
Whiteboard: [leave open]
Comment 283•11 years ago
|
||
Whew!
There may still be a problem with some of the affected tegras not taking jobs / not mounting their sdcards...I understand :Callek is working on that.
Flags: needinfo?(gbrown)
Comment 284•11 years ago
|
||
This was merged.
https://hg.mozilla.org/mozilla-central/rev/5ef71c50623d
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
status-firefox28:
--- → fixed
status-firefox29:
--- → fixed
status-firefox30:
--- → fixed
status-firefox-esr24:
--- → unaffected
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•