Open Bug 1333245 Opened 8 years ago Updated 2 years ago

Javascripts can't finish loading since version 49.0

Categories

(Core :: DOM: Core & HTML, defect, P2)

49 Branch
x86_64
Windows
defect

Tracking

()

Tracking Status
firefox51 - wontfix
firefox52 - wontfix
firefox-esr52 --- wontfix
firefox53 + wontfix
firefox54 + wontfix
firefox55 + wontfix
firefox56 - fix-optional

People

(Reporter: songzhang2008, Unassigned)

References

Details

(Keywords: regression)

Attachments

(10 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36 Steps to reproduce: The problem is found since version 49.0 Our product has a index.jsp file which will be called after user login. There are over 200 Javascript files in this index.jsp loading. The loading won't finish since Firefox 49.0. Other browsers like IE, Edge, Chrome work normally. Firefox 48.0 works fine. Actual results: The Javascripts in the jsp file won't finish loading. Each time, the loading hangs at different Javascript file. Expected results: All of the Javascripts in the jsp file need to loaded during user login process.
Summary: javascripts can't finish loading since version 49.0 → Javascripts can't finish loading since version 49.0
Platform: Windows Server 2012 R2 and Windows10
Component: Untriaged → General
OS: Unspecified → Windows
Hardware: Unspecified → x86_64
Hi Reporter [songzhang2008], Loading Javascript works for me but I do not have that much [200] files. Could provide regression range using mozregression tool? Usage and installation info about Mozregression is attached. Additional information is also available here: http://mozilla.github.io/mozregression/ You could also attach test javascript files. Thanks
Flags: needinfo?(songzhang2008)
Component: General → JavaScript Engine
Product: Firefox → Core
Thank you, I will try the above tool. There are about 300 js files need to be loaded inside the jsp file, unfortunately, I can't post it here because of the company code. All of the js files are located in local disk. It is using the following <script type="text/javascript" src="js directory/jsname.js"></script>
js files are with ExtJs 4.2
It would be great to provide a testcase, even with fake/random js files.
Keywords: testcase-wanted
Run mozregression on Win2012R2 64bit The problem is found from relese49.0 (downloading from the following) https://ftp.mozilla.org/pub/firefox/releases/49.0/win64/en-US/ release 48.0 doesn't have the problem. However, run the mozregresssion with good=48 and bad=49, both are working well. $ mozregression --good=48 --bad=49 Any recommendation on the range of mozregression?
It is hard to provide the test case without the source code Our product has login page, after user enters valid username/password, the page will call index.jsp which contains over 300 js files loading. The main page will be displayed after all js files loading. It is found since FF49.0, the js files loading hangs and the user can't finish login. The hang place varies each time.
(In reply to songzhang2008 from comment #6) > Run mozregression on Win2012R2 64bit > The problem is found from relese49.0 (downloading from the following) > https://ftp.mozilla.org/pub/firefox/releases/49.0/win64/en-US/ > > release 48.0 doesn't have the problem. > > However, run the mozregresssion with good=48 and bad=49, both are working > well. > $ mozregression --good=48 --bad=49 > > Any recommendation on the range of mozregression? Enlarge the bad boundary, just run "mozregression --good=49", no need to specify the bad build.
$ mozregression --good=49 The result shows build 2017-01-25 is bad ============================== administrator@HPCASERVER ~ $ mozregression --good=49 ********** You should use a config file. Please use the --write-config command line flag to help you create one. ********** 0:01.69 INFO: bits option not specified, using 64-bit builds. 0:01.69 INFO: No 'bad' option specified, using 2017-01-25 0:01.69 INFO: Using date 2016-06-06 for release 49 0:09.16 INFO: Testing good and bad builds to ensure that they are really good and bad... 0:09.16 INFO: Downloading build from: https://archive.mozilla.org/pub/firefox/nightly/2016/06/2016-06-06-13-13-41-mozilla-central/firefox-50.0a1.en-US.win64.zi p ===== Downloaded 100% ===== 0:19.22 INFO: Running mozilla-central build for 2016-06-06 0:20.61 INFO: Launching c:\users\admini~1.hpc\appdata\local\temp\2\tmptazjgj\firefox\firefox.exe 0:20.63 INFO: Application command: c:\users\admini~1.hpc\appdata\local\temp\2\tmptazjgj\firefox\firefox.exe -profile c:\users\admini~1.hpc\appdata\local\temp\2 \tmpjyofxp.mozrunner 0:20.66 INFO: application_buildid: 20160606131341 0:20.66 INFO: application_changeset: 396b577b98e64fe95223cfce095f84fe0a67da01 0:20.66 INFO: application_name: Firefox 0:20.66 INFO: application_repository: https://hg.mozilla.org/mozilla-central 0:20.66 INFO: application_version: 50.0a1 Was this nightly build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry' or 'exit' and press Enter): good 0:48.73 INFO: Using local file: c:\users\admini~1.hpc\appdata\local\temp\2\tmpu9pdlx\2017-01-25--mozilla-central--firefox-54.0a1.en-US.win64.zip (downloaded in background) 0:48.73 INFO: Running mozilla-central build for 2017-01-25 0:50.13 INFO: Launching c:\users\admini~1.hpc\appdata\local\temp\2\tmpba96g2\firefox\firefox.exe 0:50.13 INFO: Application command: c:\users\admini~1.hpc\appdata\local\temp\2\tmpba96g2\firefox\firefox.exe -profile c:\users\admini~1.hpc\appdata\local\temp\2 \tmpjls97c.mozrunner 0:50.15 INFO: application_buildid: 20170125030214 0:50.16 INFO: application_changeset: 6dccae211ae5fec6a1c1244b878ce0b93860154f 0:50.16 INFO: application_name: Firefox 0:50.16 INFO: application_repository: https://hg.mozilla.org/mozilla-central 0:50.16 INFO: application_version: 54.0a1 Was this nightly build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry' or 'exit' and press Enter):
You have to make the run, for each build launched by mozreg, you make your test and type if it's bad or good. The console will give you a pushlog from mozilla-inbound repo.
Thanks, just realize it. The following is the last result: Was this inbound build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry', 'back' or 'exit' and press Enter): good 23:30.24 INFO: Narrowed inbound regression window from [559f31d8, d1722ce2] (3 revisions) to [4073f541, d1722ce2] (2 revisions) (~1 steps left) 23:30.24 INFO: Oh noes, no (more) inbound revisions :( 23:30.24 INFO: Last good revision: 4073f541e59556b7f9a8e0d7078165dd74050863 23:30.24 INFO: First bad revision: d1722ce2542a2dcc6d351da7481640aceb44f894 23:30.24 INFO: Pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=4073f541e59556b7f9a8e0d7078165dd74050863&tochange=d1722ce2542a2dcc6d351da7481640aceb44 f894 23:30.81 INFO: Looks like the following bug has the changes which introduced the regression: https://bugzilla.mozilla.org/show_bug.cgi?id=1279633
Thanks for the range. Nick, any idea about this issue caused by your patch?
Blocks: 1279633
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(songzhang2008) → needinfo?(nfitzgerald)
Version: 49 Branch → 50 Branch
Are we sure that this regression range is correct? I really don't see how that patch could have introduced such behavior.
Flags: needinfo?(nfitzgerald)
Also, we need a proper test case and STR to investigate this bug.
I will re-run mozregression to double confirm the regression changeset
The issue is found since FF release version 49. Version 48 is good. I guess the above result is incorrect because I was wrongly using $ mozregression --good=49 --> should be 48 Not sure why the test still shows some working and some not, based on which I got the above incorrect result. Try the following this time: $ mozregression --good=48 The result is as following: ============================= 14:55.77 INFO: Using local file: c:\users\admini~1.hpc\appdata\local\temp\2\tmp0duqgc\2016-09-12--mozilla-central--firefox-51.0a1.en-US.win64.zip (downloaded in background) 14:55.77 INFO: Running mozilla-central build for 2016-09-12 14:57.21 INFO: Launching c:\users\admini~1.hpc\appdata\local\temp\2\tmpyrwokh\firefox\firefox.exe 14:57.23 INFO: Application command: c:\users\admini~1.hpc\appdata\local\temp\2\tmpyrwokh\firefox\firefox.exe -profile c:\users\admini~1.hpc\appdata\local\temp\2 \tmpmnjmye.mozrunner 14:57.24 INFO: application_buildid: 20160912030421 14:57.24 INFO: application_changeset: cfdb7af3af2e92e95f71ca2f1672bf5433beeb89 14:57.26 INFO: application_name: Firefox 14:57.26 INFO: application_repository: https://hg.mozilla.org/mozilla-central 14:57.27 INFO: application_version: 51.0a1 Was this nightly build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry', 'back' or 'exit' and press Enter): bad 15:45.00 INFO: Narrowed nightly regression window from [2016-09-11, 2016-09-13] (2 days) to [2016-09-11, 2016-09-12] (1 days) (~0 steps left) 15:45.00 INFO: Got as far as we can go bisecting nightlies... 15:45.00 INFO: Last good revision: 7e873393cc11d326338779e5a3ed2da031e30936 (2016-09-11) 15:45.00 INFO: First bad revision: cfdb7af3af2e92e95f71ca2f1672bf5433beeb89 (2016-09-12) 15:45.00 INFO: Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=7e873393cc11d326338779e5a3ed2da031e30936&tochange=cfdb7af3af2e92e95f71ca2f1672bf5433beeb89 15:45.01 INFO: Switching bisection method to taskcluster 15:45.01 INFO: Getting mozilla-central builds between 7e873393cc11d326338779e5a3ed2da031e30936 and cfdb7af3af2e92e95f71ca2f1672bf5433beeb89 15:47.34 INFO: Oh noes, no (more) inbound revisions :( 15:47.34 INFO: Last good revision: 7e873393cc11d326338779e5a3ed2da031e30936 15:47.34 INFO: First bad revision: cfdb7af3af2e92e95f71ca2f1672bf5433beeb89 15:47.34 INFO: Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=7e873393cc11d326338779e5a3ed2da031e30936&tochange=cfdb7af3af2e92e95f71ca2f1672bf5433beeb89 15:48.34 INFO: ************* Switching to fx-team by process of elimination (no branch detected in commit message) 15:48.34 INFO: Getting fx-team builds between 8d2a514d0f23060d95b4f705c98ed37baff46422 and cfdb7af3af2e92e95f71ca2f1672bf5433beeb89 15:50.55 INFO: Oh noes, no (more) inbound revisions :( 15:50.55 INFO: Last good revision: 8d2a514d0f23060d95b4f705c98ed37baff46422 15:50.55 INFO: First bad revision: cfdb7af3af2e92e95f71ca2f1672bf5433beeb89 15:50.55 INFO: Pushlog: https://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=8d2a514d0f23060d95b4f705c98ed37baff46422&tochange=cfdb7af3af2e92e95f71ca2f1672bf5433beeb89
Attached file full log of mozregression (deleted) —
The attached is the full log output of the mozregression: $ mozregression --good=48
The thing confuses me is that when I run the same mozregression command in the same environment again, I got different results. For example, 2016-09-11 build passed test on the above test. But it fails when i run the command 2nd time.
No longer blocks: 1279633
That's why we need testcase.
Attached file log output for 2nd run mozregression (deleted) —
The 2nd run of mozregression in the same environment shows different results.
If you're getting inconsistent results, you might try re-testing the same build a few times before answering good or bad. You can use the 'r' option to retry on the same build. Thanks for your patience and continued testing!
Thank you for the recommendations! The issue must between formal release 48 and 49. We have tried several times of version 48 on different machines and all of them working. It is hard to provide the test case. The index.jsp contains about 300 js files which are based on ExtJS. Some of the js files will initiate ajax call to backend java server. All of them are tightly bound.
Tries more with "retry". Hope it helps. Please provide more recommendations if this still not make sense. The followings are found: 1. Some build fails at first, but works with 1 retry 2. Some build only works after several times of retry 3. From 2016-06-23, more and more retries before the test working 4. Following mozregression instruction, 20160622050301 begins the unstable. The UI goes a little bit further than those fully fail, but still fails without showing the first page after login. 20160622044701 works most of time. One time failure found and has the same UI result as 20160622050301 5. Is 2016-04-25 the build date for official release version 48? This always works without any retry. ==================== 72:06.33 INFO: application_buildid: 20160622050301 72:06.33 INFO: application_changeset: 40969d155419f72fa31e5b0e24158eb1fc10238c 72:06.33 INFO: application_name: Firefox 72:06.33 INFO: application_repository: https://hg.mozilla.org/mozilla-central 72:06.33 INFO: application_version: 50.0a1 Was this inbound build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry', 'back' or 'exit' and press Enter): retry 72:48.06 INFO: Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4e17dca08962df106065f09026954efb7e249639&tochange=c9edfe35619f69f7785776ebd19a3140684024dc 72:48.06 INFO: Running mozilla-central build built on 2016-06-22 13:45:05.715000, revision 40969d15 72:49.30 INFO: Launching c:\users\admini~1.hpc\appdata\local\temp\2\tmp0rlcem\firefox\firefox.exe 72:49.31 INFO: Application command: c:\users\admini~1.hpc\appdata\local\temp\2\tmp0rlcem\firefox\firefox.exe -profile c:\users\admini~1.hpc\appdata\local\temp\2 \tmphbtwz0.mozrunner 72:49.33 INFO: application_buildid: 20160622050301 72:49.33 INFO: application_changeset: 40969d155419f72fa31e5b0e24158eb1fc10238c 72:49.33 INFO: application_name: Firefox 72:49.33 INFO: application_repository: https://hg.mozilla.org/mozilla-central 72:49.34 INFO: application_version: 50.0a1 Was this inbound build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry', 'back' or 'exit' and press Enter): bad 73:24.44 INFO: Narrowed inbound regression window from [4e17dca0, c9edfe35] (3 revisions) to [4e17dca0, 40969d15] (2 revisions) (~1 steps left) 73:24.44 INFO: Oh noes, no (more) inbound revisions :( 73:24.46 INFO: Last good revision: 4e17dca08962df106065f09026954efb7e249639 73:24.46 INFO: First bad revision: 40969d155419f72fa31e5b0e24158eb1fc10238c 73:24.46 INFO: Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4e17dca08962df106065f09026954efb7e249639&tochange=40969d155419f72fa31e5b0e24158eb1fc10238c 73:25.19 INFO: Looks like the following bug has the changes which introduced the regression: https://bugzilla.mozilla.org/show_bug.cgi?id=1280476
IMHO, all these regression ranges are wrong because the issue is intermittent and not easy to reproduce. Honestly, without testcase, it will be hard to help and debug.
The issue is not intermittent although not consistent in some build. The test result in version 48 is always good and version 49 always bad. Test case is hard unless install the whole product.(tightly bound) We may need some more time/tests to narrow down the problematic build. Can we download each nightly build installer (instead of mozregression) so that more detailed tests can be run. What is the build number for version 48 and version 49? Thank you!
If you need to launch a build, you can use "mozregression --launch=YYYY-MM-DD". Nightlies are available on the FTP (installer and zipped standalone versions): http://ftp.mozilla.org/pub/firefox/nightly/(each day has a repo mozilla-central) Release dates are hardcoded in the file releases.py: https://github.com/mozilla/mozregression/blob/1ae1995e67d1bb641bfb214a17448b4d3c3f22f0/mozregression/releases.py
When I download the nightly build installer form the above FTP site, I found "aurora/autora-l10n/beta/central/centra-l10n" etc for each date. Can I only use central-l10n from different date for this test?
Sorry, didn't see the above note. I only need to use the mozilla-central under each day for test?
You can use the repo mozilla-central for each date, it's the default repo for Nightly (in en-US by default). mozilla-aurora is the repo for the Developper Edition (currently FF53). l10n is about localization.
We have some further tests using the windows-64 installer under repo mozilla-central for each date (nightly build). Each installer is tried with several times because of the inconsistency. It looks it is 2016-05-24-07-37-14-mozilla-central/, the nightly build begins to show unstable. The tests sometime fail and sometime good. The nightly build before it are always good. Could you please help check if this has some suspicious changeset, thank you!
From your tests: m-c 2016-05-23=16663eb3dcfa759f25b5e27b101bc79270c156f2 2016-05-24=829d3be6ba648b838ee1953fdfa1a477dace752f https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=16663eb3dcfa759f25b5e27b101bc79270c156f2&tochange=829d3be6ba648b838ee1953fdfa1a477dace752f The changelog is still huge. As you have the last good build and the first bad build, could you run: mozregression --repo=mozilla-inbound --good=2016-05-23 --bad=2016-05-24
Our test result shows last good build is 2016-05-24-03-02-27-mozilla-central/ The first bad build is 2016-05-24-07-37-14-mozilla-central/ Are there still hug changeset between them?
Version: 50 Branch → 49 Branch
Both 2016-05-24-03-02-27-mozilla-central and 2016-05-24-07-37-14-mozilla-central/ have the following from console: 09:02:50.052 Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user’s experience. For more help http://xhr.spec.whatwg.org/ ext-all.js:18:71881 09:02:50.715 Using //@ to indicate sourceURL pragmas is deprecated. Use //# instead[Learn More] ext-all.js:784 09:02:51.472 Using //@ to indicate sourceURL pragmas is deprecated. Use //# instead[Learn More] ext-all.js:110 09:02:51.537 Using //@ to indicate sourceURL pragmas is deprecated. Use //# instead[Learn More] ext-all.js:126
Yes, bug 1267989 looks a good culprit.
Blocks: 1267989
Component: JavaScript Engine → DOM
So.. There was one known situation in which the fix for bug 1267989 could cause those symptoms. See bug 1281366. So coming down to a regression range that includes bug 1267989 on its own doesn't really say much, since there was a month or so between it landing and bug 1281366 being fixed when that sort of behavior could be expected. So the first question: did the behavior being discussed here change observably at all across bug 1281366 being fixed? Second question: I know you have said that it's not possible to provide a testcase. If you want, we can try to debug whatever is going on via me creating test builds with extra logging and you trying them and reporting back on what's logged. I would be assuming that you can see output to stderr (e.g. on Windows by starting Firefox with the text console visible). If we want to go this route, what operating system do you want builds for? And is the problem reproducible on current nightly (so I can basically base these builds on tip), or do we need builds off a particular branch here?
Flags: needinfo?(bzbarsky) → needinfo?(songzhang2008)
Thanks! We are using windows2012R2 win64 Which nightly build is 1281366 fix firstly included? Our test fails with official release 49 and most of the build after 49 We found some build after 49 occasionally works. This is found when we firstly uses mozregression tool. Our current tests are not using mozregression tool. We download the nightly build installer and install on local machine to teste I downloaded latest nightly from http://ftp.mozilla.org/pub/firefox/nightly/2017/02/2017-02-01-05-27-15-mozilla-central/ just now and it is not working We can collect debug info with test build. Please help instruct the steps
> Which nightly build is 1281366 fix firstly included? 2016-06-24, I would think. > I downloaded latest nightly from http://ftp.mozilla.org/pub/firefox/nightly/2017/02/2017-02-01-05-27-15-mozilla-central/ just now and it is not working OK, thanks. I'll create a build that does some logging to stderr and post the link there. If you can reproduce the problem in that build, save the log, and attach it to this bug using https://bugzilla.mozilla.org/attachment.cgi?bugid=1333245&action=enter then we'll see where to go from there.
Could you tell me how to use the test binary? unzip the zip file and run the firefox.exe from windows cmd?
I got a Mizilla Crash Reporter box popup either run the firefox.exe from cmd or double click the firefox.exe in the unzipped dir. Both for win32 and win64 zip files
Er, yes. More builds coming up that won't have that problem...
I use win64 generally. Thanks!
Attached image test1 startup (deleted) —
Testing https://archive.mozilla.org/pub/firefox/try-builds/bzbarsky@mozilla.com-78b6040b0aa81a48c152900335efae864e1283cd/try-win64/firefox-54.0a1.en-US.win64.zip When double click firefox.exe, the attached Test1.jpg shows the "Nightly Safe Mode" popup windows. Should I continue with clicking one of the buttons?
Attached file Test1 console output (deleted) —
When running firefox.exe from windows cmd, the command returned immediately and firefox startup. The Test1.txt shows the console output
You need to quit your existing Firefox, presumably, if you have one running already. Or pass "-P someotherprofilename" to the Firefox you're starting from the command line. Otherwise it will just find the already-running instance and hand off to it.
Strange, I don't have any other FF running when I tested the test build. Should I uninstall current FF (not running) first?
No, if you didn't have anything else running things should be fine. Again, you need to actually get the stderr from Firefox, not the web console. I _think_ "firefox.exe -console" will do the right thing, but it's been a long time since I've had to deal with this on Windows...
Alternately, "firefox.exe -attach-console", or maybe even "firefox.exe -attach-console > log.txt 2>&1" to get it into a file, assuming that still works on modern Windows.
Attached file stderr output from test build (deleted) —
The attached is the output from the command: c:\FFTest\firefox>firefox.exe -attach-console > log.txt 2>&1 c:\FFTest\firefox>[GPU 8956] WARNING: pipe error: 109: file c:/builds/moz2_slave/try-w64-0000000000000000000000/build/src/ipc/chromium/src/chrome/common/ipc_cha nnel_win.cc, line 346 [GPU 8956] WARNING: pipe error: 109: file c:/builds/moz2_slave/try-w64-0000000000000000000000/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
Hmm. Looks like the _content_ process didn't get redirected to the same console. :( Would you mind redoing that, but after either turning off e10s? Or I guess just loading the page in question in a non-e10s window?
Please tell me who to turn off e10s or steps in a non-e10s window. Still using the same command? Thanks!
Don't know if the following steps are correct 1. Toos->Options->General->Startup, uncheck "Enable multi-process Nightly" 2. Restart FF as required 3. Go to about:config and found browser.tabs.remote.autostart.2 "status" is changed to "user set", value is "false" but this preference value is false by default, so not sure if the above is correct
The simplest way to test in a non-e10s window is like this: 1) Start Firefox with the -attach-console stuff and logging to log.txt. 2) Right-click on the one visible tab you have. 3) Select "Open in new non-e10s window" from the context menu. 4) Load the page in the resulting window.
Attached file test log with turning off e10s (deleted) —
Attach the log output with turning off e10s window
Comment on attachment 8833011 [details] test log with turning off e10s I assume that at some point during this browsing session you reproduced the bug? What was the URL of the page demonstrating it (so I know which part of the log is relevant)? Do you happen to know specific scripts that did not run that you expected to run?
Sorry to reply late. With current log, it is stuck almost the same place each time. However, if I comment out the js file in the index.jsp (the one FF is loading), it is stuck at a new place.
I see two loads of "index.jsp" in that log. Is the relevant one the first one or the second one?
And is it expected that some of the _scripts_ are being loaded from index.jsp too? Anyway, I am going to do another push with a bit more logging added, because there are some cases that the logging isn't as clear on as I'd like....
Attached file log file with the recent test build (deleted) —
Attach log.txt collected with the above test build. Thanks!
> Attach log.txt collected with the above test build. Thank you. Do you happen to know which file failed to run for you in that case? What I see in the log is 257 script elements going through the script loader on http://localhost:8080/HPCA-Admin/index.jsp and all 257 are being executed...
Most of time, it is stuck at http://localhost:8080/HPCA-Admin/getRetentionPolicies.ajax which is in the <script type="text/javascript" src="js/autonomy/aca/admin/data/DataStores.js"></script> The app log shows that the ajax is returned with the expected result. However, if I comment it (Datastores.js) out in index.jsp, it will stop at other js files.
I'm not sure I quite follow... The log shows three separate loads of documents with URL http://localhost:8080/HPCA-Admin/getRetentionPolicies.ajax?something, but none of them do any script loading. It also shows DataStores.js being run. Maybe my question was unclear. What I am asking is this: on the specific run that produced the log, which scripts did you observe not loading? I'm trying to figure out whether the problem is that we have pending script loads we never run, or whether the problem is that the loads never make it to the pending state, or something else. All this without having any idea of what your actual page looks like and what the expected behavior (e.g. in terms of number of scripts that should load) is....
Marking this fix-optional for 52, though we can still take a patch in the next week or so.
I didn't find any js file contained in the index.jsp not loading from the firefox log file. However, compare FF and Chrome (and other browsers) app log (app log is our internal backend log which can show the ajax request from browser), I can see no more request after http://localhost:8080/HPCA-Admin/getRetentionPolicies.ajax when using FF (the getRetentionPolicies.ajax is successful). When using Chrome for example, there are many other ajax call after getRetentionPolicies.ajax.
OK. It's really hard to say anything here without knowing what the scripts involved (the ones you expect to be making those XHR requests) look like or are trying to do... Is the problem that some <script> tags are not ending up executing? That's what I _thought_ you were saying, but it sounds like that's not the case? Is the problem that some XHR requests are never kicked off, though they should be? Or that they never hit the server? Comment 67 seems to be saying that, right?
Yes, some of the XHR never hit the server compared to the log with other browsers. I doubt if those XHR (not hit the server) are kicked off from the browser because both the browser and the Java server are running on the same machine under Tomcat. I am not sure if it is related to the JS scripts that contains the XHR or because there are too many JS script with XHR loading. Another application under the same Tomcat but with less JS scripts (not the same scripts) is working with FF. I can email you the index.jsp if you would like (with your email address).
I doubt it's a matter of "too many scripts" per se. The issue presumably has to do with the exact timing of things. If you can email me a URL I can load to reproduce the problem, along with enough information to tell whether I'm seeing the problem or not, that would be extremely helpful. Just the index.jsp file, without the relevant scripts and server bits is not likely to help too much. I can also add logging for XHR stuff, but I'd really like to have a clear explanation of what the problem actually is to guide that logging, assuming we have that, because I obviously already misunderstood it once as "scripts are not loading and running", whereas it seems like they are.... Thank you for your patience and help with this; I know this must be quite frustrating.
Thanks a lot for your help. Because we can't abstract the issue without the env. What about an online meeting with screen sharing?
That might be worth doing just so it's clear what the problem is. I'm in US/Eastern time and I could do this sometime during work hours tomorrow. Failing that it would have to be next week.
Thank you! Let me arrange the online meeting. It would be next week and could you leave me your email address so that I will send the meeting invitation.
OK, so per discussion, here are a few more builds to try: 1) A build from right before bug 1267989 was checked in. This is mostly to check that it does in fact work so we're not chasing red herrings. It doesn't do any logging. https://archive.mozilla.org/pub/firefox/try-builds/bzbarsky@mozilla.com-96fdf1ed8135cd19370ac748547f0e074bcf2a53/try-win64/firefox-49.0a1.en-US.win64.zip 2) A build from right after bug 1267989 was checked in, with the patch from bug 1281366 applied. This is to check whether that build shows the problem. Again, no logging. https://archive.mozilla.org/pub/firefox/try-builds/bzbarsky@mozilla.com-c2f3a382a2a568c0dbe7d25c8f21f87d4ac67049/try-win64/firefox-49.0a1.en-US.win64.zip 3) A build from tip that logs all the <script> executions and XHR send() calls it's doing. A log from this would be appreciated. https://archive.mozilla.org/pub/firefox/try-builds/bzbarsky@mozilla.com-64bbf381d554c4fb1852cee45d1a8caf23d9e276/try-win64/firefox-54.0a1.en-US.win64.zip
Test1: successful Test2: fail Test3: refer to attached logTest3.txt Test4: refer to attached logTest4.txt
Attached file logTest3.txt, test3 log (deleted) —
Test3 log output
Attached file logTest4.txt, Test4 log (deleted) —
Test4 log output
Test3 is made with non-e10s window However, test4 doesn't have the "Open in new non-e10s window" option
If we come up with a fix for this, we could consider uplift, but for now I'm marking this fix-optional for 53, since 53 is about to go into beta.
Flags: needinfo?(bzbarsky)
Hi, Please let me know if I can do anything further on this issue. Could you please help update the current status? Thanks a lot!
Flags: needinfo?(songzhang2008)
Sorry, I was pretty swamped with a few things, but looking at this again now. Thank you for the logs. The "test4" build only has non-e10s windows, which is why it doesn't have that option. So the most interesting thing in the logs is that the non-working (test3) build has the following sync XHRs happening: Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/mailServices/imap/IMAPSettings.js?_dc=1488214949346 Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/retentionPolicies/RetentionPolicyGeneral.js?_dc=1488214949397 while the working (test4) build has the following: Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/mailServices/imap/IMAPSettings.js?_dc=1488215343173 Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/retentionPolicies/RetentionPolicyGeneral.js?_dc=1488215343223 Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/ext/ux/grid/filter/DateTimeFilter.js?_dc=1488215343423 Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/usersAndGroups/groups/GroupSearchGeneral.js?_dc=1488215344016 Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/usersAndGroups/groups/GroupGeneral.js?_dc=1488215344023 Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/usersAndGroups/ldapSettings/LdapSettingsGeneral.js?_dc=1488215344081 Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/usersAndGroups/ldapMappings/LdapGroupMappingsOptions.js?_dc=1488215344087 so in particular the XHR for "DateTimeFilter.js" never happened (nor did the other ones after it). Are you able to check which script is expected to make that XHR request? Trying to figure out whether that script ended up executing or not...
Flags: needinfo?(bzbarsky) → needinfo?(songzhang2008)
And in particular, both logs show the same set of scripts under "js/ext/" running. Test4 shows 253 non-js/ext scripts running, while test3 shows only 183, which is presumably because some of those XHRs never happened. In addition to the sync XHRs I list above, there are some async ones that are missing from test3 as well. Starting with an async XHR to "http://localhost:8080/HPCA-Admin/getRetentionPolicies.ajax?_dc=1488215344293&page=1&start=0&limit=25" (one of three such to getRetentionPolicies.ajax)...
Mark 54 as fix-optional but still happy to have the fix in 54.
Boris: As to your question, I searched the whole application directory and can't find explicitly calling DateTimeFilter.js in our code. I suspect it may be called somewhere by the ExtJS. It is from js/ext/ux/grid/filter/DateTimeFilter.js. Our application uses ExtJS. All ExtJS code are put under /js/ext/. Thanks,
Flags: needinfo?(songzhang2008)
OK. I'll look into what extjs is doing; hopefully the version on their site matches yours. What about js/autonomy/aca/admin/usersAndGroups/groups/GroupSearchGeneral.js? Is that loaded by one of your scripts, and if so which one?
Flags: needinfo?(songzhang2008)
Hmm. Doesn't look like extjs is exactly available for download. Do you mind grepping for DateTimeFilter.js under js/ext to see what files are loading it?
We use ExtJS4.2.2 GroupSearchGeneral.js may be required by GroupSearchPanel.js which is called by js/autonomy/aca/admin/usersAndGroups/groups/GroupPanel.js js/autonomy/aca/admin/usersAndGroups/groups/GroupPanel.js is in the index.jsp where the over 200 js files are loaded
Flags: needinfo?(songzhang2008)
Ext.define('Ext.ux.grid.filter.DateTimeFilter', { extend: 'Ext.ux.grid.filter.DateFilter', alias: 'gridfilter.datetime',
js/ext/ux/grid/FiltersFeature.js
and js/ext/ux/grid/FiltersFeature.js is called in autonomy\aca\shared\utils\i18n-ux.js which is also loaded from index.jsp
Priority: -- → P2
Thank you. OK, both logs show FiltersFeature.js. Did some more digging, and here are a few more interesting things: 1) In the "not working" log, the last line of output that is related to the site is: Executing script: http://localhost:8080/HPCA-Admin/js/ext/ux/grid/filter/DateTimeFilter.js?_dc=1488214949544 In the "working" log, there is never an "Executing script" line for that URL. Instead there is: Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/ext/ux/grid/filter/DateTimeFilter.js?_dc=1488215343423 I'm not sure what Ext does with the sync XHR result. I would have thought it sticks it in an inline script, but I don't see an inline script execution immediately after that sync XHR log line in the "working" log. Maybe that's because on 48 we have precisely the bug we fixed where we keep running other scripts during a sync XHR. Anyway, the difference here is that in the "working" case there's a sync load of DateTimeFilter.js, but in the "not working" case it's an async load... but in either case DateTimeFilter.js gets run, presumably. 2) Some of the URLs in the two logs have different versions. For example, the "working" log shows a script running from http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/mailServices/exchange/ExchangeRules.js?_dc=1488215343142 but the "non-working" log shows a script from <http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/mailServices/exchange/ExchangeRules.js?_dc=1488214949309>. Maybe this is just because this is a live system and a deploy happened in between the two logs being collected... 3) FiltersFeature.js gets executed in both cases, as does i18n-ux.js; both via script execution, not sync XHR. So presumably the DateTimeFilter load succeeded in both cases, even though it was sync in one case and async in the other. 4) The working log shows execution of GroupPanel.js and GroupSearchPanel.js and a sync XHR for GroupSearchGeneral.js, but shows _nothing_ for index.jsp, not even an XHR. The "non-working" log shows index.jsp executing (twice, actually!), but nothing for GroupPanel.js, GroupSearchPanel.js, or GroupSearchGeneral.js. This part is not making sense to me so far. 5) The non-working log doesn't show any inline script execution at all, even though it has two sync XHRs: one for IMAPSettings.js and one for RetentionPolicyGeneral.js. The non-working log also shows a script execution for RetentionPolicyGeneral.js, but for a different version from the one the sync XHR was for: "RetentionPolicyGeneral.js?_dc=1488214949397" for the sync XHR, but "RetentionPolicyGeneral.js?_dc=1488214948608" for the script execution... The "working" log also shows two things for RetentionPolicyGeneral.js (with different versions from each other), one sync XHR and one script execution, but with different versions from the non-working log. So some questions: A) Do you happen to know whether your version of Ext uses inline script to execute the results of its sync XHRs? B) If you examine the page in devtools, is there any mention of numPendingFiles in the scripts? If so, what does its value look like in the "not working" case once network activity stops? C) Any idea why DateTimeFilter.js is being sync-loaded by the page in one case but not the other? D) Which things are depending on ImapSettings.js? Is it possible that the key difference is that ImapSettings.js doesn't get executed?
Flags: needinfo?(songzhang2008)
Sorry, I can't answer the questions because of my limited knowledge of ExtJS internally. B) there is no numPendingFiles mentioned as I remember. I am also not sure if our current direction is correct or not. The js file that stop loading may be random although most of time we can see it is around Doing sync XHR to: http://localhost:8080/HPCA-Admin/js/autonomy/aca/admin/retentionPolicies/RetentionPolicyGeneral.js?_dc=1488214949397
Flags: needinfo?(songzhang2008)
> I am also not sure if our current direction is correct or not. Right, I'm not either. Having no way to reproduce this myself makes this pretty difficult to debug. :(
Understand this. If you need webex sharing, let me know.
What would help most is some way that I could run a built directly against the site involved, and debug that build... I understand that can be difficult to set up. :(
we can set up a Webex and run a debug build against the application.
It's not just running a debug build; you'd have to do the build yourself (so you have source and symbols around), run it under a debugger (ideally on Linux so we can use rr as the debugger), be willing to recompile and run again as needed (though I'd hope that rr would make this a nonissue)... This is probably doable with a large enough time investment from both me and you. Especially if you _are_ willing to record a failing session in rr on Linux and then either give me access to the relevant Linux machine so I can run rr replay myself or let me drive it through webex.
Our support engineer will provide public-facing installation site for debug. Could you tell us the public IP range to open for you?
You mean the IP range my incoming connection will be coming from? This week I'm traveling, so I have no idea what my IPs will be. Next week I should probably be coming from 108.7.77.*; I'm not sure how my ISP assigns IPs exactly, so not sure whether I could end up outside that range.
Hi, Boris: Please let me know when you are ready to test against our environment so that I can arrange. Thanks!
Let's aim for me starting on Monday morning? That gives me time to clear up schedule space for this and gives you time to set things up.
Does 10:30 AM Eastern time Monday morning work? If so, I’ll prepare a Webex and bridge.
Can we reach you via bzbarsky@mit.edu? Thanks!
Yes, that works for me. Thank you! And yes, that address is good to use.
Hi, Boris: Please let us know if you need any more info from us, or if you need to access the site again? Thanks!
Will do. Still going through the recordings I made so far.
Long standing issue, no clear owner for the bug. Platform triage has looked at it several times. Let's call this fix-optional for 55 and 56. We could still take a patch for 56 or 55 though.
General update: I've been working on and off on figuring out what's going on here, using the rr recordings I made in June, but haven't figured out what's going on yet. I'm going to be away from the machine rr is on for the next three weeks, but when I get back I'd really like to set up another debugging session, if possible, and try to gather a bit more data...
Hi, Boris: Is there any update on this issue? Please let us know if we can help. Thanks,
Will do. Still swamped with a few other things. :(
Hi, Boris: Could you please help update the ETA of the issue? Anything we can help, please let us know. Thanks,
Realistically, what I'm going to need to make progress here is access to a way to reproduce that I can then debug at length. That means both time on my side, which has been very hard to come by, and coordinating with you to have the reproducible testcase available during that time. The fundamental problem is that I keep having to deal with emergencies of various sorts, which makes it very hard to keep a chunk of time free with enough advance notice that we could set up temporary access like last time. That's because I can't actually predict when I will be free to work on this more than about a day ahead at best. This is complicated by the fact that other things have been higher priorities, especially officially, so I more or less have to work on this on my own time. Given that, what are our options here? I understand that you don't want to leave access open indefinitely. On the other hand, I can't make headway with access limited to an hour or two, especially if I need to know far ahead of time when those two hours would be. Would you be OK with a 24-hour access window? If so, we could pick a day and I would endeavor to keep it as clear as I can so I can fit in as much debugging as possible during that window... I'm really sorry about the complications here, and appreciate you not just giving up on this!
Component: DOM → DOM: Core & HTML
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: