Closed Bug 1381846 Opened 7 years ago Closed 7 years ago

Talos tpaint fails locally when using --geckoProfile

Categories

(Testing :: Talos, defect)

55 Branch
defect
Not set
normal

Tracking

(firefox59 fixed)

RESOLVED FIXED
mozilla59
Tracking Status
firefox59 --- fixed

People

(Reporter: igoldan, Assigned: rwood)

References

Details

(Whiteboard: [PI:January])

Attachments

(1 file)

When running | mach talos-test -a tpaint --geckoProfile | on my local machine, I get this error, which crashes the perf test. ... 16:42:23 INFO - starting webserver on 'localhost:54554' 16:42:23 INFO - SUITE-START | Running 1 tests 16:42:23 INFO - TEST-START | tpaint 16:42:23 INFO - Initialising browser for tpaint test... 16:42:23 INFO - Application command: e:\workspace\mozilla-source\mozilla-central\objdir-frontend\dist\bin\firefox --no-remote http://localhost:54554/getInfo.html -profile c:\users\ionut~1.gol\appdata\local\temp\tmpbostwg\profile 16:42:23 INFO - TEST-INFO | started process 18852 (e:\workspace\mozilla-source\mozilla-central\objdir-frontend\dist\bin\firefox --no-remote http://localhost:54554/getInfo.html) 16:42:27 INFO - TEST-INFO | 18852: exit 0 16:42:27 INFO - Clearing archive e:/workspace/mozilla-source/mozilla-central\testing\mozharness\build\blobber_upload_dir\profile_tpaint.zip 16:42:27 INFO - Activating Gecko Profiling. Temp. profile dir: c:\users\ionut~1.gol\appdata\local\temp\tmpyewdjg, interval: 1, entries: 2000000 16:42:27 INFO - Browser initialized. 16:42:27 INFO - Running cycle 1/1 for tpaint test... 16:42:27 INFO - TEST-INFO | started process 14056 (e:\workspace\mozilla-source\mozilla-central\objdir-frontend\dist\bin\firefox --no-remote -profile c:\users\ionut~1.gol\appdata\local\temp\tmpbostwg\profile -tpprofilinginfo {"gecko_profile_entries": 2000000, "gecko_profile_threads": "GeckoMain,Compositor", "gecko_profile_dir": "c:\\users\\ionut~1.gol\\appdata\\local\\temp\\tmpyewdjg", "gecko_profile_interval": 1} -tp file:\e:\workspace\mozilla-source\mozilla-central\testing\talos\talos\tests\tpaint\tpaint.manifest.develop -tpchrome -tpmozafterpaint -tpnoisy -tpcycles 1 -tppagecycles 2) 16:42:39 INFO - PID 14056 | RSS: Main: 192946176 16:42:39 INFO - PID 14056 | 16:42:40 INFO - PID 14056 | openingTime=209.1500000000001 16:42:40 INFO - PID 14056 | Cycle 1(1): loaded http://localhost:54554/tests/tpaint/tpaint.html?auto=1 (next: http://localhost:54554/tests/tpaint/tpaint.html?auto=1) 16:42:41 INFO - PID 14056 | RSS: Main: 198553600 16:42:41 INFO - PID 14056 | 16:42:42 INFO - PID 14056 | openingTime=206.70499999999993 16:42:42 INFO - PID 14056 | Cycle 1(2): loaded http://localhost:54554/tests/tpaint/tpaint.html?auto=1 (next: http://localhost:54554/tests/tpaint/tpaint.html?auto=1) 16:43:26 INFO - PID 14056 | Unable to read VR Path Registry from C:\Users\ionut.goldan\AppData\Local\openvr\openvrpaths.vrpath 16:52:27 INFO - Timeout waiting for test completion; killing browser... 16:52:27 INFO - mozcrash Saved minidump as e:/workspace/mozilla-source/mozilla-central\testing\mozharness\build\blobber_upload_dir\3c82d83e-e6d2-4d75-a05d-22a41296194a.dmp 16:52:28 INFO - PROCESS-CRASH | tpaint | application crashed [unknown top frame] 16:52:28 INFO - Crash dump filename: c:\users\ionut~1.gol\appdata\local\temp\tmpbostwg\profile\minidumps\3c82d83e-e6d2-4d75-a05d-22a41296194a.dmp 16:52:28 INFO - MINIDUMP_STACKWALK not set, can't process dump. 16:52:28 INFO - rmtree() failed for "('c:\\users\\ionut~1.gol\\appdata\\local\\temp\\tmpbostwg',)". Reason: The directory is not empty (41). Retrying... 16:52:28 INFO - TEST-UNEXPECTED-ERROR | tpaint | Found crashes after test run, terminating test 16:52:28 ERROR - Traceback (most recent call last): 16:52:28 INFO - File "e:/workspace/mozilla-source/mozilla-central\testing\talos\talos\run_tests.py", line 268, in run_tests 16:52:28 INFO - talos_results.add(mytest.runTest(browser_config, test)) 16:52:28 INFO - File "e:\workspace\mozilla-source\mozilla-central\testing\talos\talos\ttest.py", line 61, in runTest 16:52:28 INFO - return self._runTest(browser_config, test_config, setup) 16:52:28 INFO - File "e:\workspace\mozilla-source\mozilla-central\testing\talos\talos\ttest.py", line 197, in _runTest 16:52:28 INFO - test_config['name']) 16:52:28 INFO - File "e:\workspace\mozilla-source\mozilla-central\testing\talos\talos\ttest.py", line 45, in check_for_crashes 16:52:28 INFO - raise TalosCrash("Found crashes after test run, terminating test") 16:52:28 INFO - TalosCrash: Found crashes after test run, terminating test 16:52:28 INFO - TEST-INFO took 604575ms 16:52:28 INFO - SUITE-END | took 604s 16:52:28 ERROR - Return code: 2 16:52:28 WARNING - setting return code to 2 16:52:28 ERROR - # TBPL FAILURE # 16:52:28 INFO - Running post-action listener: _package_coverage_data 16:52:28 INFO - Running post-action listener: _resource_record_post_action 16:52:28 INFO - [mozharness: 2017-07-18 13:52:28.460000Z] Finished run-tests step (success) 16:52:28 INFO - Running post-run listener: _resource_record_post_run 16:52:28 INFO - Total resource usage - Wall time: 606s; CPU: 15.0%; Read bytes: 1366809600; Write bytes: 270309888; Read time: 1022080; Write time: 1416730 16:52:28 INFO - TinderboxPrint: CPU usage<br/>15.0% 16:52:28 INFO - TinderboxPrint: I/O read bytes / time<br/>1,366,809,600 / 1,022,080 16:52:28 INFO - TinderboxPrint: I/O write bytes / time<br/>270,309,888 / 1,416,730 16:52:28 INFO - TinderboxPrint: CPU idle<br/>2,070.1 (85.0%) 16:52:28 INFO - TinderboxPrint: CPU system<br/>92.0 (3.8%) 16:52:28 INFO - TinderboxPrint: CPU user<br/>263.7 (10.8%) 16:52:28 INFO - setup-mitmproxy - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0 16:52:28 INFO - run-tests - Wall time: 607s; CPU: 15.0%; Read bytes: 1366809600; Write bytes: 270309888; Read time: 1022080; Write time: 1416730 16:52:28 INFO - Running post-run listener: _upload_blobber_files 16:52:28 WARNING - Blob upload gear skipped. Missing cmdline options. ...
:igoldan, is this still an issue? If not, please close out the bug, thanks!
Flags: needinfo?(igoldan)
Blocks: 1425103
Summary: Talos test fails locally when using --geckoProfile → Talos tpaint fails locally when using --geckoProfile
Whiteboard: [PI:January]
This is still an issue, also reported by Xidorn Quan
Flags: needinfo?(igoldan)
Assignee: nobody → rwood
Status: NEW → ASSIGNED
Hey Ionut, this is working fine for me locally on OSX with the latest inbound: ./mach talos-test --activeTests tpaint --geckoProfile The test finishes fine, and the profile_tpaint.zip is created successfully in the local blobber_uplaod_dir. Can you please pull the latest and try it? Looks to me like it has been fixed somehow... unless it is platform specific. Thanks!
Flags: needinfo?(igoldan)
Yes, will check this on Windows 10 and Ubuntu 16.04.
Flags: needinfo?(igoldan)
(In reply to Robert Wood [:rwood] from comment #3) > ./mach talos-test --activeTests tpaint --geckoProfile Do you remember how long this took you?
Flags: needinfo?(rwood)
I ran this on Ubuntu and it worked. It produced the same file you mentioned for tpaint. On Windows 10 this still fails, with the same traceback. So you were right, this bug is platform specific.
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #6) > I ran this on Ubuntu and it worked. It produced the same file you mentioned > for tpaint. > > On Windows 10 this still fails, with the same traceback. So you were right, > this bug is platform specific. Ok thanks Ionut! I'll look at this on Windows.
Flags: needinfo?(rwood)
Getting it narrowed down with some pageloader debugging: 14:36:13 INFO - TEST-INFO | started process 3896 (c:\mozilla-source\mozilla-central\obj-i686-pc-mingw32\dist\bin\firefox --no-remote -profile c:\users\robert~1\appdata\local\temp\tmp20phhz\profile) 14:36:15 INFO - PID 3896 | inside plLoadPage 14:36:15 INFO - PID 3896 | about to call profiler beginTest 14:36:15 INFO - PID 3896 | adding our chrome message listeners 14:36:15 INFO - PID 3896 | about to call profiler.mark to mark opening the page 14:36:15 INFO - PID 3896 | inside startAndLoadURI, pageName is: 14:36:15 INFO - PID 3896 | http://localhost:50862/tests/tpaint/tpaint.html?auto=1 14:36:15 INFO - PID 3896 | we just grabbed our start_time 14:36:15 INFO - PID 3896 | about to call content.loadURI 14:36:15 INFO - PID 3896 | inside tpaint startTest 14:36:15 INFO - PID 3896 | inside tpaint scheduleNextWindow 14:36:16 INFO - PID 3896 | inside ContentListener, received: 14:36:16 INFO - PID 3896 | PageLoader:IdleCallbackSet 14:36:16 INFO - PID 3896 | inside ContentListener, received: 14:36:16 INFO - PID 3896 | PageLoader:IdleCallbackReceived 14:36:16 INFO - PID 3896 | inside ContentListener, received: 14:36:16 INFO - PID 3896 | PageLoader:LoadEvent 14:36:16 INFO - PID 3896 | inside plLoadHandlerMessage 14:36:17 INFO - PID 3896 | inside tpaint openWindow 14:36:17 INFO - PID 3896 | inside tpaint childIsOpen 14:36:17 INFO - PID 3896 | inside tpaint reportTimes 14:36:17 INFO - PID 3896 | openingTime=269.53999999999996 14:36:17 INFO - PID 3896 | inside ContentListener, received: 14:36:17 INFO - PID 3896 | PageLoader:RecordTime 14:36:17 INFO - PID 3896 | inside plRecordTime 14:36:17 INFO - PID 3896 | Cycle 1(1): loaded http://localhost:50862/tests/tpaint/tpaint.html?auto=1 (next: http://localhost:50862/tests/tpaint/tpaint.html?auto=1) 14:36:17 INFO - PID 3896 | *RW* inside plNextPage 14:36:17 INFO - PID 3896 | about to call await TalosParentProfiler.finishTest <hangs>
Getting closer! The profile is failing to be written to file: 16:12:59 INFO - PID 4836 | Cycle 1(1): loaded http://localhost:51825/tests/tpaint/tpaint.html?auto=1 (next: http://localhost:51825/tests/tpaint/tpaint.html?auto=1) 16:12:59 INFO - PID 4836 | inside talosParentProfiler.pause 16:12:59 INFO - PID 4836 | inside TalosPowersService.profilerPause 16:13:00 INFO - PID 4836 | *RW* inside plNextPage 16:13:00 INFO - PID 4836 | about to call await TalosParentProfiler.finishTest 16:13:00 INFO - PID 4836 | inside talosParentProfiler.finishTest 16:13:00 INFO - PID 4836 | about to call TalosPowers.profilerFinish 16:13:00 INFO - PID 4836 | inside TalosPowersService.profilerFinish 16:13:00 INFO - PID 4836 | about to call OS.File.writeAtomic with: 16:13:00 INFO - PID 4836 | profileFile: c:\users\robert~1\appdata\local\temp\tmpqoi8nr/tpaint.html?auto=1_pagecycle_1.profile 16:13:00 INFO - PID 4836 | array length: 1014340 <hangs here for 10 min then pageloader times out> Same debugging but with a cycle of perf-reftest-singletons-e10s works fine (same machine): 16:10:15 INFO - PID 6844 | Cycle 1(1): loaded http://localhost:51683/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://localhost:51683/tests/perf-reftest-singletons/bloom-basic.html) 16:10:15 INFO - PID 6844 | inside talosParentProfiler.pause 16:10:15 INFO - PID 6844 | inside TalosPowersService.profilerPause 16:10:15 INFO - PID 6844 | inside ContentListener, received: 16:10:15 INFO - PID 6844 | PageLoader:LoadEvent 16:10:15 INFO - PID 6844 | inside plLoadHandlerMessage 16:10:15 INFO - PID 6844 | inside ContentListener, received: 16:10:15 INFO - PID 6844 | PageLoader:IdleCallbackSet 16:10:15 INFO - PID 6844 | inside ContentListener, received: 16:10:15 INFO - PID 6844 | PageLoader:IdleCallbackReceived 16:10:15 INFO - PID 6844 | *RW* inside plNextPage 16:10:15 INFO - PID 6844 | about to call await TalosParentProfiler.finishTest 16:10:15 INFO - PID 6844 | inside talosParentProfiler.finishTest 16:10:15 INFO - PID 6844 | about to call TalosPowers.profilerFinish 16:10:15 INFO - PID 6844 | inside TalosPowersService.profilerFinish 16:10:27 INFO - PID 6844 | about to call OS.File.writeAtomic with: 16:10:27 INFO - PID 6844 | profileFile: c:\users\robert~1\appdata\local\temp\tmp9d66x6/bidi-resolution-1.html_pagecycle_1.profile 16:10:27 INFO - PID 6844 | array length: 120391048 16:10:27 INFO - PID 6844 | about to call Services.profiler.StopProfiler 16:10:27 INFO - PID 6844 | back from stopProfiler, resolving promise 16:10:27 INFO - PID 6844 | back from profiler finishTest 16:10:27 INFO - PID 6844 | doNextPage is true 16:10:27 INFO - PID 6844 | about to force GC 16:10:28 INFO - PID 6844 | about to settimeout to call plLoadPage 16:10:28 INFO - PID 6844 | inside plLoadPage 16:10:28 INFO - PID 6844 | about to call profiler beginTest 16:10:28 INFO - PID 6844 | inside talosParentProfiler.beginTest 16:10:28 INFO - PID 6844 | about to call TalosPowers.profilerBegin 16:10:28 INFO - PID 6844 | inside TalosPowersService.profilerBegin, about to call StartProfiler 16:10:28 INFO - PID 6844 | about to call PauseSamplingadding our chrome message listeners 16:10:28 INFO - PID 6844 | about to call profiler.mark to mark opening the page 16:10:28 INFO - PID 6844 | inside talosParentProfiler.mark 16:10:28 INFO - PID 6844 | inside TalosPowersService.profilerMarker 16:10:28 INFO - PID 6844 | inside startAndLoadURI, pageName is: 16:10:28 INFO - PID 6844 | http://localhost:51683/tests/perf-reftest-singletons/bidi-resolution-1.html 16:10:28 INFO - PID 6844 | we just grabbed our start_time 16:10:28 INFO - PID 6844 | about to call content.loadURI 16:10:29 INFO - PID 6844 | inside ContentListener, received: 16:10:29 INFO - PID 6844 | PageLoader:RecordTime 16:10:29 INFO - PID 6844 | inside plRecordTime 16:10:29 INFO - PID 6844 | Cycle 1(2): loaded http://localhost:51683/tests/perf-reftest-singletons/bidi-resolution-1.html (next: http://localhost:51683/tests/perf-reftest-singletons/bloom-basic.html) Ahhh... I'm guessing the profileFile name is invalid on Windows (but valid on linux/OSX) for tpaint: profileFile: c:\users\robert~1\appdata\local\temp\tmpqoi8nr/tpaint.html?auto=1_pagecycle_1.profile
Comment on attachment 8941223 [details] Bug 1381846 - Fix gecko profiling with talos tpaint on windows; https://reviewboard.mozilla.org/r/211488/#review217302 great find
Attachment #8941223 - Flags: review?(jmaher) → review+
Oh sweet this also fixes profiling with talos-chrome on Windows
Pushed by rwood@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/731b1321ba95 Fix gecko profiling with talos tpaint on windows; r=jmaher
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: