Closed Bug 390845 Opened 17 years ago Closed 17 years ago

integrate Talos with new pageloader

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: anodelman, Assigned: anodelman)

Attachments

(2 files, 10 obsolete files)

This patch removes the code for framecycler.html and replaces it with the new pageloader test. It works with a Firefox built with --enable-tests; if the given build does not include the pageloader, Talos installs it, provided the end user supplies a correctly formatted pageloader directory from which to copy the chrome and components files.

Checking in this patch will remove framecycler.html from Talos entirely. If we are concerned about taking such a step, it could be rewritten to keep the option of running the old page cycler alongside the new one. As far as I could tell we were interested in shifting to the new pageloader exclusively, which is why I chose to simply remove all support for framecycler.html.

This patch also fixes a problem with the creation of prefs.js files on Unix-based systems. The bug has been present in the code since the creation of the Linux port but was not noticeable because of a default prefs.js file in the base_profile directory, which has since been removed.
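For illustration, here is a minimal Python sketch of generating a prefs.js file from a dictionary of preferences. It is not the code from the patch, and the comment above does not say what the Unix-specific bug actually was, so nothing here reproduces the fix. The function names `format_pref` and `write_prefs` are hypothetical; only the `user_pref("name", value);` line format is Firefox's own.

```python
import json
import os


def format_pref(name, value):
    """Return one prefs.js line for a bool, int, or string preference."""
    # json.dumps gives true/false for bools, bare digits for ints,
    # and a double-quoted, escaped literal for strings.
    return 'user_pref(%s, %s);' % (json.dumps(name), json.dumps(value))


def write_prefs(profile_dir, prefs):
    """Write prefs.js into profile_dir and return the path written."""
    # os.path.join uses the correct path separator on each platform.
    path = os.path.join(profile_dir, "prefs.js")
    with open(path, "w") as f:
        for name in sorted(prefs):
            f.write(format_pref(name, prefs[name]) + "\n")
    return path
```

Called with a profile directory and a dictionary such as `{"network.proxy.type": 1}`, this would write one `user_pref` line per entry, so the test profile no longer depends on a prefs.js shipped in base_profile.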
Attached patch new pageloader integration (obsolete) (deleted) — Splinter Review
Attachment #275155 - Flags: review?(rhelmer)
I tested this out on mac and everything is sunshine, lollipops and rainbows.
Yay sunshine, lollipops and rainbows!
no ponies? :(
Attachment #275155 - Flags: review?(rhelmer) → review+
Even though this is r+, do not check in.

We are going to await further testing on the new perf boxes to see if the numbers collected are consistent and useful.

We'll wait for the numbers before considering this patch fully tested.
Attached patch new pageloader integration take 2 (obsolete) (deleted) — Splinter Review
This new patch is integrated with myk's bug fix for 392748 (talos should use PyYAML instead of Syck to read config file)
Assignee: nobody → anodelman
Attachment #275155 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attached patch new pageloader integration take 3 (obsolete) (deleted) — Splinter Review
Fixed a typo in the previous patch.
Attachment #277450 - Attachment is obsolete: true
Attached patch talos 2 (obsolete) (deleted) — Splinter Review
This became a far more major patch.  Here are the highlights:
- incorporates the new pageloader (still supports old js-style tp as well)
- removed cygwin dependency on windows
- created a generalized framework for running url based tests (ttest.py) to replace ts.py/tp.py
- reorganized how configuration is handled in config.py and sample.config, can now control which tests are run and with what settings from sample.config
- removed -height/-width from the firefox command line because it wasn't doing what we wanted; we now depend on those settings from the localstore.rdf in the base_profile
- added support to dump data locally to csv files; you can either send to graph server, dump data locally or both
- all signs point to a (mostly) drag and drop ability to incorporate tdhtml
- general file cleanup (changed the names of some files to better reflect what they do, removed some unnecessary files)
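As a rough sketch of the "send to graph server, dump data locally or both" option listed above, the Python below routes one set of results to a CSV file, to a sender callback, or to both. The function name `dump_results`, the column names, and the `send` callback are all assumptions made for illustration; this is not the code in run_tests.py.

```python
import csv


def dump_results(rows, csv_file=None, send=None):
    """Write rows to csv_file and/or hand them to send; return the sinks used."""
    used = []
    if csv_file is not None:
        writer = csv.writer(csv_file)
        # Hypothetical header; the real column layout is not given in this bug.
        writer.writerow(["index", "page", "value"])
        for i, (page, value) in enumerate(rows):
            writer.writerow([i, page, value])
        used.append("csv")
    if send is not None:
        # send stands in for whatever posts results to the graph server.
        send(rows)
        used.append("graph")
    return used
```

Passing only `csv_file` gives a purely local run, passing only `send` gives the previous behaviour, and passing both covers the "or both" case.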

I've successfully run talos2 on windows and linux.
Attachment #277455 - Attachment is obsolete: true
Attachment #277963 - Flags: review?(rhelmer)
Wow! It all sounds good. I'll try it out on OS X tomorrow and let you know how it goes.
Attached patch talos 2 again (obsolete) (deleted) — Splinter Review
Minor change to the configuration file to make it run tp with 20 cycles through the web page set and run ts 20 times total.
Attachment #277963 - Attachment is obsolete: true
Attachment #278072 - Flags: superreview?(bhearsum)
Attachment #278072 - Flags: review?(rhelmer)
Attachment #277963 - Flags: review?(rhelmer)
Attached patch talos 2 fix (obsolete) (deleted) — Splinter Review
Fixed a bug found by bhearsum, wherein all the pageloader files ended up in the chrome directory instead of being correctly placed in chrome or components.

Also - as the new perf farm machines cycle so quickly I've upped the number of times to cycle through the web page set to 20.
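To make the chrome/components fix in this comment concrete, here is a small Python sketch that copies files from a pageloader directory into the matching subdirectory of a Firefox install. The function name `install_pageloader` and the assumed layout (a pageloader directory containing `chrome/` and `components/` subdirectories) are guesses for illustration, not the layout or code the patch actually uses.

```python
import os
import shutil


def install_pageloader(pageloader_dir, firefox_dir):
    """Copy pageloader files into firefox_dir/chrome and firefox_dir/components."""
    copied = []
    for sub in ("chrome", "components"):
        src = os.path.join(pageloader_dir, sub)
        dst = os.path.join(firefox_dir, sub)
        if not os.path.isdir(src):
            continue  # nothing supplied for this subdirectory
        if not os.path.isdir(dst):
            os.makedirs(dst)
        for name in sorted(os.listdir(src)):
            src_file = os.path.join(src, name)
            if os.path.isfile(src_file):
                # Each file goes to the subdirectory it came from,
                # never to a single shared destination.
                shutil.copy(src_file, os.path.join(dst, name))
                copied.append(os.path.join(sub, name))
    return copied
```

Deriving the destination from the source subdirectory is what keeps components files out of chrome.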
Attachment #278072 - Attachment is obsolete: true
Attachment #278093 - Flags: review?(bhearsum)
Attachment #278072 - Flags: superreview?(bhearsum)
Attachment #278072 - Flags: review?(rhelmer)
Comment on attachment 278093 [details] [diff] [review]
talos 2 fix

A couple things:
* Is it possible to get rid of config.py completely? It would be really nice to only have one config file.
* I couldn't get this working on Mac. Running with Debug gave me "WARNING: problem starting counter monitor" errors. I got a traceback at the end, too:
Traceback (most recent call last):
  File "run_tests.py", line 249, in ?
    test_file(sys.argv[i])
  File "run_tests.py", line 242, in test_file
    send_to_csv(results)
  File "run_tests.py", line 109, in send_to_csv
    writer.writerow([i, page, float(r[2]), float(r[3]), float(r[4]), float(r[5]), '|'.join(r[6:])])
ValueError: invalid literal for float(): undefined

I think this is a result of the CounterManager not working.
Attachment #278093 - Flags: review?(bhearsum) → review-
I'm having the same problems with Linux, too.
Attached patch talos 2 (obsolete) (deleted) — Splinter Review
I didn't see the error that you observed on mac.  I did catch another couple of exceptions that are now caught and dealt with (included in this patch).

I'm wondering if you are hitting a configuration problem of some sort. Could you please test this new patch on a perf farm machine so that I can come and check out the errors you are seeing? I can't seem to reproduce them.
Attachment #278093 - Attachment is obsolete: true
Attachment #278504 - Flags: review?(bhearsum)
Attached patch talos 2 (obsolete) (deleted) — Splinter Review
*sigh* I submitted the wrong patch.  This should be the correct one.
Attachment #278504 - Attachment is obsolete: true
Attachment #278505 - Flags: review?(bhearsum)
Attachment #278504 - Flags: review?(bhearsum)
Attachment #278505 - Attachment is patch: true
Attachment #278505 - Attachment mime type: application/octet-stream → text/plain
I had a few problems:
On Mac, Minefield closes the window at the end of the last test but does not quit. I had to use "Force Quit" to make it quit. It shuts down fine at other times.
Linux and Mac threw errors at the end, here's the traceback:
Traceback (most recent call last):
  File "run_tests.py", line 249, in <module>
    test_file(sys.argv[i])
  File "run_tests.py", line 240, in test_file
    send_to_graph(title, date, browser_config, results)
  File "run_tests.py", line 149, in send_to_graph
    tmpf.write(result_format % (float(r[2]), res, tbox, i, date, browser_config['branch'], browser_config['buildid'], "discrete", page))
ValueError: invalid literal for float(): undefined
Windows threw a different error at the end:
Traceback (most recent call last):
  File "run_tests.py", line 249, in ?
    test_file(sys.argv[i])
  File "run_tests.py", line 230, in test_file
    res, browser_dump, counter_dump = ttest.runTest(browser_config, tests[test])

  File "C:\talos\ttest.py", line 178, in runTest
    cm.stopMonitor()
  File "C:\talos\cmanager_win32.py", line 96, in stopMonitor
    win32pdh.RemoveCounter(self.registeredCounters[counter][1])
pywintypes.error: (-1073738820, 'RemoveCounter', 'No error message is available')

Ping me on irc and I'll show you the configs that I used.
OK. So I retested after doing the following:
* Switch over all machines to the new page_load_test directory (Windows was already using it, actually)
* Remove "% Processor Time" counter from all machines

All 3 of the platforms failed with the same errors.
To clarify, this is what happened on the different platforms:
Linux:
Gives me the following traceback after all tests have been run:
Traceback (most recent call last):
  File "run_tests.py", line 249, in <module>
    test_file(sys.argv[i])
  File "run_tests.py", line 240, in test_file
    send_to_graph(title, date, browser_config, results)
  File "run_tests.py", line 149, in send_to_graph
    tmpf.write(result_format % (float(r[2]), res, tbox, i, date, browser_config['branch'], browser_config['buildid'], "discrete", page))
ValueError: invalid literal for float(): undefined
Mac:
Gives me the same traceback as above. Minefield must be closed with "Force Quit".
Windows:
Gives me the following traceback after all tests have been run:
Traceback (most recent call last):
  File "run_tests.py", line 249, in ?
    test_file(sys.argv[i])
  File "run_tests.py", line 230, in test_file
    res, browser_dump, counter_dump = ttest.runTest(browser_config, tests[test])

  File "C:\talos\ttest.py", line 178, in runTest
    cm.stopMonitor()
  File "C:\talos\cmanager_win32.py", line 96, in stopMonitor
    win32pdh.RemoveCounter(self.registeredCounters[counter][1])
pywintypes.error: (-1073738820, 'RemoveCounter', 'No error message is available')
Attached patch talos 2 (obsolete) (deleted) — Splinter Review
I think that we sorted out a lot of the errors by fixing the Apache configuration to point to the correct DocumentRoot. As for the next batch of errors:
- for the float() error I fixed up the handling of NaN and undefined values, but it can also be avoided by cycling through the web pages more than once
- the Windows error was caused by a misconfiguration (you were asking it to collect RSS as a Windows counter, but RSS is a Linux/Mac counter). It could have handled it more gracefully, but I'm willing to push that out to future Talos work
- I can't reproduce the mac error
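The float() failure in the tracebacks above comes from passing the string "undefined" straight to float(). One way to guard against that, shown as a sketch only, is a small conversion helper that returns None for anything that is not a finite number. The name `to_float` is hypothetical and this is not necessarily how the patch handles it.

```python
import math


def to_float(raw):
    """Return raw as a float, or None if it is missing or not a finite number."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        # Covers None and strings such as "undefined".
        return None
    if math.isnan(value) or math.isinf(value):
        # float("NaN") parses without error, so it needs its own check.
        return None
    return value
```

A caller could then skip or flag rows where `to_float` returns None instead of aborting the whole run after all tests have finished.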
Attachment #278659 - Flags: review?(bhearsum)
Attachment #278505 - Attachment is obsolete: true
Attachment #278505 - Flags: review?(bhearsum)
I got a new error on Mac this morning:
running pageload tests
[{'extensions.checkCompatibility': False, 'browser.shell.checkDefaultBrowser': False, 'network.proxy.type': 1, 'network.proxy.http': 'localhost', 'dom.disable_window_flip': True, 'dom.disable_window_move_resize': True, 'security.enable_java': False, 'dom.disable_open_during_load': False, 'network.proxy.http_port': 80, 'extensions.update.notifyUser': False, 'capability.principal.codebase.p0.id': 'file://', 'browser.dom.window.dump.enabled': True, 'dom.allow_scripts_to_close_windows': True, 'capability.principal.codebase.p1.granted': 'UniversalXPConnect', 'capability.principal.codebase.p0.granted': 'UniversalPreferencesWrite UniversalXPConnect UniversalPreferencesRead'}, {}, '../*.app/Contents/MacOS/firefox', 1.8, '200708300420', 'base_profile/', {'NO_EM_RESTART': 1}]
created profile
Screen width:1024 Screen height:768 colorDepth:32
initialized firefox
/bin/sh: line 1: -width: command not found
got tp results from browser
../BonEcho.app/Contents/MacOS/run-mozilla.sh: line 424: 20824 Terminated              "$prog" ${1+"$@"}
finished tp
formating results for: ts
# of values: 1
formating results for: loadtime
# of values: 396
formating results for: Private Bytes
# of values: 4250
formating results for: RSS
# of values: 4250
formating results for: % Processor Time
# of values: 0
finished formating results
Traceback (most recent call last):
  File "run_tests.py", line 237, in ?
    test_file(sys.argv[i])
  File "run_tests.py", line 207, in test_file
    ret = post_file.post_multipart(config.RESULTS_SERVER, config.RESULTS_LINK, [("key", "value")], [("filename", filename, file_data)
  File "/Users/mozqa/talos-slave/mac-branch/talos/post_file.py", line 23, in post_multipart
    h.send(body)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/httplib.py", line 658, in send
    self.sock.sendall(str)
  File "<string>", line 1, in sendall
socket.error: (35, 'Resource temporarily unavailable')

I'm going to reboot the Mac and see if that helps. This doesn't seem like a problem with Talos.
Attached patch talos 2 (deleted) — Splinter Review
Mostly mac fixes now:
- ensure that the browser closes and the test completes when a page exceeds the page load time limit (a page can take no longer than 55 seconds to load; if it does, we consider it a failure and stop the test)
- now divides up the results into 500 line chunks to send to the graph server.  This is to handle a python socket issue on mac os x with python 2.4.4.  With this chunk size we don't get a socket error and everything sends successfully.  As a note, this actually isn't a bug introduced with talos 2 and exists in the currently checked in version of talos.  Another new talos 2 feature!
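The 500-line chunking described above can be sketched in a few lines of Python. The helper name `chunk_lines` is an assumption, and the actual posting of each chunk to the graph server is left out because its interface is not shown in this bug.

```python
def chunk_lines(lines, size=500):
    """Yield successive lists of at most `size` lines."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(lines), size):
        yield lines[start:start + size]
```

Each yielded chunk would then be sent as its own request, which keeps every individual upload small.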

I've been working pretty much exclusively on mac for this testing; I'm assuming that these changes should not affect linux/windows.
Attachment #278659 - Attachment is obsolete: true
Attachment #279122 - Flags: review?(bhearsum)
Attachment #278659 - Flags: review?(bhearsum)
Comment on attachment 279122 [details] [diff] [review]
talos 2

This patch looks good to me. I would've liked to run a long test on Linux and Windows just to be extra careful, but I don't have enough time. Given that the data chunks are smaller now, I don't envision any problems (barring any more crazy Python bugs). r=bhearsum.

PerfConfigurator.py got checked into talos/ a few days ago. I've got a version of it that works with Talos2. I'm going to attach it here and recommend it gets checked in along with Talos2.
Attachment #279122 - Flags: review?(bhearsum) → review+
Attached file Talos2 PerfConfigurator (obsolete) (deleted) —
The aforementioned version of PerfConfigurator. This is used by all of the Talos Buildbots.
Attachment #279140 - Flags: review?(rcampbell)
Attachment #279140 - Flags: review?(anodelman)
cvs commit: Examining .
cvs commit: Examining base_profile
cvs commit: Examining base_profile/Cache
cvs commit: Examining base_profile/bookmarkbackups
cvs commit: Examining page_load_test
cvs commit: Examining startup_test
RCS file: /cvsroot/mozilla/testing/performance/talos/cmanager_linux.py,v
done
Checking in cmanager_linux.py;
/cvsroot/mozilla/testing/performance/talos/cmanager_linux.py,v  <--  cmanager_linux.py
initial revision: 1.1
done
RCS file: /cvsroot/mozilla/testing/performance/talos/cmanager_mac.py,v
done
Checking in cmanager_mac.py;
/cvsroot/mozilla/testing/performance/talos/cmanager_mac.py,v  <--  cmanager_mac.py
initial revision: 1.1
done
RCS file: /cvsroot/mozilla/testing/performance/talos/cmanager_win32.py,v
done
Checking in cmanager_win32.py;
/cvsroot/mozilla/testing/performance/talos/cmanager_win32.py,v  <--  cmanager_win32.py
initial revision: 1.1
done
Checking in config.py;
/cvsroot/mozilla/testing/performance/talos/config.py,v  <--  config.py
new revision: 1.7; previous revision: 1.6
done
Removing ffinfo.py;
/cvsroot/mozilla/testing/performance/talos/ffinfo.py,v  <--  ffinfo.py
new revision: delete; previous revision: 1.1
done
Checking in ffprocess.py;
/cvsroot/mozilla/testing/performance/talos/ffprocess.py,v  <--  ffprocess.py
new revision: 1.5; previous revision: 1.4
done
Checking in ffprocess_linux.py;
/cvsroot/mozilla/testing/performance/talos/ffprocess_linux.py,v  <--  ffprocess_linux.py
new revision: 1.4; previous revision: 1.3
done
Checking in ffprocess_mac.py;
/cvsroot/mozilla/testing/performance/talos/ffprocess_mac.py,v  <--  ffprocess_mac.py
new revision: 1.2; previous revision: 1.1
done
Checking in ffprocess_win32.py;
/cvsroot/mozilla/testing/performance/talos/ffprocess_win32.py,v  <--  ffprocess_win32.py
new revision: 1.3; previous revision: 1.2
done
Removing ffprofile.py;
/cvsroot/mozilla/testing/performance/talos/ffprofile.py,v  <--  ffprofile.py
new revision: delete; previous revision: 1.4
done
RCS file: /cvsroot/mozilla/testing/performance/talos/ffsetup.py,v
done
Checking in ffsetup.py;
/cvsroot/mozilla/testing/performance/talos/ffsetup.py,v  <--  ffsetup.py
initial revision: 1.1
done
Checking in getInfo.html;
/cvsroot/mozilla/testing/performance/talos/getInfo.html,v  <--  getInfo.html
new revision: 1.3; previous revision: 1.2
done
Removing initialize.html;
/cvsroot/mozilla/testing/performance/talos/initialize.html,v  <--  initialize.html
new revision: delete; previous revision: 1.3
done
Checking in run_tests.py;
/cvsroot/mozilla/testing/performance/talos/run_tests.py,v  <--  run_tests.py
new revision: 1.7; previous revision: 1.6
done
Checking in sample.config;
/cvsroot/mozilla/testing/performance/talos/sample.config,v  <--  sample.config
new revision: 1.5; previous revision: 1.4
done
Removing tp.py;
/cvsroot/mozilla/testing/performance/talos/tp.py,v  <--  tp.py
new revision: delete; previous revision: 1.7
done
Removing tp_linux.py;
/cvsroot/mozilla/testing/performance/talos/tp_linux.py,v  <--  tp_linux.py
new revision: delete; previous revision: 1.2
done
Removing tp_mac.py;
/cvsroot/mozilla/testing/performance/talos/tp_mac.py,v  <--  tp_mac.py
new revision: delete; previous revision: 1.1
done
Removing tp_win32.py;
/cvsroot/mozilla/testing/performance/talos/tp_win32.py,v  <--  tp_win32.py
new revision: delete; previous revision: 1.1
done
Removing ts.py;
/cvsroot/mozilla/testing/performance/talos/ts.py,v  <--  ts.py
new revision: delete; previous revision: 1.7
done
RCS file: /cvsroot/mozilla/testing/performance/talos/ttest.py,v
done
Checking in ttest.py;
/cvsroot/mozilla/testing/performance/talos/ttest.py,v  <--  ttest.py
initial revision: 1.1
done
Checking in base_profile/prefs.js;
/cvsroot/mozilla/testing/performance/talos/base_profile/prefs.js,v  <--  prefs.js
new revision: 1.4; previous revision: 1.3
done
Checking in page_load_test/framecycler.html;
/cvsroot/mozilla/testing/performance/talos/page_load_test/framecycler.html,v  <--  framecycler.html
new revision: 1.6; previous revision: 1.5
done
RCS file: /cvsroot/mozilla/testing/performance/talos/page_load_test/manifest.txt,v
done
Checking in page_load_test/manifest.txt;
/cvsroot/mozilla/testing/performance/talos/page_load_test/manifest.txt,v  <--  manifest.txt
initial revision: 1.1
done
Checking in startup_test/startup_test.html;
/cvsroot/mozilla/testing/performance/talos/startup_test/startup_test.html,v  <--  startup_test.html
new revision: 1.4; previous revision: 1.3
done
Attached patch talos2PerfConfig.patch (deleted) — Splinter Review
enpatchified version of PerfConfigurator.py
Attachment #279140 - Attachment is obsolete: true
Attachment #280082 - Flags: review?(anodelman)
Attachment #279140 - Flags: review?(rcampbell)
Attachment #279140 - Flags: review?(anodelman)
Attachment #280082 - Flags: review?(anodelman) → review+
cvs commit: Examining .
cvs commit: Examining base_profile
cvs commit: Examining base_profile/Cache
cvs commit: Examining base_profile/bookmarkbackups
cvs commit: Examining page_load_test
cvs commit: Examining startup_test
Checking in PerfConfigurator.py;
/cvsroot/mozilla/testing/performance/talos/PerfConfigurator.py,v  <--  PerfConfigurator.py
new revision: 1.3; previous revision: 1.2
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Mass move of Core:Testing bugs to mozilla.org:Release Engineering:Talos. Filter on RelEngTalosMassMove to ignore.
Component: Testing → Release Engineering: Talos
Product: Core → mozilla.org
QA Contact: testing → release
Version: Trunk → other
Component: Release Engineering: Talos → Release Engineering
Product: mozilla.org → Release Engineering