Closed Bug 1433923 Opened 7 years ago Closed 7 years ago

Use detailed failure messages for wdspec test failures (pytest)

Categories

(Testing :: web-platform-tests, enhancement, P1)

Product:

Component:

Version:

Version 3

Type:

enhancement

Priority:

P1

Severity:

normal

Tracking

(firefox60 fixed, firefox61 fixed)

Status:

RESOLVED FIXED

Milestone:

mozilla61

Tracking Flags:

Tracking

Status

firefox60

---

fixed

firefox61

---

fixed

People

(Reporter: whimboo, Assigned: whimboo)

Details

Attachments

(1 file)

Bug 1433923 - Use detailed failure message for wdspec test failures (pytest). 7 years ago Henrik Skupin [:whimboo][⌚️UTC+2] (deleted), text/x-review-board-request	jgraham : review+	Details

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Description

•

7 years ago

The current situation is that we have basically catch all failures bugs reported on Bugzilla of type: > TEST-UNEXPECTED-FAIL | /webdriver/tests/actions/mouse.py | mouse.py::test_click_element_center By that line you don't see anything what went wrong, and whatever how many failures potentially could be present, all the reported failures on Treeherder would be starred against this single bug. Can we extend the output by adding the exact failure like: > TEST-UNEXPECTED-FAIL | /webdriver/tests/actions/mouse.py mouse.py::test_click_element_center | AssertionError: 342 == 344 This would help a lot in triaging and fixing known failures.

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 1

•

7 years ago

James, is there anything which would prevent us from making such a change in reporting? If not I would still have to figure out how to make that happen with pytest. Maybe Dave knows.

Flags: needinfo?(james)

James Graham [:jgraham]

Comment 2

•

7 years ago

I don't think this is really related to pytest. The pytest results are collected and reported in mozlog format at [1]. That's then serialized in tbpl format at [2]. The format there is "TEST-<STATUS> | <test name> | <subtest name> - <message>\n<stack>". So probably you want to ensure that the message property is populated with something. [1] https://searchfox.org/mozilla-central/source/testing/web-platform/tests/tools/wptrunner/wptrunner/executors/pytestrunner/runner.py#86 [2] https://searchfox.org/mozilla-central/source/testing/mozbase/mozlog/mozlog/formatters/tbplformatter.py#25

Flags: needinfo?(james)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 3

•

7 years ago

The message comes with the data, but is only used when it is present in the dictionary: https://searchfox.org/mozilla-central/rev/b29daa46443b30612415c35be0a3c9c13b9dc5f6/testing/mozbase/mozlog/mozlog/formatters/tbplformatter.py#175 > message = "- " + data["message"] if "message" in data else "" Given that we do not get a message at all it means that no message is present. Maybe we fail to set this property, or are trying to retrieve the wrong information from the pytest results.

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 4

•

7 years ago

Given that I do good amount of triage work for those tests too I will take this bug, which IMO is kinda important to fix.

Assignee: nobody → hskupin

Status: NEW → ASSIGNED

Priority: -- → P1

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 5

•

7 years ago

Running wdspec tests locally clearly give me a null message for test failures inside the errorsummary.log file: {"status": "FAIL", "group": "default", "line": 169, "subtest": "test_example", "action": "test_result", "test": "/webdriver/tests/cookies/add_cookie.py", "message": null, "stack": "session = <webdriver.client.Session object at 0x10e0811d0>, url = <function url at 0x10ba16398>\nserver_config = {'aliases': [], 'bind_hostname': False, 'certificate': '/Volumes/data/code/gecko/testing/web-platform/certs/web-platform.test.pem', 'check_subdomains': False, ...}\n\n def test_example(session, url, server_config):\n> assert \"asdf\" in \"aaaaaaaa\"\nE AssertionError: assert 'asdf' in 'aaaaaaaa'\n\nserver_config = {'aliases': [], 'bind_hostname': False, 'certificate': '/Volumes/data/code/gecko/testing/web-platform/certs/web-platform.test.pem', 'check_subdomains': False, ...}\nsession = <webdriver.client.Session object at 0x10e0811d0>\nurl = <function url at 0x10ba16398>\n\ntesting/web-platform/tests/webdriver/tests/cookies/add_cookie.py:181: AssertionError", "expected": "PASS"} Going up the stack and checking `TestRunnerManager.test_ended()` the passed in test results do not contain a message, which means that we miss to populate the `message` field in the call to `WdspecRun()`. In `do_wdspec()` we call `pytestrunner.run()`, which passes in a `SubtestResultRecorder` into the `pytest.main()` call. So I'm fairly sure this result recorder class doesn't work as expected for pytest based executors.

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 6

•

7 years ago

So it's indeed a bug in our test result collection from pytest. We do not explicitly set a message for `record()` and as such it is kept as None: https://dxr.mozilla.org/mozilla-central/source/testing/web-platform/tests/tools/wptrunner/wptrunner/executors/pytestrunner/runner.py#104 By using the `pytest_runtest_logreport` hook, I do not see any chance for us to have a simple failure message available. Only `report.longrepr` contains it in the stack. So I could extract the message like: > def record_fail(self, report): > match = re.search("^E\s(.*)", report.longreprtext, re.MULTILINE) > if match: > message = match.group(1).strip() > self.record(report.nodeid, "FAIL", message=message, stack=report.longrepr) It will result in a message like `"message": "AssertionError: assert 'asdf' in 'aaaaaaaa'"` for the above shown stack.

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 7

•

7 years ago

Btw the pytest-html plugin also uses this method to retrieve the assertion details: https://github.com/pytest-dev/pytest-html/blob/c61dc5bbe08fc7dfb20d27df8adc7a61eb04a902/pytest_html/plugin.py#L230-L239

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 8

•

7 years ago

I pushed a try build: https://treeherder.mozilla.org/#/jobs?repo=try&revision=def3228e264856c912e6be58366af012f8f919b2&selectedJob=169642861 The failing test is getting reported as: > TEST-UNEXPECTED-FAIL | /webdriver/tests/cookies/add_cookie.py | test_example - AssertionError: assert 'asdf' in 'aaaaaaaa' I think that doesn't apply to our requirements for Treeherder log parsing. After the second `|` only the test failure has to be shown. So the subtest name would have to be moved.

Comment hidden (mozreview-request)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Updated

•

7 years ago

Attachment #8961345 - Flags: review?(james)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 10

•

7 years ago

(In reply to James Graham [:jgraham] from comment #2) > That's then serialized in tbpl format at [2]. The format there is: > "TEST-<STATUS> | <test name> | <subtest name> - <message>\n<stack>". Is that format really what we want? It differs from all the other test jobs as mentioned in my last comment already. If we want to get this fixed I would suggest to file a new bug given that it is unrelated to the message part.

Flags: needinfo?(james)

James Graham [:jgraham]

Comment 11

•

7 years ago

mozreview-review

Comment on attachment 8961345 [details] Bug 1433923 - Use detailed failure message for wdspec test failures (pytest). https://reviewboard.mozilla.org/r/230134/#review235780 ::: testing/web-platform/tests/tools/wptrunner/wptrunner/executors/pytestrunner/runner.py:106 (Diff revision 1) > > def record_pass(self, report): > self.record(report.nodeid, "PASS") > > def record_fail(self, report): > - self.record(report.nodeid, "FAIL", stack=report.longrepr) > + match = re.search("^E\s(.*)", report.longreprtext, re.MULTILINE) So… you are looking for a line that starts with "E"? It's very unclear what's actually going on here.

Attachment #8961345 - Flags: review?(james) → review-

James Graham [:jgraham]

Comment 12

•

7 years ago

>> That's then serialized in tbpl format at [2]. The format there is: >> "TEST-<STATUS> | <test name> | <subtest name> - <message>\n<stack>". > Is that format really what we want? It differs from all the other test jobs as mentioned in my last comment already. If we want to get this fixed I would suggest to file a new bug given that it is unrelated to the message part. That was a statement of fact. The format is carefully designed to work with the treeherder parser, and I'm not proposing changing anything.

Flags: needinfo?(james)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 13

•

7 years ago

mozreview-review-reply

Comment on attachment 8961345 [details] Bug 1433923 - Use detailed failure message for wdspec test failures (pytest). https://reviewboard.mozilla.org/r/230134/#review235780 > So… you are looking for a line that starts with "E"? It's very unclear what's actually going on here. Yes, the error details are listed as following in the stack: > def test_example(session, url, server_config): > > assert "asdf" in "aaaaaaaa" > E AssertionError: assert 'asdf' in 'aaaaaaaa' In pytest-html Dave looks for a line starting with `E `. So would you be fine when it requires at least three blanks afterward?

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 14

•

7 years ago

mozreview-review-reply

Comment on attachment 8961345 [details] Bug 1433923 - Use detailed failure message for wdspec test failures (pytest). https://reviewboard.mozilla.org/r/230134/#review235780 > Yes, the error details are listed as following in the stack: > > > def test_example(session, url, server_config): > > > assert "asdf" in "aaaaaaaa" > > E AssertionError: assert 'asdf' in 'aaaaaaaa' > > In pytest-html Dave looks for a line starting with `E `. So would you be fine when it requires at least three blanks afterward? The above is not formattet correctly. Maybe this time it is better: ``` def test_example(session, url, server_config): > assert "asdf" in "aaaaaaaa" E AssertionError: assert 'asdf' in 'aaaaaaaa' ```

James Graham [:jgraham]

Comment 15

•

7 years ago

mozreview-review-reply

Comment on attachment 8961345 [details] Bug 1433923 - Use detailed failure message for wdspec test failures (pytest). https://reviewboard.mozilla.org/r/230134/#review235780 > The above is not formattet correctly. Maybe this time it is better: > > ``` > def test_example(session, url, server_config): > > assert "asdf" in "aaaaaaaa" > E AssertionError: assert 'asdf' in 'aaaaaaaa' > ``` So the most important thing is to document what we're trying to match since this is basically an ad-hoc parser. So the minimum requirement here is a comment. It is also possible that the code will look simpler like ``` # pytest outputs the stacktrace followed by an error message prefixed by the letter E # e.g. # <example> msg = "" for line in report.longreprtext.splitlines(): if line.startswith("E"): msg = line[1:].strip() break ``` But it depends what the exact semantics are. Can you have > 1 E-prefixed line? Should you take everything after the first E prefixed line?

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 16

•

7 years ago

mozreview-review-reply

Comment on attachment 8961345 [details] Bug 1433923 - Use detailed failure message for wdspec test failures (pytest). https://reviewboard.mozilla.org/r/230134/#review235780 > So the most important thing is to document what we're trying to match since this is basically an ad-hoc parser. So the minimum requirement here is a comment. > > It is also possible that the code will look simpler like > ``` > # pytest outputs the stacktrace followed by an error message prefixed by the letter E > # e.g. > # <example> > msg = "" > for line in report.longreprtext.splitlines(): > if line.startswith("E"): > msg = line[1:].strip() > break > ``` > > But it depends what the exact semantics are. Can you have > 1 E-prefixed line? Should you take everything after the first E prefixed line? Dave pointed me to https://github.com/pytest-dev/pytest/blob/ed118d7f2051ccfdb5c56cb6535e6f89485b1225/_pytest/_code/code.py#L531-L541, which shows that by default an error line is prefixed with "E ". I think for our use it's best to just limit to the first line as summary for the failure log. Also I have never seen a multi-line error yet. So I think your example code is better because it returns immediately after the first error line. The regex would return everything.

Comment hidden (mozreview-request)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 18

•

7 years ago

A try build with a forced failure is available here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f5ceba4388fff3a1cb236ba4d0c2e3d89c106603

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 19

•

7 years ago

Also I filed an issue against pytest to maybe get a property which would allow directly fetching the assertion message: https://github.com/pytest-dev/pytest/issues/3343

James Graham [:jgraham]

Comment 20

•

7 years ago

mozreview-review

Comment on attachment 8961345 [details] Bug 1433923 - Use detailed failure message for wdspec test failures (pytest). https://reviewboard.mozilla.org/r/230134/#review236780 ::: testing/web-platform/tests/tools/wptrunner/wptrunner/executors/pytestrunner/runner.py:15 (Diff revision 2) > """ > > import errno > import json > import os > +import re Unused. ::: testing/web-platform/tests/tools/wptrunner/wptrunner/executors/pytestrunner/runner.py (Diff revision 2) > + > + self.record(report.nodeid, "FAIL", message=message, stack=report.longrepr) > > def record_error(self, report): > # error in setup/teardown > - if report.when != "call": Why this change?

Attachment #8961345 - Flags: review?(james) → review+

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 21

•

7 years ago

mozreview-review-reply

Comment on attachment 8961345 [details] Bug 1433923 - Use detailed failure message for wdspec test failures (pytest). https://reviewboard.mozilla.org/r/230134/#review236780 > Why this change? We already have this check above. But I will revert it so that we only touch the necessary code.

Comment hidden (mozreview-request)

Comment 23

•

7 years ago

Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/849020c741de Use detailed failure message for wdspec test failures (pytest). r=jgraham

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 24

•

7 years ago

Sheriffs please note that with this change the bug suggestions will change for wdspec tests. Especially for all of those which are not timeout failures. Existing bugs will need an update of the summary, or if filed as a new bug, I will dupe them later today or tomorrow. Thanks.

Bogdan Tara[:bogdan_tara | bogdant]

Comment 25

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/849020c741de

Status: ASSIGNED → RESOLVED

Closed: 7 years ago

status-firefox61: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla61

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 26

•

7 years ago

I would like to see this patch uplifted to mozilla-beta for a better failure discovery of wd test jobs.

status-firefox60: --- → affected

Whiteboard: [checkin-needed-beta]

Cosmin Sabou [:CosminS]

Comment 27

•

7 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-beta/rev/f66887206a10

status-firefox60: affected → fixed

Ryan VanderMeulen [:RyanVM]

Updated

•

7 years ago

Whiteboard: [checkin-needed-beta]

Web Platform Test Sync Bot (Matrix: #interop:mozilla.org)

Comment 28

•

7 years ago

Created web-platform-tests PR https://github.com/w3c/web-platform-tests/pull/10215 for changes under testing/web-platform/tests

Web Platform Test Sync Bot (Matrix: #interop:mozilla.org)

Comment 29

•

7 years ago

Can't merge web-platform-tests PR due to failing upstream checks: Github PR https://github.com/w3c/web-platform-tests/pull/10215 * continuous-integration/travis-ci/pr (https://travis-ci.org/w3c/web-platform-tests/builds/359504058?utm_source=github_status&utm_medium=notification)

Web Platform Test Sync Bot (Matrix: #interop:mozilla.org)

Comment 30

•

7 years ago

Upstream PR merged

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 31

•

7 years ago

Hm, why didn't the linter pick up this indentation error: + message = "" + for line in report.longreprtext.splitlines(): + if line.startswith("E "): + message = line[1:].strip() That is the reason why the upstream merge failed. Maja and James, I assume I don't have to fix it in central given that the next merge will fix it automatically, right?

Flags: needinfo?(mjzffr)

Flags: needinfo?(james)

Flags: needinfo?(ahalberstadt)

Andrew Halberstadt [:ahal]

Comment 32

•

7 years ago

Because testing/web-platform isn't enabled for the flake8 linter: https://searchfox.org/mozilla-central/source/tools/lint/flake8.yml#4

Flags: needinfo?(ahalberstadt)

Maja Frydrychowicz :impossibus (was :maja_zf) (needinfo me)

Comment 33

•

7 years ago

Right.

Flags: needinfo?(mjzffr)

Flags: needinfo?(james)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 34

•

7 years ago

(In reply to Andrew Halberstadt [:ahal] from comment #32) > Because testing/web-platform isn't enabled for the flake8 linter: > https://searchfox.org/mozilla-central/source/tools/lint/flake8.yml#4 So as informed by Andreas we have a different set of linter settings available for web-platform-tests, and those should be run via `mach lint -l wpt -l wpt_manifest` (https://developer.mozilla.org/en-US/docs/Mozilla/QA/web-platform-tests#Lint). But whatever I do I cannot get the linter to fail for the above broken indendation. Two questions: 1) How do I have to run the linter to really lint testing/web-platform/tests/tools/wptrunner/wptrunner/executors/pytestrunner/runner.py for Python code 2) Is there a way to get this linter integrated with any linting job on Taskcluster?

Flags: needinfo?(james)

Flags: needinfo?(ahalberstadt)

James Graham [:jgraham]

Comment 35

•

7 years ago

We should enable pyflakes for the same directories and with the same settings as upstream travis does i.e. https://github.com/w3c/web-platform-tests/blob/master/tools/tox.ini#L25

Flags: needinfo?(james)

Andrew Halberstadt [:ahal]

Comment 36

•

7 years ago

Sounds like the lint integration doesn't automatically set up pyflakes the same way it does upstream. I consider this a bug in the linter implementation here: https://searchfox.org/mozilla-central/source/tools/lint/wpt/wpt.py Standard8 and I are trying to push new restrictions on our linters such that they just work out of the box (without requiring any additional setup on the part of the user). So please feel free to file under Firefox Build System :: Lint and Formatting if you like. As far as taskcluster, there already is a job: https://searchfox.org/mozilla-central/source/taskcluster/ci/source-test/mozlint.yml#142 It'll only run if you modified one of those files. It might also be hitting the same pyflakes issue that you are though.

Flags: needinfo?(ahalberstadt)

Andrew Halberstadt [:ahal]

Comment 37

•

7 years ago

I filed bug 1451092.

Web Platform Test Sync Bot (Matrix: #interop:mozilla.org)

Comment 38

•

7 years ago

Upstream PR merged

Web Platform Test Sync Bot (Matrix: #interop:mozilla.org)

Comment 39

•

7 years ago

Upstream PR merged

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 40

•

7 years ago

Maja, is it expected that we get the "Upstream PR merged" comment twice from the wpt-sync bot?

Flags: needinfo?(mjzffr)

Web Platform Test Sync Bot (Matrix: #interop:mozilla.org)

Comment 41

•

7 years ago

Upstream PR merged

Maja Frydrychowicz :impossibus (was :maja_zf) (needinfo me)

Comment 42

•

7 years ago

This may me due to some semi-manual updates we've done here and there to catch up with bugs that have been missed by the upstreaming process. Many fixes to upstreaming have been deployed today, so expect a change in activity.

Flags: needinfo?(mjzffr)

You need to log in before you can comment on or make changes to this bug.