Open Bug 1730293 Opened 3 years ago Updated 2 years ago

Improve release and dev config insta-check test failure experience and check workflow (motivation: check failure in `check_glob@rust__third_partyeucliddef__html.snap`)

Tracking

(Not tracked)

Status:

NEW

People

(Reporter: asuth, Unassigned)

References

(Depends on 1 open bug)

Details

Attachments

(3 files)

review failure output from the log 3 years ago Andrew Sutherland [:asuth] (he/him) (deleted), text/plain		Details
"rich" usage example from command invocation: ./channel-tool.py list 3 years ago Andrew Sutherland [:asuth] (he/him) (deleted), image/png		Details
mozsearch-mozilla euclid html fix 3 years ago Andrew Sutherland [:asuth] (he/him) (deleted), text/x-github-pull-request		Details

Andrew Sutherland [:asuth] (he/him)

Reporter

Description

•

3 years ago

Attached file review failure output from the log (deleted) — Details

We experienced our first insta-check failure today. Not entirely surprisingly this was on the HTML check for the rust third-party euclid def whose JSON output looks sane but whose HTML output is potentially subject to much more churn because the HTML line in question is for pub struct Rect<T, U> and the T and U end up with a lot of implementation-artifact symbol names that look potentially unstable due to inclusion of seemingly arbitrary integer id's.

For this specific situation, I think just removing the "html" check is likely the right solution, but to brainstorm right here in the bug...

Check Failure Workflow Brainstorm

Rationale for the production checks

Goal: validate that there are no regressions in searchfox functionality. This includes:
- Catching regressions due to changes in mozilla-central configuration, such as hard-coded paths changing due to changes in the build process. This has previously happened.
- Catching regressions due to changes in tooling behavior, such as rust's save-analysis mechanism or Gecko-specific IDL compilers. This has previously happened.
Non-goal: Validate specific implementation details.
- Historically we've picked specific symbols to check because we have to check for some symbol, and we tried to pick ones that didn't seem very likely to change.
Goal: Fail hard when things break so they can be addressed quickly.
- This choice was made more out of pragmatism and the VM's development loop not having a warning-reporting mechanism whereas the AWS runs would report both errors and warnings (although I feel like having a low signal-to-noise ratio on the warnings limited things at least for a time.)

Mechanisms that could be used to avoid/reduce breakage hassles

Checks could involve a quorum of checks per subsystem so that a single change in behavior need not immediately be treated as a failure, but instead could be reviewed asynchronously.
- Potential Issue: If it's easy to ignore a regression, it may be ignored for a while. When eventually reviewed, it might be more difficult to identify the specific source of the change (in terms of revision control investigations) and consulting any parties involved in the underlying change (who may have forgotten, etc.)
The searchfox-tool commands could gain additional stabilizing filters like one that strips the "loc" line from the JSON output or normalizes the line number to something like "LINE" but leaves the offset intact as we would expect changes in line number to be much more likely than column changes in the specific line.
- The resulting loss of fidelity could be offset through the existing HTML check variants which also quote the line.
- These would only be used for the production checks; for the mozsearch "tests" repo we would want to keep the line numbers as it would make sense for those to change with changes to the test repo and for the reviewer to validate they are changing/not-changing as expected.
Check failures could be treated as a lower grade of severity to an indexing failure, resulting in the process still completing, but with the web server placed on a "failed" channel instead of its regular channel. This could then be made accessible at https://failed.searchfox.org/ which would allow the https://github.com/mozsearch/mozsearch-mozilla/blob/master/review-build-check-results.sh documented at https://github.com/mozsearch/mozsearch/blob/master/docs/testing-checks.md to still work.
- This could dovetail with my previous proposal that we would let non-release channel builds that fail still make their way to being web-servers for investigation.
- Maybe we accomplish this by making insta-check failures result in a specific error code that the wrapper scripts understand to mean "this failed, if this was a release build, put it on the failed channel, otherwise just keep going but do report that it failed checks in the results email"
Check operations could run in a mode that still reports failure but also accepts all changes so that a git diff/patch can be generated that can be exposed via the web-server and easily propagated to a github pull request.
- Having the indexer directly submit PR's seems like it would involve github auth credentials and therefore sounds like a cross between a hassle and a security risk, but being able to curl the patch and git apply it locally or let some other developer-run script automate it seems feasible and incrementally do-able.

Andrew Sutherland [:asuth] (he/him)

Reporter

Updated

•

3 years ago

Depends on: 1730294

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 1

•

3 years ago

Attached image "rich" usage example from command invocation: ./channel-tool.py list (deleted) — Details

Design decision heads-up: I'm going to add a dependency on https://github.com/willmcgugan/rich as part of an effort to try and provide some level of hierarchy in the output and generally make visual scanning of the state of things a little easier in new tooling. The "rich" library seems pretty amazing in terms of documentation and feature set, including the ability to mix tables into tree uses, as is demonstrated here. This will require re-provisioning when this bug lands.

My plan is to leave ssh.py and its already-quite-good output as-is for the foreseeable future to avoid changing existing working things out from under people and to avoid potential breakage from the extra cross-correlation use-cases that are going on in this command.

edit: It looks like I forgot to include the uptime/runtime age value in the table, which might want to be there too, but the table setup is already running out of screen real estate and that's after folding the "indexer" and "web-server" keys into "(start time)"

other edit: I meant to post this in bug 1730294 but the goal was mainly to provide some heads-up to component-watchers.

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 2

•

3 years ago

Attached file mozsearch-mozilla euclid html fix (deleted) — Details

The fancy machinery proposed here is going to take a little longer, so I've landed https://github.com/mozsearch/mozsearch-mozilla/pull/150 which removes the brittle HTML check as fixing the current check failure. Tonight's UTC22 run should have the revised check and pass.

Andrew Sutherland [:asuth] (he/him)

Reporter

Updated

•

2 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1802054

Andrew Sutherland [:asuth] (he/him)

Reporter

Updated

•

2 years ago

Assignee: bugmail → nobody

Status: ASSIGNED → NEW

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

Improve release and dev config insta-check test failure experience and check workflow (motivation: check failure in `check_glob@rust__third_partyeucliddef__html.snap`)

Categories

(Webtools :: Searchfox, enhancement)

Tracking

(Not tracked)

People

(Reporter: asuth, Unassigned)

References

(Depends on 1 open bug)

Details

Crash Data

Security

(public)

User Story

Attachments

(3 files)

Description

Check Failure Workflow Brainstorm

Rationale for the production checks

Mechanisms that could be used to avoid/reduce breakage hassles

Updated

Comment 1

Comment 2

Updated

Updated

Attachment

General

Description

File Name

Content Type

Improve release and dev config insta-check test failure experience and check workflow (motivation: check failure in `check_glob@rust__third_party__euclid__def__html.snap`)

Description

Check Failure Workflow Brainstorm

Rationale for the production checks

Mechanisms that could be used to avoid/reduce breakage hassles

Updated

Comment 1

Comment 2

Updated

Updated

Improve release and dev config insta-check test failure experience and check workflow (motivation: check failure in `check_glob@rust__third_partyeucliddef__html.snap`)