Closed Bug 1641939 Opened 5 years ago Closed 5 years ago

Signature failed when updating whilst application is running

Categories

(Firefox :: Remote Settings Client, defect)

defect

Tracking

()

VERIFIED FIXED
Firefox 79
Tracking Status
firefox78 + verified
firefox79 --- verified

People

(Reporter: ccomorasu, Assigned: leplatrem)

References

Details

Attachments

(8 files)

Attached file reduced_configuration2.json (deleted) —

[Steps to reproduce]:

  1. Upload on main-preview/search-config staging reduced_configuration1.json attachment 9152799 [details].
  2. Start Firefox with the attached user.js attachment 9152801 [details] (vary test with dump==true and dump=false).
  3. Check that search engines are listed accordingly to the main-preview from staging with reduced_configuration1.json attachment 9152799 [details] applied.
  4. Upload on main-preview/search-config staging reduced_configuration2.json attachment 9152798 [details].
  5. Wait 2-3 minutes while checking the browser console.

[Expected result]:
Browser console lists SearchEngineSelector: Search configuration updated remotely after which you can restart browser or wait 5' as idle
The browser search engine list is updated with the required search engines that are listed in the updated config (step 3)

[Actual result]:
services.settings: main-preview/search-config Signature failed InvalidSignatureError: Invalid content signature (main-preview/search-config) RemoteSettingsClient.jsm:882
services.settings: Signature verified failed for main-preview/search-config. Retry from scratch RemoteSettingsClient.jsm:591
Full log: https://pastebin.com/RNYRmBEy

[Regression range]:
This is a remote settings issue, the regression range is not applicable.

[Note]:

  • It is possible that this bug is a dupe after bug 1640023;
  • Seems that the problem is not reproducible if we are using a different configuration than the reduced_configuration2.json, which adds a non existing search engine.
Summary: Signature failed when updating main-preview/search-config collection → Signature failed when updating from main-preview/search-config collection

I've just seen this on the current nightly (78.0a1, 20200529095426) with the search-config on production main. We deployed an update earlier this evening:

22:00:24.595 services.settings: main/search-config Signature failed  InvalidSignatureError: Invalid content signature (main/search-config) RemoteSettingsClient.jsm:880
22:00:24.632 services.settings: Signature verified failed for main/search-config. Retry from scratch RemoteSettingsClient.jsm:589
22:00:24.709 services.settings: main/search-config Signature failed again InvalidSignatureError: Invalid content signature (main/search-config) RemoteSettingsClient.jsm:880
22:00:24.739 services.settings: InvalidSignatureError: Invalid content signature (main/search-config)
    InvalidSignatureError resource://services-settings/RemoteSettingsClient.jsm:131
    _validateCollectionSignature resource://services-settings/RemoteSettingsClient.jsm:780
    _importChanges resource://services-settings/RemoteSettingsClient.jsm:874
    maybeSync resource://services-settings/RemoteSettingsClient.jsm:592
    pollChanges resource://services-settings/remote-settings.js:310
    receivedBroadcastMessage resource://services-settings/remote-settings.js:464
    receivedBroadcastMessage resource://gre/modules/PushBroadcastService.jsm:263
    receivedBroadcastMessage resource://gre/modules/PushService.jsm:843
    finishHandshake resource://gre/modules/PushServiceWebSocket.jsm:618
    _handleHelloReply resource://gre/modules/PushServiceWebSocket.jsm:661
    _wsOnMessageAvailable resource://gre/modules/PushServiceWebSocket.jsm:1202
    onMessageAvailable resource://gre/modules/PushServiceWebSocket.jsm:111
remote-settings.js:318
22:00:24.740 BroadcastService: receivedBroadcastMessage: handler for remote-settings/monitor_changes threw error: InvalidSignatureError: Invalid content signature (main/search-config)
    InvalidSignatureError resource://services-settings/RemoteSettingsClient.jsm:131
    _validateCollectionSignature resource://services-settings/RemoteSettingsClient.jsm:780
    _importChanges resource://services-settings/RemoteSettingsClient.jsm:874
    maybeSync resource://services-settings/RemoteSettingsClient.jsm:592
    pollChanges resource://services-settings/remote-settings.js:310
    receivedBroadcastMessage resource://services-settings/remote-settings.js:464
    receivedBroadcastMessage resource://gre/modules/PushBroadcastService.jsm:263
    receivedBroadcastMessage resource://gre/modules/PushService.jsm:843
    finishHandshake resource://gre/modules/PushServiceWebSocket.jsm:618
    _handleHelloReply resource://gre/modules/PushServiceWebSocket.jsm:661
    _wsOnMessageAvailable resource://gre/modules/PushServiceWebSocket.jsm:1202
    onMessageAvailable resource://gre/modules/PushServiceWebSocket.jsm:111
PushBroadcastService.jsm:265
Flags: needinfo?(mathieu)
Summary: Signature failed when updating from main-preview/search-config collection → Signature failed when updating whilst application is running

[Tracking Requested - why for this release]: Remote settings might not be pushing out updates correctly.

Attached image sign_error spike after update (deleted) —

The last update was this one: (source)

  "main-workspace/search-config": [
    {
      "timestamp": 1590779207953,
      "datetime": "2020-05-29T19:06:47.945066+00:00",
      "by": "ldap:mconnor@mozilla.com",
      "changes": {
        "delete": 2,
        "update": 562,
        "create": 83
      }
    }
  ],

And we can clearly see a spike of sign_error on Telemetry. Since we don't see much of sign_retry_error it means that clients flushed the local DB and retried successfully.

It seems like a lot to update 562 records, but I have a gut feeling that deletes are causing us issues. I will try to assemble a profile with old data to try to reproduce the pulling of this update locally.

Flags: needinfo?(mathieu)

(In reply to Mathieu Leplatre [:leplatrem] from comment #5)

And we can clearly see a spike of sign_error on Telemetry. Since we don't see much of sign_retry_error it means that clients flushed the local DB and retried successfully.

That's strange. My Firefox Nightly was repeatedly complaining over a couple of days (unfortunately I've restarted it now), and it didn't actually pick up the changes.

Attached image distribution of sign_retry_errors (deleted) —

Indeed, there's something going on Nightly, we see a lot of retry errors.

While trying to reproduce locally, I found a critical bug on the server that causes updates with deletes to fail. https://github.com/Kinto/kinto-changes/issues/175 It will be fixed ASAP.
Nevertheless this bug does not affect retries, so I should investigate more what is going on here exactly.

S1 or S2 bugs need an assignee - could you find someone for this bug?

Flags: needinfo?(mathieu)
Assignee: nobody → mathieu
Status: NEW → ASSIGNED

Comment on attachment 9154017 [details]
Bug 1641939 - Flush local DB after retrying r?standard8

Beta/Release Uplift Approval Request

  • User impact if declined: The code that is in charge of recovering from signature errors is flawed in certain situation. Declining this patch would make Remote Settings sync less robust.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change only impact the part that is in charge of recovering from errors.
  • String changes made/needed:
Flags: needinfo?(mathieu)
Attachment #9154017 - Flags: approval-mozilla-beta?
Attached file reproduce-search-config-error.js (deleted) —

This is the snippet that I used to reproduce the issue.

Note 1: the fix was induced by an issue with the way we handle deletions of records. So I would suggest that you delete records on the server to verify the fix.

Note 2: you may not be able to reproduce the issue anymore, even on old versions of the client, since a part of the fix was already deployed on the server.

Pushed by mleplatre@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a93b27ba6429 Flush local DB after retrying r=Standard8
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 79
Flags: qe-verify+
QA Whiteboard: [qa-triaged]

Comment on attachment 9154017 [details]
Bug 1641939 - Flush local DB after retrying r?standard8

approved for 78.0b5

Attachment #9154017 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment on attachment 9154017 [details]
Bug 1641939 - Flush local DB after retrying r?standard8

This patch conflicts with bug 1640136 which isn't on beta. Can you provide a rebased version?

Flags: needinfo?(mathieu)
Attachment #9154017 - Flags: approval-mozilla-beta+

I can confirm this issue is no longer reproducible, I verified using Fx 79.0a1, on macOS 10.13.6.

I opened this revision in phabricator, rebased on beta, tell me if it's convenient enough:
https://phabricator.services.mozilla.com/D79074

Alternatively we could also uplift Bug 1640136 and Bug 1644153 in the next iteration.

Flags: needinfo?(mathieu) → needinfo?(jcristau)

Comment on attachment 9155571 [details]
Bug 1641939 - Flush local DB after retrying r=standard8

thanks.

Flags: needinfo?(jcristau)
Attachment #9155571 - Flags: approval-mozilla-beta+

I can confirm that this issue is verified fixed using Firefox 78.0b6 (BuildId:20200611192304).

Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: