Open Bug 1661556 Opened 4 years ago Updated 4 years ago

Non-staged complete update does not recover backup files

Categories

(Toolkit :: Application Update, defect)

People

(Reporter: agashlin, Unassigned)

References

(Regression)

Details

(Keywords: regression, Whiteboard: [iu_tracking])

Attachments

(1 file)

Attached file: update.log from a failed update (deleted)

When a non-staged complete update fails while adding files, it does not properly restore the files that it had already replaced.

  1. RemoveFile::Execute (from the precomplete manifest) renames the removed file to a backup (backup_create moves the existing file)
  2. AddFile::Execute (from the MAR's manifest) doesn't create a backup unless there was a file to replace
  3. AddFile::Execute writes the new files. At some point this operation fails (e.g. disk space is exhausted), so this step is aborted and the Finish calls below run with a nonzero status.
  4. RemoveFile::Finish restores the backup.
  5. AddFile::Finish tries to undo itself by removing the file it thinks it added, but this is actually the original file that step 4 just restored.
  6. AddFile::Finish also tries to restore any backup, but there isn't one by this time.

Any file that had been successfully added in step 3 will therefore be removed in step 5, leaving the installation incomplete.

There is only one backup per file, and whichever of RemoveFile::Finish or AddFile::Finish runs first will try to use that backup if it thinks it needs to undo its action. It would be more logical to undo the actions in reverse order, but I'm not sure if that would cause other issues.
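
To make the ordering problem concrete, here is a minimal Python sketch of the six steps above. It is a toy model, not the updater's actual code: the in-memory FakeDisk, the simulate helper, and the ".moz-backup" suffix are assumptions made for illustration, and only the Execute/Finish behavior described in this report is modeled.

```python
BACKUP_EXT = ".moz-backup"  # assumed backup suffix, for illustration only


class FakeDisk:
    """In-memory stand-in for the install directory: path -> contents."""

    def __init__(self, files):
        self.files = dict(files)

    def rename(self, src, dst):
        self.files[dst] = self.files.pop(src)


def backup_create(disk, path):
    disk.rename(path, path + BACKUP_EXT)


def backup_finish(disk, path, status):
    backup = path + BACKUP_EXT
    if status == 0:
        disk.files.pop(backup, None)   # success: discard the backup
    elif backup in disk.files:
        disk.rename(backup, path)      # failure: restore it, if one is left


class RemoveFile:
    def __init__(self, path):
        self.path = path
        self.skip = False

    def execute(self, disk):
        if self.path in disk.files:
            backup_create(disk, self.path)   # step 1
        else:
            self.skip = True

    def finish(self, disk, status):
        if not self.skip:
            backup_finish(disk, self.path, status)   # step 4


class AddFile:
    def __init__(self, path, contents):
        self.path = path
        self.contents = contents
        self.added = False

    def execute(self, disk):
        if self.path in disk.files:
            backup_create(disk, self.path)
        else:
            self.added = True                # step 2: no backup is made
        disk.files[self.path] = self.contents

    def finish(self, disk, status):
        if status != 0 and self.added:
            disk.files.pop(self.path, None)          # step 5
        backup_finish(disk, self.path, status)       # step 6


def simulate(reverse_finish):
    disk = FakeDisk({"firefox.exe": "old"})
    actions = [RemoveFile("firefox.exe"), AddFile("firefox.exe", "new")]
    for action in actions:
        action.execute(disk)
    status = 1  # pretend a later AddFile ran out of disk space
    order = reversed(actions) if reverse_finish else actions
    for action in order:
        action.finish(disk, status)
    return disk.files


print(simulate(reverse_finish=False))  # {}
print(simulate(reverse_finish=True))   # {'firefox.exe': 'old'}
```

In this toy model, running the Finish calls in forward order loses the file entirely, while running them in reverse order restores the original. Whether reversing the order is safe in the real updater is exactly the open question raised above; the sketch does not answer it.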

While a non-staged complete update should be relatively rare, I think it is what we see in some bug 315278 dupes when disk space runs out (though not in the original report, as that far predates bug 386760 landing in 860d857e899d, which I think introduced this issue). After staged and partial updates fail, we'd fall back to non-staged complete.

I could see how this could happen when disk space is low. Is the log from a case where the files weren't restored correctly? Bug 635834 and the linked bugs in its comments and references have most of the history of fixing this type of issue. Analyzing the exe and dll versions, etc., as was done in a few of those bugs, might provide some insight as well.

Yes, the log (provided by Gijs) was from an install that ran out of space; there was only about 300MB left on the drive when he checked. An earlier staged attempt had this in the log (lasterr is in hex, and 0x70 is ERROR_DISK_FULL):

ensure_copy: failed to copy the file C:\Program Files\Nightly/xul.dll over to C:\Program Files\Nightly\updated/xul.dll, lasterr: 70
failed: 61
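
As a quick check of that reading, the hex value can be decoded with a few lines of Python. This is only a sketch: decode_lasterr and the one-entry WIN32_ERRORS table are hypothetical helpers written for this comment, and they assume the lasterr field is printed in hexadecimal.

```python
import re

# Hand-written one-entry subset of Win32 error codes, for illustration only.
WIN32_ERRORS = {112: "ERROR_DISK_FULL"}


def decode_lasterr(log_line):
    """Return (decimal code, name) for a 'lasterr: <hex>' field, or None."""
    match = re.search(r"lasterr: ([0-9a-fA-F]+)", log_line)
    if match is None:
        return None
    code = int(match.group(1), 16)  # assumes the log prints the code in hex
    return code, WIN32_ERRORS.get(code, "unknown")


line = (r"ensure_copy: failed to copy the file C:\Program Files\Nightly/xul.dll "
        r"over to C:\Program Files\Nightly\updated/xul.dll, lasterr: 70")
print(decode_lasterr(line))  # (112, 'ERROR_DISK_FULL')
```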

The final installation directory was consistent with my analysis here: everything before browser/omni.ja in the manifest was missing, including firefox.exe. All the remaining files were identical to the version that was trying to update (2020-08-21-03-37-46 firefox-81.0a1.en-US.win64).

As I understand it, this issue (remove + add actions touching the same file, due to precomplete) would not cause a mismatched install, and given that there's only one patch action per file (I think?), this shouldn't happen with those either.

Thanks, that makes sense. Fixing the low disk space bug (bug 315278) would likely fix this bug as well, but it might well be more difficult to fix.

Whiteboard: [iu_tracking]

The severity field is not set for this bug.
:rachel, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(rtublitz)
Severity: -- → S3
Flags: needinfo?(rtublitz)
Has Regression Range: --- → yes