Closed Bug 1577872 Opened 5 years ago Closed 5 years ago

Updating on Windows is broken with mingw-w64/clang toolchain

Categories

(Toolkit :: Application Update, defect, P3)

All
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox-esr68 70+ fixed
firefox70 --- fixed
firefox71 --- fixed

People

(Reporter: gk, Assigned: tjr)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file)

When trying to test updater built with the mingw-w64/clang toolchain it turns out that it is not working. Let me quote what Mark and Kathy found out as that describes the problem better than I could:

We discovered that file paths generated using format strings are corrupted. Specifically, calls to NS_tsnprintf() do not work correctly; apparently, because that is a "wide" function a %s when used in the format string is supposed to mean "expect the arg to be of type WCHAR *". Instead, the args are processed as C-style strings which means they get truncated after the first character (at least when using characters that fit within the first 256 Unicode codepoints).

NS_tsnprintf() is a macro that is defined in toolkit/mozapps/update/common/updatedefines.h (all of the code that uses it is related to the updater). We first noticed that problem when we saw a failure inside updater.cpp's WriteToFile() function. The following code from that function fails because it tries to move filename to a bad path.

#if defined(XP_WIN)
NS_tchar dstfilename[MAXPATHLEN] = {NS_T('\0')};
NS_tsnprintf(dstfilename, sizeof(dstfilename) / sizeof(dstfilename[0]),
NS_T("%s\%s"), gPatchDirPath, aFilename);
if (MoveFileExW(filename, dstfilename, MOVEFILE_REPLACE_EXISTING) == 0) {
return false;
}
#endif

The computed path is C\u (the first character from gPatchDirPath followed by the \ and then the first character of aFilename).

On Windows, NS_tsnprintf() is defined as mywcsprintf which is an inline function that uses _vsnwprintf(). We have not traced this bug deeper than that point, but we did verify that the problems disappear if we replace all occurrences of %s with %S in the NS_tsnprintf() format strings. That is a very ugly fix though, and it would be wrong on macOS and Linux.

I posted my initial findings over here: https://trac.torproject.org/projects/tor/ticket/31567#comment:6

Basically the issue we're having is that libstdc++'s (though not libstdc's) implementation of the various C printf functions have __USE_MINGW_ANSI_STDIO in their implementation and so do not use Microsoft's format string specifiers (where the meaning of %s and %S is context dependent on whether a wide variant of the function is being called). An initial fix to replace the printf calls with the StringCchPrintf calls works in a local test case though apparently not in Tor Browser. I'm currently digging into why.

It would be interesting to know which variant of _vsnwprintf we end up using. It apparently works with clang-cl, so we could try to make sure we use the same.

Yeah something rather weird is going on here and I'm not sure what. The only way we should be getting this behaviour is if _vsnwprintf is not being routed to the CRT and is instead calling some internal implementation using ANSI format string specs.

So I'm tentatively going to claim this is a bug in mingw's implementation of _vsnwprintf in ucrt__vsnwprintf.c. I wrote a quick test program and built with msvc 19.21 and verified in windbg that a value of '5' (which corresponds to UCRTBASE_PRINTF_LEGACY_VSPRINTF_NULL_TERMINATION | UCRTBASE_PRINTF_LEGACY_WIDE_SPECIFIERS ) while mingw's implementation only passes in '1' ( UCRTBASE_PRINTF_LEGACY_VSPRINTF_NULL_TERMINATION ). Swapping the 5 to a 1 in the debugger gives us the same broken behavior.

So it's a mingw-w64 bug. MSVC allows specifying default behavior with additional macros. Those macros would be tricky to support, but the version in static lib should probably match default MSVC behavior. Martin, is there any reason not to change ucrt__vsnwprintf.c?

FWIW building the toolchain with

--- a/mingw-w64-crt/stdio/ucrt__vsnwprintf.c
+++ b/mingw-w64-crt/stdio/ucrt__vsnwprintf.c
@@ -10,6 +10,6 @@
 
 int __cdecl _vsnwprintf(wchar_t * __restrict__ _Dest,size_t _Count,const wchar_t * __restrict__ _Format,va_list _Args) __MINGW_ATTRIB_DEPRECATED_SEC_WARN
 {
-  return __stdio_common_vswprintf(UCRTBASE_PRINTF_LEGACY_VSPRINTF_NULL_TERMINATION, _Dest, _Count, _Format, NULL, _Args);
+  return __stdio_common_vswprintf(UCRTBASE_PRINTF_LEGACY_VSPRINTF_NULL_TERMINATION | UCRTBASE_PRINTF_LEGACY_WIDE_SPECIFIERS, _Dest, _Count, _Format, NULL, _Args);
 }
 int __cdecl (*__MINGW_IMP_SYMBOL(_vsnwprintf))(wchar_t *__restrict__, size_t, const wchar_t *__restrict__, va_list) = _vsnwprintf;

seems to get us past the bug. Tor Browser is still running, too. :) Not sure though, whether that's the right fix and whether other parts needs respective fixing.

(In reply to Georg Koppen from comment #0)

because that is a "wide" function a %s when used in the format string is supposed to mean "expect the arg to be of type WCHAR *".

This behaviour actually is a violation of the C99 spec which says the exact opposite (according to C99, %s is always for a char pointer), but this is the default behaviour of wide char functions in MSVC, and in mingw-w64 when using msvcrt.dll (and not defining __USE_MINGW_ANSI_STDIO).

We have not traced this bug deeper than that point, but we did verify that the problems disappear if we replace all occurrences of %s with %S in the NS_tsnprintf() format strings. That is a very ugly fix though, and it would be wrong on macOS and Linux.

FWIW, if you know for sure that it's a wchar string, you can use %ls to avoid ambiguity - %ls means wchar strings, both in normal and wide stdio functions, both in C99 spec and reality (even in old MSVC versions). But the C99 spec doesn't have any format for "string in whichever format the current stdio function is using" similar to the MSVC %s format.

(In reply to Jacek Caban from comment #5)

Those macros would be tricky to support

We don't support exactly the ones that MSVC uses, but we do allow configuring it with UCRTBASE_PRINTF_DEFAULT_WIDE, but that only has an effect on the functions that are defined inline in headers.

but the version in static lib should probably match default MSVC behavior. Martin, is there any reason not to change ucrt__vsnwprintf.c?

I did send a patch to this exact effect actually, in 2017, but you argued against it at the time, as we both believed that modern MSVC defaulted to the C99 behaviour. Apparently neither of us actually did check (or we checked incorrectly somehow) though. I'll resend a brushed up version of that patch.

Thanks, Martin! The patches helped (I guess we could leave this ticket open to bump the mingw-w64 commit used in the mingw-w64/clang toolchain).

(In reply to Georg Koppen from comment #8)

Thanks, Martin! The patches helped (I guess we could leave this ticket open to bump the mingw-w64 commit used in the mingw-w64/clang toolchain).

That's good to hear, because I hadn't had time to actually test compile them myself yet before I posted them :-)

I'll test build them a bit locally and push them, as they were approved, then.

(In reply to Georg Koppen from comment #8)

Thanks, Martin! The patches helped (I guess we could leave this ticket open to bump the mingw-w64 commit used in the mingw-w64/clang toolchain).

Is this for the toolchain used by Mozilla as well?

Flags: needinfo?(gk)

(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #10)

(In reply to Georg Koppen from comment #8)

Thanks, Martin! The patches helped (I guess we could leave this ticket open to bump the mingw-w64 commit used in the mingw-w64/clang toolchain).

Is this for the toolchain used by Mozilla as well?

We build the MinGW builds with this toolchain; but only Tor ships a browser built by this toolchain to users and exercises this code. (Unless it's hit by tests I haven't enabled yet.)

Flags: needinfo?(gk)

Most of the following xpcshell tests (mainly the ones with Success in the filename) that start with mar* hit that code path
https://searchfox.org/mozilla-central/source/toolkit/mozapps/update/tests/unit_base_updater

This picks up 0a1d495478d8ed1a94fc77b9dbb428b7e0372588 needed by
Tor for the updater, but also grabs the next three commits to bring
us up to master (including getting _FORTIFY_SOURCE support)

(In reply to Tom Ritter [:tjr] from comment #13)

Created attachment 9090179 [details]
Bug 1577872 - Bump MinGW version for updater fix r?#build

This picks up 0a1d495478d8ed1a94fc77b9dbb428b7e0372588 needed by
Tor for the updater, but also grabs the next three commits to bring
us up to master (including getting _FORTIFY_SOURCE support)

The new fortify source breaks when used with clang from C++ code, see https://sourceforge.net/p/mingw-w64/mailman/message/36754425/, so I'd suggest holding off of that for now.

Priority: -- → P3

[Tracking Requested - why for this release]: Needed for mingw-clang build

Assignee: nobody → tom
Pushed by tritter@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/66c5c577e5f3 Bump MinGW version for updater fix r=froydnj

Comment on attachment 9090179 [details]
Bug 1577872 - Bump MinGW version for updater fix r?#build

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Update mingw version for mingw-clang build
  • User impact if declined: Our builds will diverge from Tor's
  • Fix Landed on Version: 70
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Only affects mingw-clang build
  • String or UUID changes made by this patch:
Attachment #9090179 - Flags: approval-mozilla-esr68?
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71

Comment on attachment 9090179 [details]
Bug 1577872 - Bump MinGW version for updater fix r?#build

mingw-clang build improvement which doesn't affect builds we ship. Approved for 68.2esr.

Attachment #9090179 - Flags: approval-mozilla-esr68? → approval-mozilla-esr68+

Manually replaced the hash in the old file (due to changes by bug 1570240 not being uplifted to ESR 68):
https://hg.mozilla.org/releases/mozilla-esr68/rev/b94820890d75e5a32864ab3371cb3c7f87e31d14

If you want further updates on ESR68 in the future, please mention the hash and where it shall be replaced which will make the process faster than trying to graft the fix and identifying what needs to be done. Thank you.

Can you request beta uplift as well? Not sure that will directly affect anyone's builds but it seems best to keep 70 in line with its corresponding ESR. Thanks.

Flags: needinfo?(tom)

Comment on attachment 9090179 [details]
Bug 1577872 - Bump MinGW version for updater fix r?#build

Beta/Release Uplift Approval Request

  • User impact if declined: Minimal; but ESR will be out of sync with Beta
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky):
  • String changes made/needed:
Flags: needinfo?(tom)
Attachment #9090179 - Flags: approval-mozilla-beta?

Comment on attachment 9090179 [details]
Bug 1577872 - Bump MinGW version for updater fix r?#build

OK for uplift for the beta 5 build.

Attachment #9090179 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: