Updating on Windows is broken with mingw-w64/clang toolchain
Categories
(Toolkit :: Application Update, defect, P3)
Tracking
()
People
(Reporter: gk, Assigned: tjr)
References
(Blocks 2 open bugs)
Details
Attachments
(1 file)
(deleted),
text/x-phabricator-request
|
lizzard
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-esr68+
|
Details |
When trying to test updater built with the mingw-w64/clang toolchain it turns out that it is not working. Let me quote what Mark and Kathy found out as that describes the problem better than I could:
We discovered that file paths generated using format strings are corrupted. Specifically, calls to NS_tsnprintf() do not work correctly; apparently, because that is a "wide" function a %s when used in the format string is supposed to mean "expect the arg to be of type WCHAR *". Instead, the args are processed as C-style strings which means they get truncated after the first character (at least when using characters that fit within the first 256 Unicode codepoints).
NS_tsnprintf() is a macro that is defined in toolkit/mozapps/update/common/updatedefines.h (all of the code that uses it is related to the updater). We first noticed that problem when we saw a failure inside updater.cpp's WriteToFile() function. The following code from that function fails because it tries to move filename to a bad path.
#if defined(XP_WIN)
NS_tchar dstfilename[MAXPATHLEN] = {NS_T('\0')};
NS_tsnprintf(dstfilename, sizeof(dstfilename) / sizeof(dstfilename[0]),
NS_T("%s\%s"), gPatchDirPath, aFilename);
if (MoveFileExW(filename, dstfilename, MOVEFILE_REPLACE_EXISTING) == 0) {
return false;
}
#endif
The computed path is C\u (the first character from gPatchDirPath followed by the \ and then the first character of aFilename).
On Windows, NS_tsnprintf() is defined as mywcsprintf which is an inline function that uses _vsnwprintf(). We have not traced this bug deeper than that point, but we did verify that the problems disappear if we replace all occurrences of %s with %S in the NS_tsnprintf() format strings. That is a very ugly fix though, and it would be wrong on macOS and Linux.
Comment 1•5 years ago
|
||
I posted my initial findings over here: https://trac.torproject.org/projects/tor/ticket/31567#comment:6
Basically the issue we're having is that libstdc++'s (though not libstdc's) implementation of the various C printf functions have __USE_MINGW_ANSI_STDIO in their implementation and so do not use Microsoft's format string specifiers (where the meaning of %s and %S is context dependent on whether a wide variant of the function is being called). An initial fix to replace the printf calls with the StringCchPrintf calls works in a local test case though apparently not in Tor Browser. I'm currently digging into why.
Comment 2•5 years ago
|
||
It would be interesting to know which variant of _vsnwprintf we end up using. It apparently works with clang-cl, so we could try to make sure we use the same.
Comment 3•5 years ago
|
||
Yeah something rather weird is going on here and I'm not sure what. The only way we should be getting this behaviour is if _vsnwprintf is not being routed to the CRT and is instead calling some internal implementation using ANSI format string specs.
Comment 4•5 years ago
|
||
So I'm tentatively going to claim this is a bug in mingw's implementation of _vsnwprintf in ucrt__vsnwprintf.c. I wrote a quick test program and built with msvc 19.21 and verified in windbg that a value of '5' (which corresponds to UCRTBASE_PRINTF_LEGACY_VSPRINTF_NULL_TERMINATION | UCRTBASE_PRINTF_LEGACY_WIDE_SPECIFIERS ) while mingw's implementation only passes in '1' ( UCRTBASE_PRINTF_LEGACY_VSPRINTF_NULL_TERMINATION ). Swapping the 5 to a 1 in the debugger gives us the same broken behavior.
Comment 5•5 years ago
|
||
So it's a mingw-w64 bug. MSVC allows specifying default behavior with additional macros. Those macros would be tricky to support, but the version in static lib should probably match default MSVC behavior. Martin, is there any reason not to change ucrt__vsnwprintf.c?
Reporter | ||
Comment 6•5 years ago
|
||
FWIW building the toolchain with
--- a/mingw-w64-crt/stdio/ucrt__vsnwprintf.c
+++ b/mingw-w64-crt/stdio/ucrt__vsnwprintf.c
@@ -10,6 +10,6 @@
int __cdecl _vsnwprintf(wchar_t * __restrict__ _Dest,size_t _Count,const wchar_t * __restrict__ _Format,va_list _Args) __MINGW_ATTRIB_DEPRECATED_SEC_WARN
{
- return __stdio_common_vswprintf(UCRTBASE_PRINTF_LEGACY_VSPRINTF_NULL_TERMINATION, _Dest, _Count, _Format, NULL, _Args);
+ return __stdio_common_vswprintf(UCRTBASE_PRINTF_LEGACY_VSPRINTF_NULL_TERMINATION | UCRTBASE_PRINTF_LEGACY_WIDE_SPECIFIERS, _Dest, _Count, _Format, NULL, _Args);
}
int __cdecl (*__MINGW_IMP_SYMBOL(_vsnwprintf))(wchar_t *__restrict__, size_t, const wchar_t *__restrict__, va_list) = _vsnwprintf;
seems to get us past the bug. Tor Browser is still running, too. :) Not sure though, whether that's the right fix and whether other parts needs respective fixing.
Comment 7•5 years ago
|
||
(In reply to Georg Koppen from comment #0)
because that is a "wide" function a %s when used in the format string is supposed to mean "expect the arg to be of type WCHAR *".
This behaviour actually is a violation of the C99 spec which says the exact opposite (according to C99, %s is always for a char pointer), but this is the default behaviour of wide char functions in MSVC, and in mingw-w64 when using msvcrt.dll (and not defining __USE_MINGW_ANSI_STDIO).
We have not traced this bug deeper than that point, but we did verify that the problems disappear if we replace all occurrences of %s with %S in the NS_tsnprintf() format strings. That is a very ugly fix though, and it would be wrong on macOS and Linux.
FWIW, if you know for sure that it's a wchar string, you can use %ls to avoid ambiguity - %ls means wchar strings, both in normal and wide stdio functions, both in C99 spec and reality (even in old MSVC versions). But the C99 spec doesn't have any format for "string in whichever format the current stdio function is using" similar to the MSVC %s format.
(In reply to Jacek Caban from comment #5)
Those macros would be tricky to support
We don't support exactly the ones that MSVC uses, but we do allow configuring it with UCRTBASE_PRINTF_DEFAULT_WIDE, but that only has an effect on the functions that are defined inline in headers.
but the version in static lib should probably match default MSVC behavior. Martin, is there any reason not to change ucrt__vsnwprintf.c?
I did send a patch to this exact effect actually, in 2017, but you argued against it at the time, as we both believed that modern MSVC defaulted to the C99 behaviour. Apparently neither of us actually did check (or we checked incorrectly somehow) though. I'll resend a brushed up version of that patch.
Reporter | ||
Comment 8•5 years ago
|
||
Thanks, Martin! The patches helped (I guess we could leave this ticket open to bump the mingw-w64 commit used in the mingw-w64/clang toolchain).
Comment 9•5 years ago
|
||
(In reply to Georg Koppen from comment #8)
Thanks, Martin! The patches helped (I guess we could leave this ticket open to bump the mingw-w64 commit used in the mingw-w64/clang toolchain).
That's good to hear, because I hadn't had time to actually test compile them myself yet before I posted them :-)
I'll test build them a bit locally and push them, as they were approved, then.
Comment 10•5 years ago
|
||
(In reply to Georg Koppen from comment #8)
Thanks, Martin! The patches helped (I guess we could leave this ticket open to bump the mingw-w64 commit used in the mingw-w64/clang toolchain).
Is this for the toolchain used by Mozilla as well?
Assignee | ||
Comment 11•5 years ago
|
||
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #10)
(In reply to Georg Koppen from comment #8)
Thanks, Martin! The patches helped (I guess we could leave this ticket open to bump the mingw-w64 commit used in the mingw-w64/clang toolchain).
Is this for the toolchain used by Mozilla as well?
We build the MinGW builds with this toolchain; but only Tor ships a browser built by this toolchain to users and exercises this code. (Unless it's hit by tests I haven't enabled yet.)
Comment 12•5 years ago
|
||
Most of the following xpcshell tests (mainly the ones with Success in the filename) that start with mar* hit that code path
https://searchfox.org/mozilla-central/source/toolkit/mozapps/update/tests/unit_base_updater
Assignee | ||
Comment 13•5 years ago
|
||
This picks up 0a1d495478d8ed1a94fc77b9dbb428b7e0372588 needed by
Tor for the updater, but also grabs the next three commits to bring
us up to master (including getting _FORTIFY_SOURCE support)
Comment 14•5 years ago
|
||
(In reply to Tom Ritter [:tjr] from comment #13)
Created attachment 9090179 [details]
Bug 1577872 - Bump MinGW version for updater fix r?#buildThis picks up 0a1d495478d8ed1a94fc77b9dbb428b7e0372588 needed by
Tor for the updater, but also grabs the next three commits to bring
us up to master (including getting _FORTIFY_SOURCE support)
The new fortify source breaks when used with clang from C++ code, see https://sourceforge.net/p/mingw-w64/mailman/message/36754425/, so I'd suggest holding off of that for now.
Updated•5 years ago
|
Assignee | ||
Comment 15•5 years ago
|
||
[Tracking Requested - why for this release]: Needed for mingw-clang build
Comment 16•5 years ago
|
||
Assignee | ||
Comment 17•5 years ago
|
||
Comment on attachment 9090179 [details]
Bug 1577872 - Bump MinGW version for updater fix r?#build
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Update mingw version for mingw-clang build
- User impact if declined: Our builds will diverge from Tor's
- Fix Landed on Version: 70
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Only affects mingw-clang build
- String or UUID changes made by this patch:
Comment 18•5 years ago
|
||
bugherder |
Comment 19•5 years ago
|
||
Comment on attachment 9090179 [details]
Bug 1577872 - Bump MinGW version for updater fix r?#build
mingw-clang build improvement which doesn't affect builds we ship. Approved for 68.2esr.
Comment 20•5 years ago
|
||
uplift |
Manually replaced the hash in the old file (due to changes by bug 1570240 not being uplifted to ESR 68):
https://hg.mozilla.org/releases/mozilla-esr68/rev/b94820890d75e5a32864ab3371cb3c7f87e31d14
If you want further updates on ESR68 in the future, please mention the hash and where it shall be replaced which will make the process faster than trying to graft the fix and identifying what needs to be done. Thank you.
Can you request beta uplift as well? Not sure that will directly affect anyone's builds but it seems best to keep 70 in line with its corresponding ESR. Thanks.
Assignee | ||
Comment 22•5 years ago
|
||
Comment on attachment 9090179 [details]
Bug 1577872 - Bump MinGW version for updater fix r?#build
Beta/Release Uplift Approval Request
- User impact if declined: Minimal; but ESR will be out of sync with Beta
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky):
- String changes made/needed:
Comment on attachment 9090179 [details]
Bug 1577872 - Bump MinGW version for updater fix r?#build
OK for uplift for the beta 5 build.
Comment 24•5 years ago
|
||
bugherder uplift |
Updated•5 years ago
|
Description
•