Closed Bug 221797 Opened 21 years ago Closed 21 years ago

Losing history on every Win98 crash due to missing directory flush

Categories

(Core Graveyard :: History: Global, defect)

x86
Windows 98
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Franke, Assigned: Bienvenu)

References

Details

Attachments

(2 files, 1 obsolete file)

User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.5) Gecko/20030925 Build Identifier: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.5) Gecko/20030925 When Win9x crashes with Mozilla open, the history file gets corrupted by scandisk's file size "repair". Reproducible: Always Steps to Reproduce: 1. Start Mozilla and visit URLs. 2. Crash Windows (e.g. Hardware-Reset) 3. Reboot Windows, let scandisk repair the allocation error of history.dat. 4. Start Mozilla 5. Open History Actual Results: History Lost. Expected Results: History preserved, only some recent entries may be lost. Win98 does never update the file size in the directory entry when the file is written in sequential manner without intermediate seeks or reads. After reboot, scandisk repairs the difference between allocation chain length and file size by setting the file size to end of the last cluster. This adds garbage to the end of history.dat. Mozilla clear the corrupted file. A call to FlushFileBuffers() to update the file size would solve this problem. For an example see function update_EOF() in dbm/src/hash.c (http://lxr.mozilla.org/seamonkey/source/dbm/src/hash.c#617) For a method to recover the history see Bug 113378, comment 10.
This may work or not: morkConfig.h: #if defined(MORK_WIN) void mork_fileflush(FILE * file); #define MORK_FILEFLUSH(file) mork_fileflush(file) #else #define MORK_FILEFLUSH(file) fflush(file) #endif /*MORK_WIN*/ morkConfig.cpp: #if defined(MORK_WIN) #include <whatever/needed.h> void mork_fileflush(FILE * file) { fflush(file); int fd = fileno(file); HANDLE fh = (HANDLE)_get_osfhandle(fd); FlushFileBuffers(fh); } #endif /*MORK_WIN*/ (Sorry, but I don't have Moz.Development installed. Perhaps someone can test this ;-)
*** This bug has been marked as a duplicate of 63292 ***
Status: UNCONFIRMED → RESOLVED
Closed: 21 years ago
Resolution: --- → DUPLICATE
Summary: Loosing history on every Win98 crash due to missing directory flush → Losing history on every Win98 crash due to missing directory flush
Re comment 2: Sorry for possible spam, but this is IMO not a dup, because Bug 63292 does not address this Win9x-only issue (no problem on W2K/XP or other OS). See also Bug 63292, comment 75.
The program writes 10 Blocks of 1000000 Bytes and displays the file size reported by the directory functions (stat()/FindFirstFile()) after each Block. After Block 3 and 7, fsync()/FlushFileBuffers() is called. For the remaining blocks, the programs waits up to 20 seconds for size match. Results: Win98SE: Size in dir is only updated by FlushFileBuffers() and CloseHandle(). WinXP-FAT: Slow (about 130Kb/sec) and unrelieable update of size in dir, FlushFileBuffers() recommended. WinXP-NTFS: Fast update of size in dir. Linux: No problem ;-)
*** Bug 223781 has been marked as a duplicate of this bug. ***
reopen b/c duplicate
Status: RESOLVED → UNCONFIRMED
Resolution: DUPLICATE → ---
and confirming b/c duplicate
Status: UNCONFIRMED → NEW
Ever confirmed: true
Re comment 7: Thanks for confirming. Same problem and same fix applies to panacea.dat and Mail/*.msf files blocking1.4.2? and 1.6b?
Flags: blocking1.6b?
Flags: blocking1.4.2?
Blocks: 19454
No longer blocks: 19454
Who knows mork and can help us here? bienvenu, jst, can you look at this patch?
sure, I can try this - the patch looks OK.
I worry a little that it might slow us down on windows, but I'll try to see.
Attached patch cvs diff (obsolete) (deleted) — Splinter Review
this fix will work - however, it will cause excessive disk flushing on windows, aka grinding, because we commit quite frequently. Also, do we even support win98 anymore?
The following change prevents excessive disk flushing on WinNT/2K/XP: void mork_fileflush(FILE * file) { fflush(file); + if (!(GetVersion() & 0x80000000)) + return; // WinNT/2K/XP + // Win9x/ME int fd = fileno(file); HANDLE fh = (HANDLE)_get_osfhandle(fd); FlushFileBuffers(fh); }
thx very much, I'll try that. Personally, I don't think we need the flushing on win NT because NT is much better about cleaning itself up when apps crash - blue screen of death is very rare, at least compared to win98...
msdn about GetVersion: "This function has been superseded by GetVersionEx, which is the preferred method for obtaining system version number information. New applications should use GetVersionEx. "
That's correct, here a version using GetVersionEx: + OSVERSIONINFOA vi = { sizeof(OSVERSIONINFOA) }; + if (!(GetVersionExA(&vi) && vi.dwPlatformId == VER_PLATFORM_WIN32_WINDOWS)) + return; // WinNT/2K/XP/unknown + // Win9x/ME (BTW: MSVC 6.0 CRT itself uses GetVersion(), so it is very unlikely to disappear from KERNEL32.DLL ;-) Re comment 14: Yes, the problem with missing updates of file size on open files only occurs with Win9x (aka DOS) file system. I only found 2 reliable methods to force size update: call FlushFileBuffers() or close/reopen the file. The latter is much faster but impractical here.
why is closing/reopening much faster? Shouldn't that also flush all the file buffers, i.e., do more work?
FlushFileBuffers() would not return until file/dir/FAT buffers are flushed to disk but close/reopen() leaves physical flushing to standard write behind routine.
So what is the state of the patch we have here. Is there going to be observable performance impact? Will this be confined to win98? Is this patch low-risk enough to taked this late in 1.6? Should we hold off until we open for 1.7.
there will be a noticeable performance impact on win98 (and win ME?). I think it won't be too noticeable with normal operations (downloading mail, reading mail, etc) but some operations, like downloading imap or news for offline use might cause quite a bit of disk rattling, due to over-committing of the db. I can go either way w.r.t. checking into 1.6 vs 1.7. My feeling is that what's going to happen is that once this gets checked in, some hot spots are going to be found on win98 that have to be fixed by reducing commits. Is that risk worth avoiding the history.dat corruption when win98 crashes? Probably...I'll attach the latest version of the patch.
Attached patch latest patch (deleted) — Splinter Review
Attachment #136314 - Attachment is obsolete: true
Who can review this patch? If it doesn't make beta (unlikely) it sounds like we probably wouldn't want to take the risk into final since we wouldn't have the chance to respond to those hotspots.
Comment on attachment 136736 [details] [diff] [review] latest patch I'll review it, and request sr from mscott
Attachment #136736 - Flags: superreview?(mscott)
Attachment #136736 - Flags: review+
Attachment #136736 - Flags: superreview?(mscott) → superreview+
Flags: blocking1.6b?
Flags: blocking1.6b-
Flags: blocking1.6+
Comment on attachment 136736 [details] [diff] [review] latest patch a=asa (on behalf of drivers) for checkin to 1.6.
Attachment #136736 - Flags: approval1.6+
I'll check this in as soon as I can. maybe tonight.
taking.
Assignee: history → bienvenu
fix checked in last night (12/10/2003)
Status: NEW → RESOLVED
Closed: 21 years ago21 years ago
Resolution: --- → FIXED
To reduce HD usage peaks on win9x, I suggest we should use some timer, i.e. call FlushFileBuffers() every n seconds (10?) if a flush is necessary. In the case of a windows crash, some history will be lost, but that'll only be that of the last 10 seconds. That, inmo, is a good tradeoff between a total loss of last session's history, and a (yet to be confirmed) dramatic increase of hd usage on win9x.
Blocks: 231606
Blocks: 231919
Exactly. And why didn't you use the close/reopen apporoach when it is faster? Somebody said it is impractical. But I think it is the same: flush <=> close/reopen in the same place in the code. It is easier than close an reopen only once needed, in other place of code. The OS will flush the closed/reopened file a bit later than the flushfile command, but in case of win98 this looks like only 2secs. I wouldn't worry about that.
This would be nice for 1.4.2 I think.
Flags: blocking1.4.2? → blocking1.4.2+
Great fix. The filesizes reported by a file manager are now the same as the ones indicated in Mozilla process file handles. No disk thrashing so far.
*** Bug 227257 has been marked as a duplicate of this bug. ***
This looks like the fix went into the mork db backend and should therefore work on all files that use this format. For some reason, it doesn't seem to apply to the addressbook *.mab files. E.g. the filesize is reported as 0 bytes, but opening the file shows there already is some data (of course while Moz is running).
Forget it, I tested it over a network share and it looked wrong. Checking it on a local machine yielded the correct results - the filesize updated soon after any changes. Sorry.
I have now found i disk thrashing situation with this: when the history is set to larger than 90 days, the history.dat file has grown to 50MB and it is beeing rewritten after every change (like truncating and filled again), or when closing a navigator window. This thrashes the disk and waits until the file is written, only then can seamonkey be used again. The machine has 384 MB RAM, but this is a hard flush to disk, so any (abundant) cache is not utilized. Can we do something about it?
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: