138117 - Completed downloads are not removed from Cache folder

Reporter

Description

•

23 years ago

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:0.9.9+) Gecko/20020417
BuildID: 2002041703

After completing a download, the cached file is not removed from the cache folder. The cache folder will continue to grow no matter what the size limit is set to in the prefs.

Reproducible: Always

Steps to Reproduce:
1. Clean out the cache folder (so that you can observe the difference in step 3)
2. Go to the provided link and choose Save As... or Open With...
3. When the file is finished, check the cache folder.

Actual Results: The cached file (of approximately 16 megs) is still in the cache folder, even though the download was completed.

Expected Results: The cached file should be removed upon the completion of the download.

This is a major bug since you could conceivably lose a lot of space very quickly (like me downloading nightlies every day for a week, plus Chimera builds). I think that this would best be fixed by removing the cached file if the download is interrupted (via loss of connection, crash of Mozilla, etc.) or completed. If the download is simply paused, you're obviously going to want to keep the file.

This was seen on Mac OS X 10.1.3 and 10.1.4 on a PowerBook G3 500 MHz with a 12 GB hard drive and 512 MB RAM.

Erick Wong

Comment 1

•

23 years ago

I have been experiencing the same problem on Mac OS 9, so it's not specific to Fizilla.

hirata masakazu

Comment 2

•

23 years ago

The archives and encoded files are removed by StuffIt Expander after expanding, I suppose. Check the preferences of StuffIt Expander.

Prachi Gauriar

Reporter

Comment 3

•

23 years ago

This isn't an issue with Stuffit. This is definitely a Mozilla issue. The completed download should be removed from the cache folder. Stuffit isn't responsible for that, Mozilla is. This really needs to be fixed by Moz 1.0. This is a serious problem in efficiency. Mozilla goes over my cache limit daily because I download a lot of files and none of them are removed from the cache folder.

Erick Wong

Comment 4

•

23 years ago

Just to clarify the problem a bit further: When you download a file, two copies are created: one goes to your designated download folder (say, the desktop), and has a normal filename like "archive.sit". A second copy is saved to the Mozilla cache folder under a hashed filename such as "0183A748d01". The problem is that this second copy is completely ignored by the cache manager and never gets deleted, even with the "Clear Disk Cache" button. Obviously, this leads to a rapid and persistent bloating of the cache folder. I wonder if this bug should be filed under "Networking: Cache"?

Prachi Gauriar

Reporter

Comment 5

•

23 years ago

I totally missed that category. I'm moving it. Maybe work will be done on it.

Component: File Handling → Networking: Cache

timeless

Comment 6

•

23 years ago

.

Assignee: law → gordon

QA Contact: sairuh → tever

Prachi Gauriar

Reporter

Comment 7

•

23 years ago

*** This bug has been confirmed by popular vote. ***

Status: UNCONFIRMED → NEW

Ever confirmed: true

ajbu

Comment 8

•

23 years ago

I am also having this problem on Windows XP. Is there a separate bug for Windows on this that I missed, or should this be set to OS: All? Was a real problem for me downloading Linux ISO files, having my boot disk filled up and crashing Windows...

Prachi Gauriar

Reporter

Comment 9

•

23 years ago

I'm going to go ahead and change this to OS->All, Platform->All based on comment 8.

OS: MacOS X → All

Hardware: Macintosh → All

Matthias Versen [:Matti]

Comment 10

•

23 years ago

*** Bug 142791 has been marked as a duplicate of this bug. ***

Andrew Hagen

Comment 11

•

22 years ago

Proposed relnote: Downloaded files are never removed from the disk cache. This can be problematic if downloaded files are very large. Workaround: go to the profile directory and manually delete them.
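Before deleting anything by hand, it can help to see which cache files are actually large. The following is a minimal sketch, not part of Mozilla: the function name `large_cache_files` and the size threshold are illustrative assumptions, and the cache directory path has to be supplied by the user because its location varies by platform and profile. It only lists files and does not delete them.

```python
import os


def large_cache_files(cache_dir, min_bytes):
    """Return (size, path) pairs for files of at least min_bytes, largest first.

    cache_dir is whatever folder the user identifies as the cache folder;
    this helper is hypothetical and knows nothing about the cache format.
    """
    found = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file disappeared or is unreadable; skip it
            if size >= min_bytes:
                found.append((size, path))
    # Sort descending so the biggest leftover downloads appear first.
    return sorted(found, reverse=True)
```

For example, `large_cache_files(path_to_cache, 5 * 1024 * 1024)` would report every file of 5 MiB or more, which is where leftover copies of downloaded archives would be expected to show up.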

Keywords: mozilla1.0.1, nsbeta1, relnote

Steve Harvey

Comment 12

•

22 years ago

Attached patch suppress storage of downloads in user Cache directory (deleted) — Details — Splinter Review

With pre-downloading, a download may momentarily require storage for as many as three copies, depending upon the placement of the user's Cache directory, the temporary directory in use for the downloads, and the user-specified target. Despite its disclaimer of authoritativeness, the glossary accessible to the end-user via the Help menu defines cache as "A collection of web page copies..." . This would seem to obviate putting download data into the user's Cache directory. This patch will suppress storage of download data within the user's Cache directory. Not scrubbing the metadata from the Cache doesn't seem to affect things under Linux, while minimizing the change to the codebase.

Andrew Hagen

Updated

•

22 years ago

Keywords: patch

Tom Everingham

Comment 13

•

22 years ago

*** Bug 155298 has been marked as a duplicate of this bug. ***

gordon

Comment 14

•

22 years ago

Darin, any comments on the attachment? If JavaScript is involved in the download process, its lazy garbage collection may allow cache entry descriptors to stay in use much longer than necessary.

gordon

Updated

•

22 years ago

Priority: -- → P1

Target Milestone: --- → mozilla1.2beta

OstGote!

Comment 15

•

22 years ago

As I understand it this bug is now only relevant to the 1.0 branch. The trunk builds since 1.2 don't have this problem (the removing) anymore.

Andrew Hagen

Updated

•

22 years ago

Keywords: mozilla1.0.1

Version: Trunk → 1.0 Branch

gordon

Comment 16

•

22 years ago

Let's mark this fixed then.

Status: NEW → RESOLVED

Closed: 22 years ago

Resolution: --- → FIXED

benc

Comment 17

•

22 years ago

If this happens on the 1.0 branch, it should stay open, while the version says "1.0 branch". I've cleared the milestone and sent this to Download Manager.

Status: RESOLVED → REOPENED

Component: Networking: Cache → Download Manager

QA Contact: tever → petersen

Resolution: FIXED → ---

Target Milestone: mozilla1.2beta → ---

gordon

Comment 18

•

22 years ago

reassigning to owner of Download component.

Assignee: gordon → blaker

Status: REOPENED → NEW

Samir Gehani

Comment 19

•

22 years ago

Nav triage team: removed nomination. Not relevant to trunk.

Keywords: nsbeta1

OstGote!

Comment 20

•

21 years ago

(In reply to comment #15)
> As I understand it this bug is now only relevant to the 1.0 branch. The trunk
> builds since 1.2 don't have this problem (the removing) anymore.

To make it clear: With trunk builds, downloads are removed when the cache limit is reached, but not directly after the download is completed. So the cache doesn't grow infinitely, but the files are not deleted if the disk cache limit hasn't been reached yet.

I tested it now with Mozilla 1.0.2 and this release shows the same behavior as the trunk builds, so IMO this bug could be resolved.

The deletion does not happen directly after the download as the reporter wanted in comment 0, but this is IMHO a minor issue (the real problem was exceeding the size limit, see dupe). A fix for bug 55307 would solve that. If not, a new bug (for trunk) should be filed to do that.

Myk Melez [:myk] [@mykmelez]

Updated

•

20 years ago

Product: Browser → Seamonkey

John Vandenberg

Comment 21

•

19 years ago

*** Bug 289890 has been marked as a duplicate of this bug. ***

John Vandenberg

Comment 22

•

19 years ago

*** Bug 270519 has been marked as a duplicate of this bug. ***

John Vandenberg

Comment 23

•

19 years ago

I can reproduce this on FF1.5b1 and SeaMonkey 1.0a.

Re: bug 270247 comment 3, using LiveHTTPHeaders confirms that the cache content is not being validated on subsequent requests to the same URL; no HEAD request is made, so afaics (please correct me if I am wrong) this problem does in fact cause all current releases of Mozilla products to ignore the HTTP standard:

section 9.3 GET:
The response to a GET request is cacheable if and only if it meets the requirements for HTTP caching described in section 13.

13.1.4 Explicit User Agent Warnings
...
The user agent SHOULD NOT default to either non-transparent behavior, or behavior that results in abnormally ineffective caching, but MAY be explicitly configured to do so by an explicit action of the user.

John Vandenberg

Comment 24

•

19 years ago

*** Bug 270247 has been marked as a duplicate of this bug. ***

allen.attard

Comment 25

•

19 years ago

Aside from the cache limit issue, this bug means that if a file is downloaded, an updated version of it cannot be downloaded without clearing the entire cache. This makes Mozilla quite a hassle to use when working with files that can be frequently modified. Having to regularly clear the cache to ensure that only current versions of files are downloaded defeats the usefulness of having a cache in the first place.

Darin Fisher

Comment 26

•

19 years ago

Can someone please provide specific steps to reproduce this problem? A link to a file to download, with details about what goes wrong would be appreciated. I personally have never experienced this problem. Firefox appears to follow the HTTP/1.1 specification's cache rules to the letter (with only some minor exceptions).

John Vandenberg

Comment 27

•

19 years ago

The problem is when an old file is updated on the remote server, but it is still in the browser disk cache.

Steps to reproduce:
1. set browser.cache.check_doc_frequency to 3
2. upload a file to a web server, and set the modification time to 1/1/2005
3. fetch the file
4. upload a new version of the file
5. fetch the file again
6. restart the browser
7. fetch the file again

Steps 2 & 4 illustrate a use case, rather than the cause; the problem can also be seen with a static file and LiveHTTPHeaders or a network analyser.

http://ftp.mozilla.org/pub/mozilla.org/mozilla/nightly/latest/mozilla-win32-stub-installer.exe

Expected results: ftp.mozilla.org doesn't appear to provide an expiry time or other age-controlling header, so I would expect the cache to validate its copy.

Actual results: the browser cache returns the stale version. For my download of mozilla-win32-stub-installer.exe in FF1.5b, the cache entry has an expiry date nine days after when it was last fetched.

There are three workarounds:
1. set browser.cache.check_doc_frequency to 1, or
2. hold shift and right click on the link, and select Save Link As..., or
3. the webserver can explicitly require regular revalidation, but this needs to have been set up before the first download.

The problems on the last three duplicates are identical, and would appear to be resolved if `Downloads' are removed from the cache. If this is not the right home for the problem, should I re-open bug 270247 as an enhancement to deal with this scenario?

Darin Fisher

Comment 28

•

19 years ago

The use case is working as designed. HTTP/1.1 says that a document served with a Last-Modified header may be cached for a period of time determined heuristically by the browser. A value of 1/10th the period between now and when the file was last modified is recommended. So, when the server hosts a file like this, it is saying to all browsers that the file will not change again for a long time. Are you suggesting that downloading a file should always bypass the browser cache? I recommend marking this bug as invalid or wontfix.
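The one-tenth heuristic described above can be illustrated with a short sketch. This is not Mozilla's cache code: the function name `heuristic_freshness` is hypothetical, and the fixed divisor of 10 simply follows the fraction mentioned in the comment. It computes how long a response with only a Last-Modified header would be treated as fresh.

```python
from datetime import datetime, timedelta, timezone


def heuristic_freshness(fetched_at, last_modified):
    """Freshness lifetime as one tenth of the file's age at fetch time.

    Both arguments are timezone-aware datetimes. The 1/10 fraction is the
    value quoted in the comment above; real browsers may also cap or
    adjust the result.
    """
    age_at_fetch = fetched_at - last_modified
    if age_at_fetch < timedelta(0):
        # A Last-Modified date in the future gives no usable heuristic.
        return timedelta(0)
    return age_at_fetch / 10


# A file last modified 90 days before it was fetched:
fetched = datetime(2005, 10, 1, tzinfo=timezone.utc)
modified = fetched - timedelta(days=90)
lifetime = heuristic_freshness(fetched, modified)
print(lifetime)  # 9 days, 0:00:00
```

Under this assumption, the older a file is when first fetched, the longer the cache will serve it without revalidating, which is why a rarely updated file can stay stale for days after the server copy changes.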

John Vandenberg

Comment 29

•

19 years ago

> Are you suggesting that downloading a file should always
> bypass the browser cache?

No. I am suggesting that the cache should verify its contents in certain circumstances where the original headers did not provide explicit expiration times.

In the scenario when the user has specifically requested a file, and the browser has classified it as a `download', the browser should not break semantic transparency; it should request a HEAD to be sure what it is giving the user is actually a download of the file they requested. The file could have been removed, the server could be down, etc. Once the cache contents are validated, the download should proceed using the cache.

> I recommend marking this bug as invalid or wontfix.

The problem in the original description appears to have been resolved. IMO, a toggle to disable the cache for files saved elsewhere to disk would be a useful improvement, for the same reason that bug 81640 is a P1.

Aaron Slunt

Comment 30

•

19 years ago

Are we forgetting about corrupted downloads here? It does happen, and saving a corrupted download out of the cache is a very dumb thing.

Darin Fisher

Comment 31

•

19 years ago

I can see why it might be nice to force an end-to-end cache validation when downloading an item, but I'm not sure that I would implement that for every link click. Most downloads start from a link click that results in a file that the browser cannot render. In those cases, it would be bad to restart the download because it is hard to know if a download can be restarted without side-effects. So, if we only validate explicit downloads (file->save as), then we are not being consistent.

> In the scenario when the user has specifically requested a file, and the
> browser has classified it as a `download', the browser should not break
> semantic transparency...

We're not breaking semantic transparency -- at least not according to RFC 2616.

John Vandenberg

Comment 32

•

19 years ago

> We're not breaking semantic transparency

The user agent is not fetching the file that is on the server, and it's not informing the user that it is not doing what they requested. Whenever a browser doesn't perform exactly like wget (barring bugs of course), it's breaking semantic transparency; which is OK, but it should only do this with good cause (i.e. the user has specifically requested this, e.g. via a user pref), and it should keep the user in the loop.

Irrespective of HTTP caching issues, it is reasonable that files that are listed in the Download Manager don't need to be also retained in the disk cache. This bug only relates to items in the cache that are also in the Download Manager.

A use case that is more pertinent to this bug:
1. Set a cache size of 1000 KiB, and clear the cache
2. Load this page in one tab: http://outreach.jach.hawaii.edu/pressroom/2003-estar/
3. Do a little browsing of images.google.com (small images only) in another tab until the disk cache is full
4. View the contents of the disk cache: about:cache?device=disk
5. Go back to the first tab, and right-click on estardiagram-large.png (full size PNG 230kB), and save it to disk
6. Refresh about:cache?device=disk.
7. Do a little browsing of images.google.com (small images only), and keep an eye on the cache contents.

Expected results: No change to the existing items in the cache at step 5.

Actual results: 25% of the disk cache has been replaced with an image that has been saved outside of the cache. In step 7, the 230 kB image stays in my cache for quite a while, pushing out lots of useful little files.
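The effect in the use case above can be modelled with a toy size-bounded cache. This is only a sketch under loud assumptions: the class `LruDiskCache` is hypothetical, it uses plain least-recently-used eviction (which may not match the real disk cache's policy), and the 10 KiB size for each small image is made up for illustration. It shows how storing one 230 KiB download in a full 1000 KiB cache displaces many small entries.

```python
from collections import OrderedDict


class LruDiskCache:
    """Toy size-bounded cache with least-recently-used eviction (assumed policy)."""

    def __init__(self, capacity_kib):
        self.capacity_kib = capacity_kib
        self.used_kib = 0
        self.entries = OrderedDict()  # key -> size in KiB, oldest first

    def store(self, key, size_kib):
        """Store an entry and return the list of keys evicted to make room."""
        evicted = []
        if key in self.entries:
            # Replacing an existing entry frees its old space first.
            self.used_kib -= self.entries.pop(key)
        while self.entries and self.used_kib + size_kib > self.capacity_kib:
            old_key, old_size = self.entries.popitem(last=False)
            self.used_kib -= old_size
            evicted.append(old_key)
        if size_kib <= self.capacity_kib:
            self.entries[key] = size_kib
            self.used_kib += size_kib
        return evicted


cache = LruDiskCache(1000)
for i in range(100):
    cache.store(f"thumb-{i}", 10)  # fill the cache with small images
evicted = cache.store("estardiagram-large.png", 230)
print(len(evicted))  # 23
```

In this model, 23 small entries are dropped to hold a file that the user has already saved elsewhere, which is the waste the use case is pointing at.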

Christian :Biesinger (don't email me, ping me on IRC)

Comment 33

•

19 years ago

(In reply to comment #29)
> it should request a HEAD to be sure what it is giving the user is
> actually a download of the file they requested.

(surely you mean GET with If-Modified-Since, or with If-None-Match)

> it is reasonable that files that are listed
> in the Download Manager dont need to be also retained in the disk cache.

What if I use File|Save Page As? Surely that shouldn't remove files from the cache. So that statement in its generality is not useful.
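For readers unfamiliar with conditional requests, here is a minimal sketch of the validation being discussed. It is not Mozilla code: the helper names `conditional_headers` and `cached_copy_is_current` are hypothetical, and the validators are assumed to be the Last-Modified and ETag values saved from the earlier response. The client sends a GET carrying those validators, and a 304 Not Modified status means the cached copy can be used as-is.

```python
def conditional_headers(last_modified=None, etag=None):
    """Build request headers for revalidating a cached response.

    last_modified and etag are the values stored from the original
    response's Last-Modified and ETag headers, if the server sent them.
    """
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def cached_copy_is_current(status_code):
    """A 304 response tells the client its cached copy is still valid."""
    return status_code == 304


headers = conditional_headers(
    last_modified="Sat, 01 Jan 2005 00:00:00 GMT",
    etag='"abc123"',  # made-up validator for illustration
)
print(headers)
```

These headers could be passed to any HTTP client, for instance as the `headers` argument of `urllib.request.Request`. Any status other than 304 means the body in the response should replace the cached copy.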

John Vandenberg

Comment 34

•

19 years ago

> (surely you mean GET with If-Modified-Since, or with If-None-Match)

Yes; HEAD was illustrative.

> What if I use File|Save Page As?

I did not realise that operation populated the Download Manager. This behaviour seems broken to me (btw bug 143949 intends to fix this for images), and feels like it is an artifact from the days when View Source and Save Page As really did download fresh copies.

Are there scenarios when a Save Page As won't be coming out of the cache?

Reed Loden [:reed]

Updated

•

18 years ago

Assignee: bross2 → download-manager

QA Contact: chrispetersen

AMPONSAH

Comment 35

•

16 years ago

GOOD WORK

Robert Kaiser

Updated

•

15 years ago

Assignee: download → nobody

Component: Download & File Handling → Download Manager

Product: SeaMonkey → Toolkit

QA Contact: download.manager

Version: 1.0 Branch → unspecified

Sylvestre Ledru [:Sylvestre]

Comment 37

•

6 years ago

Moving to p3 because no activity for at least 24 weeks.

Priority: P1 → P3

Marco Bonardo [:mak]

Updated

•

5 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1472482

BugBot [:suhaib / :marco/ :calixte]

Comment 38

•

2 years ago

In the process of migrating remaining bugs to the new severity system, the severity for this bug cannot be automatically determined. Please retriage this bug using the new severity system.

Severity: critical → --