Closed Bug 569709 Opened 15 years ago Closed 15 years ago

Figure out the max nr of entries we should store in the disk cache.

Categories

(Core :: Networking: Cache, defect)


Tracking


RESOLVED FIXED
mozilla1.9.3a5

People

(Reporter: jst, Assigned: ehsan.akhgari)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

This bug is a followup to bug 175600, where the number of disk cache entries was increased from 8192 to 16384. While that raised the limit, we don't really have any documented measurement of what this number really should be; we know 8192 wasn't enough, but is 16384 enough?
Removing blocker flag; this was never meant to be an alpha 5 blocker.
blocking2.0: alpha5+ → ---
The existence of that flag dates back to bug 110163. Here's what happens inside the cache. Initially, we start with kMinRecordCount records (which is 512), and each time we need more records, we multiply the number of existing ones by two, until we reach kMaxRecordCount, which is now 16384. For each record, we keep 16 bytes for the lifetime of the application. So, currently, with large caches, we might end up using 256KB of memory. The same amount is also stored on disk (plus some overhead).

So, to make a long story short, the correct value for this number should be determined by the amount of memory that we're willing to reserve for cache usage on profiles with very large caches. The really bad thing here is that this memory is allocated right after the browser is started, and will remain allocated (if not grown) throughout the lifetime of the browser.

Something else we need to consider here is the default size of our cache. It's 51200KB right now, which means that if cache entries average about 3KB (3.125KB, to be exact), we will be storing exactly 16384 items. In real life, the average size of cache entries is much larger, which means that we never hit the limit by default.

What I say from this point on is entirely my own opinion. On today's computers, we can generously reserve, let's say, 256MB of disk cache space (we can be smarter, for example by detecting the amount of free disk space, etc., but bear with me). Let's assume an average cache entry size of 3KB (which is stupid). We'd need 5 times more records (81920 to be exact), which would mean we'd be paying about 1.3MB of memory for storing the cache map if it's in fact that large.

The stupidest possible solution is to just pick a magic number again. Clearly, a good value for this depends on the amount of available disk cache space (as configured by the user, *not* as what we ship by default). It's easy to check the value of the pref at runtime (which we already do elsewhere in the cache code) and, let's say, divide it by 5K and use the resulting number as the maximum size. And I don't think that we need to cap this value. For example, if someone decides to set the disk cache capacity to 50GB, they'd be paying 10MB of memory if their cache ever gets to 25GB or higher, which looks acceptable to me! :-)

Comments and suggestions welcome.
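As a rough sketch of the arithmetic above (the constants and names here are hypothetical, not taken from the actual cache code), here is how a record limit could be derived from the capacity pref, along with the resident cost of the record map:

#include <cstdint>
#include <cstdio>

int main() {
    const uint32_t kBytesPerRecord    = 16;         // kept in memory per cache entry
    const uint32_t kAssumedAvgEntryKB = 5;          // the "divide by 5K" heuristic above
    const uint32_t capacityKB         = 256 * 1024; // e.g. a 256MB disk cache

    // Proposed maximum record count: capacity divided by the assumed entry size.
    uint32_t maxRecords = capacityKB / kAssumedAvgEntryKB;
    // Memory the record map would occupy for the lifetime of the browser.
    uint32_t mapKB = maxRecords * kBytesPerRecord / 1024;

    std::printf("max records: %u, record map: ~%u KB resident\n", maxRecords, mapKB);
    return 0;
}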
As a personal comment: using the CacheViewer extension with Firefox 3.6.4 (hard-coded to 8192 entries) and with my cache size increased to 512MB, I have noticed an average usage of around ~80MB. The actual distribution of this cache is far more interesting - a third of the entries are smaller than 1100 bytes (web bugs, empty favicons/spacers, tiny icon-size images, Facebook traffic & near-empty Google Maps tiles), the next third are between 1100 bytes and 8170 bytes (more complicated but vector-only Google Maps tiles, thumbnail images, forum avatars, CSS, JavaScript, favicons), with the final third quickly going up to megabytes (satellite Google Maps tiles, almost all HTML, actual decent-sized images, a few Flash videos & their containers). Nearly 70% of my cache is for items under 8KB, which is interesting given the mean item size is ~10KB. Caching of actual bandwidth-intensive content is under a third of the total cache contents. But as usual, this is just one user's results.

With my location (Australia) being ~250ms from most servers and with a broadband connection, the main value of the cache to me is the latency decrease for these dozens of small items, not the data transfer decrease (though that is very welcome, since usage caps are standard here). This kind of scenario should be kept in mind when fiddling with the cache item count.

Making the limit variable based upon the max cache size does sound best. However, I don't think there's any way to reasonably guess a proper value for the divider without some more data on the distribution of cached item sizes over a few thousand users. Sounds like something for Test Pilot. This could also create some interesting historical data to compare a few years down the track, to see whether the average item size on the Internet for various MIME types has increased, decreased or remained constant as the web evolves.
Another personal comment... It seems to me that we are making this more complicated than it should be. Yes, there are many technical reasons that this, that, or the other happens, but at the end of the day, the simpler something is, the better it works and the easier it is to maintain. Right now, we are limiting the user / software by two things: the disk cache size setting, and the number of objects in the cache. Why do we need both? Why would we ever want to put a limit on the number of objects in the cache? By default, the size of a user's cache will self-limit the number of objects in it. So, the question really is: do we want a user's experience to be limited by a hard-coded number or by their own particular hardware? Maybe I am off-base here, but I am trying to wade through the technical aspects to make the question simple. Simple is always better.
Because we're storing 16 bytes per cache entry in memory at all times.
Again, so what? Per your comment #2, if a user was actually using 25GB of cache, the memory used would be about 10MB. If a user has 25GB to dedicate to web browser cache, that user is going to have 2GB or more of main memory in 99.99% of the cases. So, in the end, why are we trying to make this equation have 2 variables when it really only needs one? Forest... trees....
We _could_ set the upper bound of kMaxRecordCount to a very large value as well (like 2^20). That way, if a user has that many cache entries, they'd need 16MB of RAM for storing the record array in memory. But we can't make it limitless, because after a while the memory requirements would just exceed the amount of address space available on 32-bit architectures.
Exactly! Set it up to where _effectively_ it is limitless, keeping in mind the limitations of _hardware_. Simplicity with a dash of common sense is almost always a good solution. Nice work!
Note to self: if we decide to do something which needs the amount of free space available, use nsILocalFile::GetDiskSpaceAvailable.
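For reference, a minimal sketch of what that might look like (assuming the existing Mozilla headers; the 10% policy and the CandidateCacheCapacityKB name are purely illustrative):

#include "nsILocalFile.h"

// Sketch only: derive a candidate cache capacity (in KB) from the free space
// on the volume holding the cache directory. The 10% fraction is hypothetical.
static PRInt64 CandidateCacheCapacityKB(nsILocalFile *cacheDir)
{
    PRInt64 bytesAvailable = 0;
    nsresult rv = cacheDir->GetDiskSpaceAvailable(&bytesAvailable);
    if (NS_FAILED(rv))
        return 0;
    return (bytesAvailable / 1024) / 10;
}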
Two days ago I built Firefox from source with the new 16384 cache item limit. I also set the disk cache size to 512000 KiB, and I had nearly 51200 KiB in the disk cache at the time. Today, my disk cache stats are:
Number of entries: 11143
Maximum storage size: 512000 KiB
Storage in use: 181086 KiB
The 16384 limit seems adequate to store several days' worth of browsing. Perhaps the limit could be redoubled for good measure, because some use cases (such as browsing in Google Maps) can cause lots of small files to be downloaded. Today, though, I have noticed that the disk cache is barely growing in size, even though I am going to some sites I haven't visited in a while and there is new content on the pages I have revisited. It could be that expired items are being removed from the cache, or maybe there's a bug that is causing items not to be cached properly when the disk cache grows large.
Blocks: http_cache
I almost forgot to comment here! So, I wrote an extension to get stats on the amount of cache used by web pages (the source is available here: <http://hg.mozilla.org/users/ehsan.akhgari_gmail.com/extensions/file/tip/cachesizetest>), and I ran it on the top 1000 Alexa pages. Here are some statistics:
The least amount of cache required by a single web page: 1KB.
The max amount of cache required by a single web page: 13716KB.
The average amount of cache required by a single web page: 697KB.
So, given the ~50MB cache size we have right now, we can only store ~73 full web pages in the cache before we exhaust its capacity. Maybe we can use these stats to reach a conclusion about the new default limit for the cache?
Oh, and the average size of cache entries was about 605 bytes. (much less than 1KB, but that's the smallest block that our cache gives us.)
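A quick sanity check of those figures (assuming ~697KB of cache used per page, as measured above):

#include <cstdio>
#include <initializer_list>

int main() {
    const unsigned kAvgPageKB = 697;            // from the Alexa top-1000 run above
    for (unsigned cacheKB : {51200u, 256000u})  // 50MB default, 250MB proposal
        std::printf("%u KB cache -> ~%u full pages\n", cacheKB, cacheKB / kAvgPageKB);
    return 0;
}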
Attached file Raw stats (deleted) —
250MB =~ 350 web pages = win?
(In reply to comment #14) > 250MB =~ 350 web pages = win? It definitely is a step towards winning. Let me create a patch...
I suppose that data can help with determining the new *default size* of the disk cache, but I thought this bug was about the new *maximum number* of entries in the cache. After several more days, the number of items in my cache is still less than 12000, far less than the limit of 16384, so it looks like there's some sort of bug limiting the number of entries. In my cache, I can confirm that many files are less than 512 bytes (although I'm not sure the average is as low as 605 bytes). Perhaps it could make sense to make 512 bytes the smallest block allocated to save some disk space.
Attached patch Patch (v1) (deleted) — Splinter Review
This patch increases the default cache size to 250MB, and adjusts the max record count based on the size of the cache, with a cap of 32MB of memory used to store the entry map, which should give us up to 2097152 items in the cache.
Assignee: nobody → ehsan
Status: NEW → ASSIGNED
Attachment #449985 - Flags: review?
Attachment #449985 - Flags: review? → review?(jduell.mcbugs)
(In reply to comment #16) > In my cache, I can confirm that many files are less than 512 bytes (although > I'm not sure the average is as low as 605 bytes). Perhaps it could make sense > to make 512 bytes the smallest block allocated to save some disk space. That would be an entirely different bug. Can you please file that?
Comment on attachment 449985 [details] [diff] [review]
Patch (v1)

>-pref("browser.cache.disk.capacity", 51200);
>+pref("browser.cache.disk.capacity", 256000); // 250MB

I'd love to make the default cache 5 times bigger. Or better yet, use an algorithm to pick the size based on free disk space, as both IE and Chrome do (see bug 559942). But apparently even doubling the cache size right now causes a perf regression on win32 (see bug 193911 comment 35), so that needs to be fixed before we can land a change like this. The rest of this patch looks good, though, and should be able to land if we leave out the cache size increase.

Ehsan: if you have time to look into what's causing the win32 perf regression you'll be my hero for the month :)

Just in case you do wind up submitting a patch that changes the default cache size, it's worth also changing the #define of DISK_CACHE_CAPACITY in nsCacheService.cpp to match the new value (as the patch for bug 193911 does).

>+void
>+nsDiskCacheMap::NotifyCapacityChange(PRUint32 capacity)
>+{
>+ // Heuristic 1. average cache entry size is probably around 1KB
>+ // Heuristic 2. we don't want more than 32MB reserved to store the record
>+ //              map in memory.
>+ const PRInt32 RECORD_COUNT_LIMIT = 32 * 1024 * 1024 / sizeof(nsDiskCacheRecord);
>+ PRInt32 maxRecordCount = PR_MIN(PRInt32(capacity / 1), RECORD_COUNT_LIMIT);

Why the divide by 1? Am I unschooled at some C++ hack? Why not just cast the value of the constant expr to PRInt32, so you can have just one variable?

+r with the cache size increase removed.
Attachment #449985 - Flags: superreview?(jst)
Attachment #449985 - Flags: review?(jduell.mcbugs)
Attachment #449985 - Flags: review+
(In reply to comment #19)
> (From update of attachment 449985 [details] [diff] [review])
> >-pref("browser.cache.disk.capacity", 51200);
> >+pref("browser.cache.disk.capacity", 256000); // 250MB
>
> I'd love to make the default cache 5 times bigger. Or better yet, use an
> algorithm to pick the size based on free disk space, as both IE and Chrome do
> (see bug 559942).

We were thinking about doing this, but it's not clear to me what the details should look like. For example, what percentage of the free disk space should we choose? How should we handle changes in the free disk space? How should we adapt to low disk space conditions?

> But apparently even doubling the cache size right now causes
> a perf regression on win32 (see bug 193911 comment 35), so that needs to be
> fixed before we can land a change like this.

I'd totally expect to see a memory usage increase with this patch, if nothing else, because of the increase in the max record count. But I'm not sure why it would cause performance regressions...

> The rest of this patch looks
> good, though, and should be able to land if we leave out the cache size
> increase.
>
> Ehsan: if you have time to look into what's causing the win32 perf regression
> you'll be my hero for the month :)

Well, I may give it a shot, but frankly I'm already overloaded, so this might not happen soon. Anyway, you got me curious, so I pushed a try server run for a performance comparison.

> Just in case you do wind up submitting a patch that changes the default cache
> size, it's worth also changing the #define of DISK_CACHE_CAPACITY in
> nsCacheService.cpp to match the new value (as the patch for bug 193911 does).

Good point.

> >+void
> >+nsDiskCacheMap::NotifyCapacityChange(PRUint32 capacity)
> >+{
> >+ // Heuristic 1. average cache entry size is probably around 1KB
> >+ // Heuristic 2. we don't want more than 32MB reserved to store the record
> >+ //              map in memory.
> >+ const PRInt32 RECORD_COUNT_LIMIT = 32 * 1024 * 1024 / sizeof(nsDiskCacheRecord);
> >+ PRInt32 maxRecordCount = PR_MIN(PRInt32(capacity / 1), RECORD_COUNT_LIMIT);
>
> Why the divide by 1? Am I unschooled at some C++ hack? Why not just cast the
> value of the constant expr to PRInt32, so you can have just one variable?

I was just trying to provide an easy way for the next person who looks at this to determine what to change if they want to change the average cache item size heuristic (1KB), but obviously I have failed! ;-)

BTW, does this really require sr?
Comment on attachment 449985 [details] [diff] [review] Patch (v1) We'd probably be fine following the algorithm that chrome uses to pick the cache size (their code is linked off bug 559942). I follow the intention of RECORD_COUNT_LIMIT, and that's fine. I just don't get why you'd want PRInt32(capacity/1) instead of just PRInt32(capacity). You're right--I suppose no sr is needed for this.
Attachment #449985 - Flags: superreview?(jst)
Here are the perf results that I got from the try build: http://grab.by/4QFE (3e5095185cea is *with* this patch). In short:
Tp4 (private bytes): 93.4MB -> 88.6MB (improvement)
Tp4 (CPU): 279.13 -> 319.51 (14% regression)
Tp4 (RSS): for some reason, both numbers are 0
On Linux, I see a few megabytes of increase in Tp4 (private bytes). I'm not sure how reliable these numbers are, though. All Ts measurements show a lot of improvement with this patch (!), and I also see a Tshutdown improvement across the board. Do you have any tips on how I can get started on tracking down the Tp4 regression? I'm not really sure where to start, so any pointers are appreciated.
(In reply to comment #21)
> (From update of attachment 449985 [details] [diff] [review])
> We'd probably be fine following the algorithm that chrome uses to pick the
> cache size (their code is linked off bug 559942).

Do you want to file a bug on that? I may take it, provided that we figure out the reason behind the Tp4 regression.

> I follow the intention of RECORD_COUNT_LIMIT, and that's fine. I just don't
> get why you'd want PRInt32(capacity/1) instead of just PRInt32(capacity).

I was merely trying to point out that if you want to change the average item size heuristic from 1KB to 2KB, for example, you should just change that 1 to 2. :-) But I can remove it if it's confusing.
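For illustration, here is one way the heuristic could be spelled out instead of hiding it in the literal; this is only a sketch against the cache's existing types (nsDiskCacheRecord, PR_MIN), not the code that landed:

// Sketch of an alternative to "capacity / 1" that names the heuristic.
static PRInt32 ComputeMaxRecordCount(PRUint32 capacityKB)
{
    const PRUint32 kAssumedAvgEntryKB = 1;  // heuristic: ~1KB average cache entry
    const PRInt32 RECORD_COUNT_LIMIT =
        32 * 1024 * 1024 / sizeof(nsDiskCacheRecord);  // cap the in-memory map at 32MB
    return PR_MIN(PRInt32(capacityKB / kAssumedAvgEntryKB), RECORD_COUNT_LIMIT);
}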
(In reply to comment #18) > > That would be an entirely different bug. Can you please file that? I filed bug 571091 on the issue
So, Johnathan and I have a theory about the Tp4 perf regression. We might just be doing more I/O because we have a larger cache. This boils down to:
1. Reading from the cache: do we have a bug to make reads from the cache async?
2. Writing the data to the cache: bug 513074, which is fixed, so this should not be an issue.
3. Writing metadata to the cache: this is bug 549767.
I pushed two try server runs, one with my patch, and the other with my patch + the patch from bug 549767. We'll see what happens. This looks like a good theory to me, because the Talos web server is local, and actually reading things from disk might be slower than "requesting" them from the "network" in that case. Since in that sense the Talos environment is pretty unusual, we should ask ourselves whether we care about the regression, provided that we can prove this explains it.
Interesting theory--I hope it's true. We have a patch for async cache reads in bug 513008. Maybe push that to the tryserver, too, and see what happens.
P.S. The bug to base the cache size on free disk space in a Chrome/IE-like way is bug 559942.
It seems unlikely to me that reading from disk would ever be slower than "requesting them from network". If we can prove that your explanation is valid, the more appropriate question would then be how to resolve the regression, IMHO.
It's quite possible that a localhost TCP read of data from a web server on the same machine is faster than doing local disk access. No real data goes on the wire for localhost (it's all in the kernel), and on an SMP, there may not even be a context switch. Most importantly, there's no waiting for the disk drive to seek.
Ok, I'll buy that, but in what percentage of use cases is the scenario you describe true? Are we really not going to address a non-trivial regression based on the use case you describe?
Wait, I think I may have misunderstood... OOPS
Local network traffic on a machine typically requires the application to pay the cost of a system call, plus the kernel copying the requested data into the application's I/O buffer. A disk read incurs the same costs, in addition to the cost of the actual disk access. And my theory has not been confirmed yet. If it is, I think that the patches for the async read/write bugs cited earlier should resolve the regression. If my theory is wrong, we should investigate more.
I gave this a shot on the try server once again, and this time I got pretty similar numbers from builds with this patch and those without them. Then I compared the individual numbers on mozilla-central, and I saw that those numbers seem to have no correspondence to actual regression as reported by our analysis scripts on dev.tree-management. So, I discussed things with Johnathan, and he also agreed that we should land this to see if it causes those scripts to yell at us. http://hg.mozilla.org/mozilla-central/rev/cc8369aa7bd3
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a5
Version: unspecified → Trunk
Any cache size larger than 64 MiB will overflow and break the cache completely (until it's deleted, after which it will overflow again). I don't know if that's the reason for the speed regression, but if you make this switch, it *will* happen for all users after a few days of browsing... To confirm, just check about:cache after the test run. If it's showing zero entries in the disk cache, Tp4 was enough to b0rk it. If it's not showing zero entries yet, then it's not broken yet.
(In reply to comment #34) > Any cache size larger than 64 MiB will overflow and break cache completely > (until it's deleted and overflows again). How come?
(In reply to comment #35)
> How come?

Um... I have to admit that I assumed it to be part of "general knowledge" but never even searched for a bug describing it. Or filed one.

All I can say is that I tried to make my own cache bigger and the cache quickly failed for me every time. A quick question on IRC suggested that 64 MiB is the limit under which it doesn't fail.

A "failed" cache has a size bigger than the maximum allowed and zero cache entries. No further entries are added because it's full, and no entries are deleted because there's nothing to delete. I can reproduce such a cache by just 1) setting my cache size to, say, 100 MB and 2) downloading a file larger than 64 MiB. A "failed" cache can't be recovered by any means other than deleting it fully.
So, this landed, and it moved the needle on quite a few tests.

9 tests observed an improvement:
* Linux
  * Tp4 shutdown (x86)
  * Tp4 (x86 and x86-64)
* Mac
  * Tp4 (10.5 and 10.6)
* Windows
  * Tp4 shutdown (XP and 7)
  * Tp4 (XP and 7)

4 tests observed a regression:
* Linux
  * Tp4 private bytes (x86-64)
* Windows
  * Tp4 Memset (Win7)
  * Tp4 (%CPU) (Win7)
  * Tp4 private bytes (Win7)

I guess I should be happy about the improvements, except that I'm not! My patch should not have that kind of an improvement on anything here. I guess I should be sad about the regressions, but I'm not! The memory regressions could be real, because for large caches we could be storing a larger map in memory, and therefore consuming more memory than we used to, but that's for a good cause. The Tp4 %CPU regression I really don't understand: on the graph, 6 machines seem to have spiked, while the rest remain pretty consistent with older results. I'll hold off on backing anything out, but I *really* don't think that is necessary, given the current state of things. I'll link dev.tree-management to this comment here.
On comment #36, I've had the same problem. Switch to a dial-up connection--which I was using regularly until just a few weeks ago--and you'll find very quickly that the cache isn't working when you try to increase its size (IIRC, I tried sizes like 256/512 MB and 1/2/3 GB; I never tried a size under 64 MB). Loss of caching is *very* noticeable. After trying to increase it several times over the last few years, I ended up just having to live with the ridiculously small 50 MB cache. From experience, that and Bug 105843 can make Firefox a nightmare for dial-up users if you want to browse bandwidth-intensive sites (like Facebook). If you're running on broadband though, these bugs are barely noticeable.
Comment #36 seems to hint at a real bug: caching something larger than 64MB apparently breaks the cache. Can someone who is able to reproduce this file a bug for it, including a description of how to reproduce it?
(In reply to comment #35)
> (In reply to comment #34)
> > Any cache size larger than 64 MiB will overflow and break cache completely
> > (until it's deleted and overflows again).
>
> How come?

(In reply to comment #36)
> (In reply to comment #35)
> > How come?
>
> Um... I have to admit that I assumed it to be part of "general knowledge" but
> never even searched for a bug describing it. Or filed one.
>
> All I can say is that I tried to have my own cache bigger and cache quickly
> failed for me very time. A quick question on irc suggested that 64 MiB is the
> limit under which it doesn't fail.
>
> A "failed" cache has size bigger than the maximum allowed and zero cache
> entries. No further entries are added because it's full, and no entries are
> deleted because there's nothing to delete.
> I can reproduce such cache by just 1) setting my cache size to, say, 100 MB and
> 2) downloading a file larger than 64 MiB.
> "Failed" cache can't be recovered by other means than deleting it fully.

The bug 443067 overflow issue has been fixed. There are several other bugs around this disk cache issue (bug 218391, bug 230125), which I suppose might be dupes of the first.
While the 64MB limit may have been an issue in the past, don't forget that the problem was largely caused by the limit on the number of objects in the cache: 8,192. That limit was doubled, so do we really have a limiting issue now at 64MB? Since this bug landed, I've not seen any problems. Currently:
Number of entries: 6377
Maximum storage size: 256000 KiB
Storage in use: 108105 KiB
(In reply to comment #37)
> I guess I should be happy about the improvements, except that I'm not! My
> patch should not have that kind of an improvement on anything here.

Why is the Tp improvement not expected? Doesn't that test reload common elements that might be kept in cache?

> I guess I should be sad about the regressions, but I'm not! The memory

What's the size of the regression? Adding memory usage here seems acceptable and understood to me, but I'd like to know if we're talking about a huge bump up or not.
(In reply to comment #38) > Switch to a dial-up connection--which I was using regularly until just a few > weeks ago--and you'll find very quickly that the cache isn't working when you > try to increase its size (IIRC, I tried sizes like 256/512 MB and 1/2/3 GB; I > never tried a size under 64 MB). Loss of caching is *very* noticeable. After > trying to increase it several times over the last few years, I ended up just > having to live with the ridiculously small 50 MB cache. The first part of the problem you're mentioning should be fixed by this patch. Bug 105843 is a very real and annoying issue (it annoyed me once so much that I wrote a patch for it, but it's not quite ready for production yet), but I'm not sure when that bug would be resolved.
(In reply to comment #42)
> (In reply to comment #37)
> > I guess I should be happy about the improvements, except that I'm not! My
> > patch should not have that kind of an improvement on anything here.
>
> Why is the Tp improvement not expected? Doesn't that test reload common
> elements that might be kept in cache?

The set of pages in Tp shouldn't have too many shared resources, if any. And at any rate, Tp4 really doesn't represent any real and useful use case, at least as far as the cache is concerned, because the majority of our users do not browse the web off of a locally hosted web server. That's the main reason that I'm not happy about the improvements.

> > I guess I should be sad about the regressions, but I'm not! The memory
>
> What's the size of the regression? Adding memory usage here seems acceptable
> and understood to me, but I'd like to know if we're talking about a huge bump
> up or not.

For those who don't watch dev.platform, please see here: <http://groups.google.com/group/mozilla.dev.platform/msg/d6bdd58dd8f4f4fa>
> The set of pages in Tp shouldn't have too much shared resources, if any. Yes, but each page is loaded 10 times during the test. So if the cache holds things for long enough to come around full cycle, it could help. Historically, the Tp needle has definitely moved when we cached more stuff.
(In reply to comment #45) > > The set of pages in Tp shouldn't have too much shared resources, if any. > > Yes, but each page is loaded 10 times during the test. So if the cache holds > things for long enough to come around full cycle, it could help. Historically, > the Tp needle has definitely moved when we cached more stuff. You're right, that makes sense. Thanks for clearing it up.
So, I reverted the cache size change because of Tp4 regressions on Linux: <http://hg.mozilla.org/mozilla-central/rev/cd7d6f94d41e> I'm leaving this as FIXED, because increasing the cache size wasn't really part of this bug here. I'm afraid that I can't invest the time needed to track this down in the near future.
In the backout msg on tbpl it states regression on Win7 ? Is it across all platforms, Win7 or just Linux? I don't understand leaving it 'fixed' ? Does that mean if a user wants to bump up to max of 250mb of cache he can do so without fear ?
(In reply to comment #48) > So, I reverted the cache size change because of Tp4 regressions on Linux: > > <http://hg.mozilla.org/mozilla-central/rev/cd7d6f94d41e> > > I'm leaving this as FIXED, because increasing the cache size wasn't really part > of this bug here. I'm afraid that I can't invest the time needed to track this > down in the near future. I guess that means I should undupe bug 193911...
(In reply to comment #49)
> In the backout msg on tbpl it states regression on Win7 ? Is it across all
> platforms, Win7 or just Linux?

No, it's Win7 only.

> I don't understand leaving it 'fixed' ? Does that mean if a user wants to bump
> up to max of 250mb of cache he can do so without fear ?

This bug was about adjusting the max entry count value based on the capacity of the cache, and that part is fixed, so yes, you could bump your cache size and benefit from the entry count fix.

(In reply to comment #50)
> (In reply to comment #48)
> > So, I reverted the cache size change because of Tp4 regressions on Linux:
> >
> > <http://hg.mozilla.org/mozilla-central/rev/cd7d6f94d41e>
> >
> > I'm leaving this as FIXED, because increasing the cache size wasn't really part
> > of this bug here. I'm afraid that I can't invest the time needed to track this
> > down in the near future.
> I guess that means I should undupe bug 193911...

That's right.
How bad was the Win7 regression? Do we have any pointers on how to proceed with narrowing down what's going on? Seems like it might be worth opening a bug for it (with as much info dump as possible) now that this one is fixed.
(In reply to comment #52)
> How bad was the Win7 regression? Do we have any pointers on how to proceed
> with narrowing down what's going on? Seems like it might be worth opening a
> bug for it (with as much info dump as possible) now that this one is fixed.

About a 30% CPU usage regression, which is pretty high IMO. Someone should probably dig in with a profiler to figure out what is going on with it.
(In reply to comment #52)
> How bad was the Win7 regression? Do we have any pointers on how to proceed
> with narrowing down what's going on? Seems like it might be worth opening a
> bug for it (with as much info dump as possible) now that this one is fixed.

We can use bug 193911 for that purpose, right?
(In reply to comment #0)
> This bug is a followup to bug 175600 where the number of disk cache entries was
> increased from 8192 to 16384. While that increased the number, we don't really
> have any kind of measurement documented anywhere as to what this number really
> should be, we know 8192 wasn't enough, but is 16384 enough?

16k is not enough either. I have 3 windows open with about 60 to 110 tabs each. Yes, I know that this certainly is not an "average" usage pattern, but I don't see any inherent scaling problems that should prevent me from doing this.

So, if I simply clear my cache and then restart Firefox, loading all those tabs will already fill up the cache to 8k entries. If I then browse some image galleries (loads of items to cache) and then restart Firefox, it has to reload half of the tabs again, because even the cache entries _for the currently open tabs_ have been evicted from the cache due to this limitation. This causes ~5 minutes of just reloading tabs from the internet because they're not in the cache anymore. Doubling the limit to 16k only gives me a bit more browsing time before restarting Firefox becomes a nightmare again.
OK, so at the end of the day, what has been changed versus the "baseline"? What were the final effects? Can this bug really be considered fixed? Can someone please explain to me why there even needs to be a max # of objects? It should be controlled, and only controlled, by the cache size on a user's machine (which in turn should be dynamically based on the user's available disk space), IMHO. I can understand having certain limitations on mobile devices, etc., but could those not be handled on a per-platform basis? Thank you for your time.
> OK, so at the end of the day, what has been changed versus the "baseline"?

The max number of objects was changed from 16k to the smaller of "number of kilobytes of cache size" (so 50k if the cache is 50MB) and "2 million".

> Can someone please explain to me why there even needs to be a max # of objects?

Because there is a memory cost per object. As you would know if you had read the bug; see comment 5. Which was answering the LAST time you asked that question, in comment 4.
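As a worked example of that rule (assuming 16 bytes of memory per record, as discussed in comment 5; the code below is illustrative only):

#include <algorithm>
#include <cstdio>
#include <initializer_list>

// The limit described above: the smaller of the capacity in KB and ~2 million
// records, with the in-memory cost of the record map shown for a few capacities.
static unsigned MaxRecords(unsigned capacityKB) {
    const unsigned kCap = 32u * 1024 * 1024 / 16;  // 2,097,152 records
    return std::min(capacityKB, kCap);
}

int main() {
    for (unsigned capKB : {51200u, 256000u, 52428800u}) {  // 50MB, 250MB, 50GB
        unsigned records = MaxRecords(capKB);
        std::printf("%u KB cache -> %u records -> ~%u KB record map\n",
                    capKB, records, records * 16 / 1024);
    }
    return 0;
}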
Ah, interesting. I guess I should have taken a look at the actual patch. Yes, the newer patch looks good; I'll just have to adjust my cache size accordingly. Excellent.
(In reply to comment #57)
> > OK, so at the end of the day, what has been changed versus the "baseline"?
>
> The max number of objects was changed from 16k to the smaller of "number of
> kilobytes of cache size" (so 50k if the cache is 50MB) and "2 million".
>
> > Can someone please explain to me why there even needs to be a max # of objects?
>
> Because there is a memory cost per object. As you would know if you had read
> the bug; see comment 5. Which was answering the LAST time you asked that
> question, in comment 4.

Thank you, Boris. The point I was initially making, and which was (in effect) implemented, is that the number of objects should never be the limiting factor in how the cache system works. I appreciate everyone's efforts in putting this into practice.
The fix in this bug will only be needed in the short term, until we get a new caching system; bug 514213, for example, is where we should be putting most of our effort.
Whoa. I was just randomly looking at some graph server logs, and an across-the-board Tp4 win spike caught my eye that looks like it was caused by this. Do I understand this correctly:
- When this got bumped to 250MB, we saw a raw Tp4 win
- We also saw a significant Tp4 CPU usage spike
- We backed out because of the CPU usage spike
Is that correct?
(In reply to comment #61) > Is that correct? See the latest comments in bug 193911.