Closed
Bug 113354
Opened 23 years ago
Closed 23 years ago
mork memory growth when accessing global history
Categories
(Core Graveyard :: History: Global, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
mozilla0.9.8
People
(Reporter: dbaron, Assigned: Bienvenu)
References
Details
(Keywords: memory-footprint, memory-leak)
Attachments
(2 files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
sspitzer
:
review+
sspitzer
:
superreview+
|
Details | Diff | Splinter Review |
I did a comparison of our memory usage before and after a run of jrgm's tests to
find the sources of memory growth. The second biggest source of increase, at
830K out of the 7MB total increase (behind memory cache growth but ahead of font
cache growth and wallet loading its tables) was structures allocated through
orkinHeap::Alloc while looking up links in global history to determine whether
they were visited.
I'll attach a perl-script-generated tree showing the differences in allocations
by mork. This tree is the nonzero branches of tree showing allocation size by
stack, with the "after" log counted as positive numbers and the "before" log
counted as negative numbers. The before and after logs contain the stack traces
for the calls to malloc for all the live allocations. Note that any static
functions won't have their symbols looked up correctly. I can also generated
line numbers from the dll/offset data, although I'd have to regenerate the log
since I did it using a build from a few days ago.
Reporter | ||
Comment 1•23 years ago
|
||
Reporter | ||
Updated•23 years ago
|
Keywords: footprint
Summary: mork memory growth when accessing global history → mork memory growth when accessing global history
Assignee | ||
Comment 2•23 years ago
|
||
the calls to open the database and to add rows are expected to add memory. The
calls to FindRow will also allocate memory, though that memory should get freed
if the history code releases the row.
Reporter | ||
Comment 3•23 years ago
|
||
When does the history code release the row? (And who owns the backend part of
global history, anyway?)
Reporter | ||
Comment 4•23 years ago
|
||
Well, it looks like nsGlobalHistory::IsVisited releases the row immediately.
Reporter | ||
Comment 5•23 years ago
|
||
Never mind that -- I misunderstood your comment. It looks like most of the
increase is due to looking up URLs to see whether they are visited. Why does
that allocate memory?
Assignee | ||
Comment 6•23 years ago
|
||
Not sure who owns the backend of the history code. I would think the history
code would release the row right away, but I'd have to look at it. Does this
830K include all the memory it takes to load the history db, or have you
factored that out? I'm not sure why FindRow would allocate more memory than just
the memory to load the db - I'd have to look at the code.
Assignee | ||
Comment 7•23 years ago
|
||
I have to confess I can't read that tree very well, at least the numbers
associated with the function calls. I understand the stack traces and there very
well could be some sort of bloat going on...
we should fix this within next 2 milestones. :-)
setting to 0.9.8
Target Milestone: --- → mozilla0.9.8
Assignee | ||
Comment 9•23 years ago
|
||
All of the allocations that aren't from loading the history db or adding pages
(which I assume are the vast majority of the allocations), the only other
allocations are for tokenizing the strings the history code is passing into
FindRow. I doubt that those are a significant percentage of the memory
allocated, but I'd need to know how big the history.dat file is, how many times
we call FindRow, etc. Also, the first time you call FindRow, Mork creates an
index on that column so that subsequent lookups will be fast, which I would
think we'd want to keep.
Comment 10•23 years ago
|
||
we don't even do any work - i.e. tokenizing any strings - to add it to the
database, or call FindRow. We do stuff the current date/time into mork, however.
About the only real work we do is notifying RDF in AddPage() my one thought is
that NotifyFindAssertion should really be bailing early if there are no
observers... it does some excess work that really isn't necessary... however,
all of NotifyFindAssertion's work is done on the stack - the only possible
allocation is from nsAutoString overflowing its buffer, but it probably only
does that once or twice per call to AddPage()
Comment 11•23 years ago
|
||
if I'm reading the attachment correctly, most of the work is from calling
FindRow - 567k of allocations. My theory is this is the mork index, and its bloaty.
Assignee | ||
Comment 12•23 years ago
|
||
We call FindRow for every link in the page to see if we've visited it or not -
what's happening is that all of those links are getting tokenized and stuck in
the token map, by mork. If you visit a ton of unique pages with tons of unique
links, this will bloat. I looked at changing mork so that FindRow didn't
tokenize its input, but it's not easy - the tokens get put into a memory
structure that things can't be freed from (basically, a pool), so that there's
no per-object overhead. And the key to the map is the token id, so you need to
tokenize to find atoms in the map. If I'm understanding what's going on inside
mork correctly, I might be able to short-circuit this by checking if the string
exists in the atom table at all, and if not, we can assume there's no row that
could possibly have that string. The other possibility is for history to
occasionally close and open the history db - this would get rid of all these
temporary objects (I'm assuming that these temporary tokens aren't getting
written out to the db, because no one refers to them, but that's something to
check as well).
Comment 13•23 years ago
|
||
I really like that first option (shortcircuiting) - and it seems like we can
make an argument that is correct behavior:
If mork MUST use a tokenized string in order to find a row in the db with a
given cell value, then we can assume that each cell has to have been tokenized
before hand. If each cell has been tokenized, then its value is in the table.
Therefore, if the value is NOT in the table, then the value has NOT been
tokenized. Since the value has not been tokenized, it is not in a cell, which
means there is no such row. Q.E.D. :)
I'll bet this will signifigantly reduce bloat on sites that use very unique,
session-specific URLs like amazon.com and cnet.com.
Assignee | ||
Comment 14•23 years ago
|
||
thanks for the proof, Alec :-) That's what I was trying to say. I've coded it up
and I'm trying it now.
Assignee | ||
Comment 15•23 years ago
|
||
I added a parameter to YarnToAtom which says whether to create the atom if it's
missing. By default, it's true (the current behaviour), and I've changed
FindRow to pass in false for the create parameter. Coloring of links seems to
work fine, and I haven't seen any other problems. I can't swear that it'll fix
the bloat, but it's worth trying. requesting r/sr's from dbaron and alecf.
Comment 16•23 years ago
|
||
Comment on attachment 60333 [details] [diff] [review]
proposed fix
damn, that was easy. sr=alecf
Comment 17•23 years ago
|
||
Comment on attachment 60333 [details] [diff] [review]
proposed fix
r=sspitzer, sr=alecf
Attachment #60333 -
Flags: superreview+
Attachment #60333 -
Flags: review+
Assignee | ||
Comment 18•23 years ago
|
||
calling findrow for a url that's not in history shouldn't bloat now.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 19•22 years ago
|
||
mass-verifying claudius' Fixed bugs which haven't changed since 2001.12.31.
if you think this particular bug is not fixed, please make sure of the following
before reopening:
a. retest with a *recent* trunk build.
b. query bugzilla to see if there's an existing, open bug (new, reopened,
assigned) that covers your issue.
c. if this does need to be reopened, make sure there are specific steps to
reproduce (unless already provided and up-to-date).
thanks!
[set your search string in mail to "AmbassadorKoshNaranek" to filter out these
messages.]
Status: RESOLVED → VERIFIED
Updated•6 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•