Closed Bug 113354 Opened 23 years ago Closed 23 years ago

mork memory growth when accessing global history

Categories

(Core Graveyard :: History: Global, defect)

Platform: x86 Linux
Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

Status: VERIFIED FIXED
Target Milestone: mozilla0.9.8

People

(Reporter: dbaron, Assigned: Bienvenu)

References

Details

(Keywords: memory-footprint, memory-leak)

Attachments

(2 files)

I did a comparison of our memory usage before and after a run of jrgm's tests to find the sources of memory growth. The second biggest source of increase, at 830K out of the 7MB total increase (behind memory cache growth but ahead of font cache growth and wallet loading its tables), was structures allocated through orkinHeap::Alloc while looking up links in global history to determine whether they were visited. I'll attach a perl-script-generated tree showing the differences in allocations by mork. This tree is the nonzero branches of a tree showing allocation size by stack, with the "after" log counted as positive numbers and the "before" log counted as negative numbers. The before and after logs contain the stack traces for the calls to malloc for all the live allocations. Note that any static functions won't have their symbols looked up correctly. I can also generate line numbers from the dll/offset data, although I'd have to regenerate the log since I did it using a build from a few days ago.
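As an aside, here's a rough C++ sketch of the bookkeeping the perl script does when diffing the two logs; the Allocation struct and the miniature logs in main() are invented for illustration and are not the real log format:

    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    // One live allocation from a malloc log: the call stack it was made
    // under (flattened into a single string key) and its size in bytes.
    // (Hypothetical structure; the real logs are stack traces per malloc.)
    struct Allocation {
      std::string stack;
      long bytes;
    };

    // Net growth per stack: "after" entries count positive and "before"
    // entries count negative, so stacks whose allocations balanced out
    // fall to zero and only the nonzero branches remain.
    std::map<std::string, long> DiffLogs(const std::vector<Allocation>& before,
                                         const std::vector<Allocation>& after) {
      std::map<std::string, long> net;
      for (const auto& a : after)  net[a.stack] += a.bytes;
      for (const auto& b : before) net[b.stack] -= b.bytes;
      return net;
    }

    int main() {
      // Miniature stand-in logs, invented for the example.
      std::vector<Allocation> before = {{"main>orkinHeap::Alloc", 100000}};
      std::vector<Allocation> after  = {{"main>orkinHeap::Alloc", 930000},
                                        {"main>FontCache::Load",  50000}};
      for (const auto& [stack, bytes] : DiffLogs(before, after))
        if (bytes != 0)
          std::cout << stack << ": " << bytes << " bytes\n";
    }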
Keywords: footprint
The calls to open the database and to add rows are expected to add memory. The calls to FindRow will also allocate memory, though that memory should get freed if the history code releases the row.
When does the history code release the row? (And who owns the backend part of global history, anyway?)
Well, it looks like nsGlobalHistory::IsVisited releases the row immediately.
Never mind that -- I misunderstood your comment. It looks like most of the increase is due to looking up URLs to see whether they are visited. Why does that allocate memory?
Not sure who owns the backend of the history code. I would think the history code would release the row right away, but I'd have to look at it. Does this 830K include all the memory it takes to load the history db, or have you factored that out? I'm not sure why FindRow would allocate more memory than just the memory to load the db - I'd have to look at the code.
I have to confess I can't read that tree very well, at least the numbers associated with the function calls. I understand the stack traces and there very well could be some sort of bloat going on...
We should fix this within the next 2 milestones. :-) Setting to 0.9.8.
Target Milestone: --- → mozilla0.9.8
Blocks: 92580
Aside from the allocations for loading the history db and adding pages (which I assume are the vast majority), the only other allocations are for tokenizing the strings the history code passes into FindRow. I doubt that those are a significant percentage of the memory allocated, but I'd need to know how big the history.dat file is, how many times we call FindRow, etc. Also, the first time you call FindRow, Mork creates an index on that column so that subsequent lookups will be fast, which I would think we'd want to keep.
We don't even do any work - i.e. tokenizing any strings - to add a page to the database, or call FindRow. We do stuff the current date/time into mork, however. About the only real work we do is notifying RDF in AddPage(). My one thought is that NotifyFindAssertion should really be bailing early if there are no observers - it does some excess work that really isn't necessary. However, all of NotifyFindAssertion's work is done on the stack; the only possible allocation is from nsAutoString overflowing its buffer, and it probably only does that once or twice per call to AddPage().
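A minimal sketch of that early-bail idea, with invented Observer/History names standing in for the real RDF observer plumbing:

    #include <vector>

    // Hypothetical stand-ins for the real RDF observer interfaces.
    class Observer {
     public:
      virtual ~Observer() = default;
      virtual void OnAssert(int rowId) = 0;
    };

    class History {
      std::vector<Observer*> mObservers;

     public:
      void NotifyFindAssertion(int rowId) {
        // Bail before doing any of the per-call setup work when nobody is
        // listening; the rest of the routine only matters to observers.
        if (mObservers.empty())
          return;
        // ... build the assertion (stack-allocated nsAutoStrings in the
        // real code) and notify each observer ...
        for (Observer* o : mObservers)
          o->OnAssert(rowId);
      }
    };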
If I'm reading the attachment correctly, most of the work is from calling FindRow - 567k of allocations. My theory is that this is the mork index, and it's bloaty.
We call FindRow for every link in the page to see if we've visited it or not - what's happening is that all of those links are getting tokenized and stuck in the token map by mork. If you visit a ton of unique pages with tons of unique links, this will bloat. I looked at changing mork so that FindRow didn't tokenize its input, but it's not easy - the tokens get put into a memory structure that things can't be freed from (basically, a pool), so that there's no per-object overhead. And the key to the map is the token id, so you need to tokenize to find atoms in the map. If I'm understanding what's going on inside mork correctly, I might be able to short-circuit this by checking whether the string exists in the atom table at all; if it doesn't, we can assume there's no row that could possibly have that string. The other possibility is for history to occasionally close and reopen the history db - this would get rid of all these temporary objects (I'm assuming that these temporary tokens aren't getting written out to the db, because no one refers to them, but that's something to check as well).
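To make the pool point concrete, here's a sketch of a bump-allocated string pool of the sort described above; the names and block size are invented, not mork's actual allocator:

    #include <cassert>
    #include <cstddef>
    #include <cstring>
    #include <vector>

    // A bump-allocated string pool: interned strings carry no per-object
    // overhead, so no individual entry can ever be freed. The only way to
    // reclaim memory is to destroy the whole pool (i.e. close the db).
    class StringPool {
      static constexpr std::size_t kBlockSize = 4096;
      std::vector<char*> mBlocks;
      char* mCursor = nullptr;
      std::size_t mLeft = 0;

     public:
      const char* Intern(const char* s) {
        std::size_t len = std::strlen(s) + 1;
        assert(len <= kBlockSize);  // sketch assumes strings fit one block
        if (len > mLeft) {
          // Start a fresh block; earlier blocks stay live until the pool dies.
          mBlocks.push_back(new char[kBlockSize]);
          mCursor = mBlocks.back();
          mLeft = kBlockSize;
        }
        char* out = mCursor;
        std::memcpy(out, s, len);
        mCursor += len;
        mLeft -= len;
        return out;  // valid for the pool's lifetime; never freed individually
      }

      ~StringPool() {
        for (char* b : mBlocks) delete[] b;
      }
    };

Every unique link URL probed through FindRow ends up interned in a structure like this, which is why the growth doesn't come back until the db is closed.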
I really like that first option (short-circuiting) - and it seems like we can make an argument that it's the correct behavior: if mork MUST use a tokenized string in order to find a row in the db with a given cell value, then we can assume that each cell has to have been tokenized beforehand. If each cell has been tokenized, then its value is in the table. Therefore, if the value is NOT in the table, then the value has NOT been tokenized. Since the value has not been tokenized, it is not in a cell, which means there is no such row. Q.E.D. :) I'll bet this will significantly reduce bloat on sites that use unique, session-specific URLs like amazon.com and cnet.com.
Thanks for the proof, Alec :-) That's what I was trying to say. I've coded it up and I'm trying it now.
Attached patch proposed fix (deleted) — Splinter Review
I added a parameter to YarnToAtom which says whether to create the atom if it's missing. By default, it's true (the current behaviour), and I've changed FindRow to pass in false for the create parameter. Coloring of links seems to work fine, and I haven't seen any other problems. I can't swear that it'll fix the bloat, but it's worth trying. Requesting r/sr from dbaron and alecf.
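For readers following along, here's a simplified sketch of the pattern the patch describes; the AtomTable/Yarn types and signatures are illustrative stand-ins, not mork's real ones:

    #include <string>
    #include <unordered_map>

    using Yarn = std::string;  // stand-in for mork's mdbYarn string value
    using AtomId = int;        // stand-in for an interned token id; 0 = none

    class AtomTable {
      std::unordered_map<Yarn, AtomId> mAtoms;
      AtomId mNextId = 1;

     public:
      // With inCreate true (the old behavior and the default), a missing
      // yarn is interned, permanently growing the pool. With inCreate
      // false, a miss allocates nothing and returns 0.
      AtomId YarnToAtom(const Yarn& inYarn, bool inCreate = true) {
        auto it = mAtoms.find(inYarn);
        if (it != mAtoms.end())
          return it->second;
        if (!inCreate)
          return 0;
        AtomId id = mNextId++;
        mAtoms.emplace(inYarn, id);
        return id;
      }
    };

    // FindRow can now probe without interning: if the value was never
    // tokenized, no cell can hold it, so there is no row to find.
    bool FindRow(AtomTable& table, const Yarn& cellValue) {
      AtomId id = table.YarnToAtom(cellValue, /* inCreate = */ false);
      if (id == 0)
        return false;  // not interned => no matching row, and no bloat
      // ... otherwise consult the column index keyed by atom id ...
      return true;  // placeholder for the real index lookup
    }

The design point is that the defaulted parameter preserves existing callers' behaviour, while lookup-only callers like FindRow can opt out of interning.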
Comment on attachment 60333 (proposed fix): damn, that was easy. sr=alecf
Comment on attachment 60333 (proposed fix): r=sspitzer, sr=alecf
Attachment #60333 - Flags: superreview+
Attachment #60333 - Flags: review+
Calling FindRow for a URL that's not in history shouldn't bloat now.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Mass-verifying claudius' Fixed bugs which haven't changed since 2001.12.31. If you think this particular bug is not fixed, please make sure of the following before reopening:
a. Retest with a *recent* trunk build.
b. Query Bugzilla to see if there's an existing, open bug (new, reopened, assigned) that covers your issue.
c. If this does need to be reopened, make sure there are specific steps to reproduce (unless already provided and up-to-date).
Thanks! [Set your search string in mail to "AmbassadorKoshNaranek" to filter out these messages.]
Status: RESOLVED → VERIFIED
Keywords: mlk
Product: Core → Core Graveyard
