21364 - [leak] Leak another 10K just viewing a pair of messages (each time)

Reporter

Description

•

25 years ago

Using the 12/9 build.... Open mail Open inbox select a message view the message select next message view message select first message ...etc You will leak amounts of memory as high as 1M, but rarely less thatn 200K with each transition. The double transition (back to original message) seems to consistently cost in excess of 1M total. The messages I'm viewing are each about 5 screen in lengeth (not too big, but not tiny). My inbox has 4000 messages, but I don't expect this to have been critical.

Chris Waterson

Comment 1

•

25 years ago

This is probably XUL's fault. XUL has become quite leaky in the last week.

Jim Roskind

Reporter

Comment 2

•

25 years ago

I just tried this with small messages, and the leak was a mere 20-40K per cycle, so I suspect the size of the message is significant.

lchiang

Updated

•

25 years ago

QA Contact: lchiang → suresh

Chris Waterson

Updated

•

25 years ago

Assignee: phil → waterson

Chris Waterson

Comment 3

•

25 years ago

It turns out we seem to leak a nsDocLoaderImpl (which holds on to the XUL document) because of a circular reference between the docloader and the load group it creates. I'm gonna take a look at this, with mscott's help.

leger

Updated

•

25 years ago

Whiteboard: [PDT+]

leger

Comment 4

•

25 years ago

Putting on the PDT+ radar.

Chris Waterson

Updated

•

25 years ago

Assignee: waterson → mscott

Chris Waterson

Comment 5

•

25 years ago

mscott wrastled this from me.

Scott MacGregor

Assignee

Updated

•

25 years ago

Status: NEW → ASSIGNED

Scott MacGregor

Assignee

Comment 6

•

25 years ago

so right now I am not looking at mailnews. I'm trying to figure out why starting up and shutting down the browser apparently leaks a doc loader and load group as waterson mentioned. I've figured out why that leaks happens. I'm trying to come with some code to fix it right now. Some people try to create a doc loader instance as a service. The first instance is stored in a global variable and has no ties to a web shell. Keep in mind that doc loaders have a circular reference with a load group. So a doc loader's Destroy method needs to be called in order to break the circular reference. This happens naturally when a doc loader is created in conjunction with a webshell (because the webshell always calls it's doc loader's destroy method). But this doesn't happen when someone creates a "global" doc loader by calling into the service manager. So the circular reference is never broken and we leak the doc loader and load group. Adding rpotts to the cc list. My fix is going to involve force people who want to use the doc loader as a service to use a different object than those that use the doc loader as part of the web shell. You'll get the doc loader service, then ask it for the real doc loader. Then when the doc loader service is destroyed, it will call the destroy method on the global doc loader it contains.

rpotts (gone)

Comment 7

•

25 years ago

hey scott, If the docloader is leaking because of the circular link to the loadgroup, then let me break that link with a nsWeakPtr... It used to be done this way, but for a short time we were proxying the notifications from the load group - and the weak ptr was lost. BTW. leaking the global docloader service should not be the cause this leak! This docloader has no associated webshell and *only* contains other docloaders (no channels). Since all of the other docloaders (and loadgroups) are cleaned up this one will be empty when it is leaked... -- rick

Scott MacGregor

Assignee

Comment 8

•

25 years ago

Hey Rick, So I missed this message of yours. I just went ahead and wrote a small snippet of code to make a doc loader service object which olds onto the global doc loader. This fixes the leak, but of course as you pointed out, the leak is quite small and not the source of the 400K leaks we are seeing on tinderbox which is what waterson and I were investigating.

Scott MacGregor

Assignee

Comment 9

•

25 years ago

waterson, I hate to have to think about bouncing this back to you but I've fixed the doc loader / load group problem in my tree and it doesn't effect the bloat at all. some of the bigger ticket items in the leak log are xul attributes (, atomimpl, 2 xul documents...

Chris Waterson

Comment 10

•

25 years ago

No problem. if nsDocLoaderImpl doesn't leak, but nsXULDocument still leaks, give it to me.

Scott MacGregor

Assignee

Comment 11

•

25 years ago

Update: The changes I have in my tree to fix the doc loader leak + some changes to fix webshells that were leaked by the global window have helped the mailnews leak story. That is, these changes fix the webshells that were leaked per message viewed. i.e. if you used to view 4 messages, we would leak 4 webshells. However, this change doesn't have as dramatic as an effect on the leak count as you might think because the webshells that were getting leaked before were being destroyed. i.e. there destroy method was being called so they weren't leaking anything they were holding on to. Just the actual webshell structure itself. And you won't notice that gain until you close the mail app too. That being said. When I view messages using todays bits, I'm not seeing jumps of 200K between message views. Sometimes it jumps up, sometimes it goes down depending on how big the message size is. But I'm not seeing increasing amounts of memory useage. Now that being said, the part that I think is still potentially a dogfood fix is the fact that starting up mail OR the browser leaks the xul document associated with the window for that application. I think this is contributing to a large about of the 400K we leak when you start up and shut down the browser (as seen on tinderbox)

Scott MacGregor

Assignee

Comment 12

•

25 years ago

More info: I do see my memory useage jump about 200K when I display a message but as soon as layout is done, the 200K is returned. i.e. if I view 5 or 6 messages that are about the same size, the mem useage when I view the first one vs when I view the last one is about the same. The memory is being returned. But i do see 200K spikes when laying out each message. Also, I've been trying to look into the XUL Document leak for mail and the browser. The bloat log swears it's getting leaked. Yet when I ask it for the serial number of the leaked document it won't give me one! So I can't get a leak tree for the xul document...arggh...

Chris Waterson

Comment 13

•

25 years ago

I'll look for that also. Do you have patches in your tree that I should pick up? If so, spank 'em into the bug.

Scott MacGregor

Assignee

Comment 14

•

25 years ago

Chris, I take it back. I am getting logs of nsXULDocument. I was incorrectly setting the class to be nsWebShell for ref count data and of course we weren't leaking the webshell. =) So I'm getting a valid ref counting graph for the xul document. There's one xtra ref count in there. I'm trying to track it down right now but feel free to look too. I'm attaching a patch which includes changes to globalWindowImpl, webshell, extensions.

Scott MacGregor

Assignee

Comment 15

•

25 years ago

Attached patch current patch showing leak fixes in my tree for webshell and global window stuff (deleted) — Details — Splinter Review

Scott MacGregor

Assignee

Updated

•

25 years ago

Summary: [dogfood][leak] Clicking between adjacent messages leaks 200K-1M → [dogfood][leak] Leak 400K when starting up and shutting down the app

Scott MacGregor

Assignee

Comment 16

•

25 years ago

I'm going to change the summary of this bug to represent what we are working on: starting up and shutting down the browser leaks over 400K according to the leak tools on tinderbox. The patches I've already posted fix any webshell leaks related to viewing mailnews messages. In addition to these patches, I'm slowly chipping away at leaks for starting up and shutting down the browser. I found some nasty leaks in the cookie code and the wallet code earlier and I now have my linux box down to the point where it is leaking 350K instead of 425K. But I'm just chipping around the edges until we can solve one of the big leaks which I think is the XULDocument.

Scott MacGregor

Assignee

Comment 17

•

25 years ago

load balancing to waterson who will probably spot the xul document leak faster than me anyhow. I also have leak fixes that help bring this number down. I'll continue to try to help chris as well.

Scott MacGregor

Assignee

Updated

•

25 years ago

Assignee: mscott → waterson

Status: ASSIGNED → NEW

Scott MacGregor

Assignee

Comment 18

•

25 years ago

re-assigning for real this time.

lchiang

Updated

•

25 years ago

QA Contact: suresh → gerardok

Chris Waterson

Updated

•

25 years ago

Depends on: 21661

Chris Waterson

Updated

•

25 years ago

Depends on: 21643

Chris Waterson

Updated

•

25 years ago

Status: NEW → ASSIGNED

Target Milestone: M12

Chris Waterson

Updated

•

25 years ago

Depends on: 21668

Jim Roskind

Reporter

Updated

•

25 years ago

Summary: [dogfood][leak] Leak 400K when starting up and shutting down the app → [dogfood][leak] Leak another 400K just viewing a pair of messages (each time)

Jim Roskind

Reporter

Comment 19

•

25 years ago

The bug is not a "startup and shutdown" bug. Sorry the focus changed by someone. I'm changing the title back to the original which involves about 1M plus leakage from just viewing two distinct messages (even as you view them again and again). Two message views add up to a 1 meg leak. That was the bug... not a startup/shutdown issue.

Scott MacGregor

Assignee

Comment 20

•

25 years ago

I'm sorry Jar. I was the one who morphed the subject title. Per my comments way up there somewhere, I'm not seeing a leak between viewing messages. Yes we use up 200K to display it but I see that memory getting freed. I suppose I should have marked this bug as works for me and we should have used a new one to track the large start up and shutdown leaks. One note about my inability to reproduce your problem: I now have lots of leak fixes in my tree for the startup / shutdown problem including changes that prevent us from leaking webshells when viewing messages)

Jim Roskind

Reporter

Comment 21

•

25 years ago

I just tried this on the 12/13 build number 15 on windows (today's build). Without working too hard, (my second pair of messages I chose), I quickly found a pair that leaks about 350K each time I cycle back and forth through them. I can do this again and again, with ever increasing total leak (not just startup/stutdown). If there is a startup/shutdown bug, that should be filed... but tihs should really look at the problem. Please ring me up or come by to see (if my description is still not clear, and you can't reproduce). Thanks, Jim

Chris Waterson

Comment 22

•

25 years ago

jar: due to the nature of object ownership, if a XUL document is leaked, it will leak all the things that it owns.

Jim Roskind

Reporter

Comment 23

•

25 years ago

I understand that if a XUL object is leaked, then all related objects go along with it (leak) in a reference-count system. The critical thing about this bug is that in the course of an oft repeated user activity (reading email) that a leak of several hundred K apppears for every couple of messages is a problem. IF there was a 400K leak that related to startup and shutdown *ONLY*, then it would not be a PDT+ bug ('cause it would not impact usability, since a user doesn't startup and shut down tons of times in each session.... especially mail... and especially with the startup/shutdown time performance in mail today). I understand that the leaked object(s) will typically be small... I'm just after finding the leaked object that transitively induces this mother of a leak. This bug will be gone when that object (or the one or two big helpers) is isolated, and repaired.

Chris Waterson

Comment 24

•

25 years ago

Every mail message is a XUL document.

Chris Waterson

Comment 25

•

25 years ago

jar: sorry, re-reading the comments in this bug, my previous comment was a complete tautology ("an object leaks the objects it owns" -- duh). What I *meant* to say was that the XUL document is a central object that has a very high ownership fan-out. It leaks the document content model and script event listeners, content model elements hold on to RDF datasources, etc. Fixing the startup leak of the XUL documents on startup will be necessary (although possibly not sufficient) to fix the leak of the mail message document.

Scott MacGregor

Assignee

Comment 26

•

25 years ago

chris, here are two small leak fixes that I also have in my tree of the global context. I'm attaching them here just so we are on the same page. Note: these fix ref counting leaks on the global context but even with these changes, it is still being leaked.

Scott MacGregor

Assignee

Comment 27

•

25 years ago

Attached patch fixes two leaks of the global context (deleted) — Details — Splinter Review

Chris Waterson

Comment 28

•

25 years ago

Attached patch more fixes. includes fixes for dependent bugs. (deleted) — Details — Splinter Review

Jim Roskind

Reporter

Comment 29

•

25 years ago

Rpotts confirmed that he could not reproduce this in his non-commercial build. He also confirmed that he could reproduce this in his comercial build. mscott: When you said you could not reproduce, were you using the comercial build? Please try it there... in my demo for rpotts, we were losing about 600K per pair of messages using the 12/13 build pulled from sweetlou. The obvious question is why are we leaking so differently on comercial build??

Chris Waterson

Comment 30

•

25 years ago

In a tree without mods, I see a 80-100K leak per message in the mozilla build. I see a 400-500Kb leak in the commercial build. I'm going to build a commercial tree with these mods and see what happens.

Scott MacGregor

Assignee

Comment 31

•

25 years ago

Some AIM stuff was recently turned on when displaying messages in the commercial build. I bet my bottom dollar it's coming from there...That would explain the difference.

Scott MacGregor

Assignee

Comment 32

•

25 years ago

Jar, I think I may have forced your problem to be fixed incidently today. Please see the PDT+ bug I filed on Alex Musil this afternoon: Bug #21710. Turns out they had added some stuff recently (within the last week) that was causing an address book to get opened for each email address in the email message. And I've been told that the address book is very very leaky. Alex fixed this bug for me and as a result we only open an address book up once per session instead of every time you view a message. I believe this was causing the leak you were seeing since it was only occurring in the commercial build. I'm running a commercial debug build with alex's changes for that bug and I'm seeing the same minimal leak behavior I reported to you earlier on the mozilla build when viewing messages. I belive we can mark this PDT bug fixed due to the changes in Alex's code tonight. Tell you what, why don't I wait and you can try out tomorrow's bits .if you aren't seeing the better leak behavior that I'm seeing in the commercial build we'll take a look again.

Chris Waterson

Comment 33

•

25 years ago

mscott: it looks like you are right. I just updated my tree, and am no longer seeing a large leak each time a new message is opened. jar: i'll let you mark this as fixed when the builds come out tomorrow.

Chris Waterson

Comment 34

•

25 years ago

It appears to be windows-specific. I cannot reproduce the leak in a commercial Linux build.

chris hofmann

Updated

•

25 years ago

Target Milestone: M12 → M13

chris hofmann

Comment 35

•

25 years ago

lets check this out to see if it looks good in 12/15 builds. if not fix in the first m13 carpool landings?

Jim Roskind

Reporter

Comment 36

•

25 years ago

Both Waterson and I can still see the leak in the 12/15 morning build :-(. It is not gone :-/.

Chris Waterson

Updated

•

25 years ago

Priority: P3 → P1

Chris Waterson

Comment 37

•

25 years ago

jar: in a recent private email, you said that this seems to be down to l.t. 100K per message now. should we declare victory, or continue work on this? Or did I mis-read your email? ;-)

Jim Roskind

Reporter

Comment 38

•

25 years ago

I was either sleepy, or miscommunicated (and I can't find the messag in my outbox :-/ ). Anyway... the problem is still visible on the Netscape Window's build, but not visible on the mozilla windows build, and not visible on the Linux Netscape build. Conclusion: This is a Windows only, Netscape version only, leak. Attached is a file that you can use to demonstrate the leak. Please just cycle back and forth between the two messages, and you should see about a 380K leak per cycle.

Jim Roskind

Reporter

Comment 39

•

25 years ago

Attached file Leak demo (cycle between messages in this local box to see leak) (deleted) — Details

Jim Roskind

Reporter

Updated

•

25 years ago

Blocks: 22176

Jim Roskind

Reporter

Comment 40

•

25 years ago

Attached file This is a demo mailbox file, with two messages, that leaks 600-700K per viewing cycle (deleted) — Details

Jim Roskind

Reporter

Comment 41

•

25 years ago

Second try... this time I attached the actual mailbox, rather than the mailbox index. I tried a couple of messages. I can't tell if the leak is larger because of the number of folks on the To/CC list, or because of the complexity of the message. I could try to debug this down by hand editing the file so that I could tell what caused the larger leak. I'll do a little of this and comment back in a moment.

Jim Roskind

Reporter

Comment 42

•

25 years ago

I tried reducing the number of folks on the CC list, and the leak persisted. It would appear that the size of the leak relates to the size and/or complexity of the message, not the email address count in the headers. Hope that helps, Jim p.s., This was tested on the M12 candidate for the comercial build, built 12/20

Chris Waterson

Comment 43

•

25 years ago

I see 320Kb per message on Linux; which is consistent with the 700Kb per "cycle", as you were defining it, Jim.

Chris Waterson

Comment 44

•

25 years ago

Ok, I've just done the Dumb Thing and instrumented PR_Malloc() to dump a stack trace when asked to allocate a block >8Kb. Turns out that message parsing does a -lot- of this. Specifically, we create ten to twenty 16-32Kb blocks (!) while parsing the "data:" URL that (presumably) is holding the message data. This is done to copy string data around. If leaked (or not re-usable because of subseqent re-poritioning to smaller blocks), this would account for jar's 300Kb leak per message. See http://lxr.mozilla.org/mozilla/source/netwerk/protocol/data/src/nsDataChannel.cpp#118 Specifically, the mURL->GetSpec() is creating 16-32Kb string copies! Not only does this seem to be -tremendously- inefficient, my guess that, even if it isn't leaking memory, it is seriously fragmenting memory. Can we think of a better way to do this? Maybe adding a [noscript] GetImmutableSpec() to nsIURI? (warren: this is your chance to say "I told you so.") Or using a mechanism other than a "data:" URL to transfer message data?

Chris Waterson

Comment 45

•

25 years ago

add bienvenu to cc list, who is another leak hero on the mail team.

Scott MacGregor

Assignee

Comment 46

•

25 years ago

Chris: i wouldn't spend too much time on this leak if it looks to be related to the message body being placed in the data url. I've been spending the last few days cooking up a new more efficient way to display messages that gets rid of the data url. It's much more efficient and is improving message display times by as much as 4 times on slower machines! Plus, it looks to be much more memory efficient as well. I'm planning on landing it when I get back from the holidays after New Years.

David :Bienvenu

Comment 47

•

25 years ago

I will purify with Scott's new message loading stuff tomorrow and see what happens.

Chris Waterson

Updated

•

25 years ago

Assignee: waterson → mscott

Status: ASSIGNED → NEW

Chris Waterson

Comment 48

•

25 years ago

mscott: giving this one to you. when you land your mail reading changes, test it, see if we still leak. if we do, then you can give it back to me. if we don't then close it!

Scott MacGregor

Assignee

Updated

•

25 years ago

Status: NEW → ASSIGNED

Scott MacGregor

Assignee

Comment 49

•

25 years ago

I'm not seeing the large message leaks after landing my new souped up message display stuff. However, I don't consider myself to be a good test case 'cause I had such a hard time getting the large leak to happen before too. Jar, are you still seeing large leaks?

Jim Roskind

Reporter

Updated

•

25 years ago

Summary: [dogfood][leak] Leak another 400K just viewing a pair of messages (each time) → [leak] Leak another 400K just viewing a pair of messages (each time)

Whiteboard: [PDT+]

Jim Roskind

Reporter

Comment 50

•

25 years ago

The bug is now very small, and certainly not worthy of a PDT+ status (or even a dogfood request). I'm removing both comments. I think the leak is now under 10K per toggle, which is still a bug... but nothing critical at this point. IMO, you should mark this at least M15. Thanks to all, Jim

Scott MacGregor

Assignee

Updated

•

25 years ago

Target Milestone: M13 → M15

Scott MacGregor

Assignee

Comment 51

•

25 years ago

Thanks for the help jar. Bouncing out past beta 1 for the work involving tracking down the 10K or so we are still leaking when displaying a message.

David Baron :dbaron:

Updated

•

25 years ago

Keywords: mlk

Scott MacGregor

Assignee

Comment 52

•

25 years ago

updating the summary to reflect that the last known leak cnt was under 10K for some messages. I'm not actually seeing any signficant leakage displaying messages according to the refcounting logs and by monitoring my memory useage on windows when displaying messages. I'm just going to go ahead and mark this fixed. I will of course continue to keep an eye on leaks displaying messages as we go forward.

Summary: [leak] Leak another 400K just viewing a pair of messages (each time) → [leak] Leak another 10K just viewing a pair of messages (each time)

Scott MacGregor

Assignee

Comment 53

•

25 years ago

fixed.

Status: ASSIGNED → RESOLVED

Closed: 25 years ago

Resolution: --- → FIXED

Jim Roskind

Reporter

Updated

•

25 years ago

No longer blocks: 22176

Myk Melez [:myk] [@mykmelez]

Updated

•

20 years ago

Product: MailNews → Core

Nobody; OK to take it and work on it

Updated

•

16 years ago

Product: Core → MailNews Core

current patch showing leak fixes in my tree for webshell and global window stuff 25 years ago Scott MacGregor (deleted), patch		Details \| Diff \| Splinter Review
fixes two leaks of the global context 25 years ago Scott MacGregor (deleted), patch		Details \| Diff \| Splinter Review
more fixes. includes fixes for dependent bugs. 25 years ago Chris Waterson (deleted), patch		Details \| Diff \| Splinter Review
Leak demo (cycle between messages in this local box to see leak) 25 years ago Jim Roskind (deleted), application/octet-stream		Details
This is a demo mailbox file, with two messages, that leaks 600-700K per viewing cycle 25 years ago Jim Roskind (deleted), application/octet-stream		Details