Closed Bug 751557 Opened 13 years ago Closed 8 years ago

Adding new jobs in a Jenkins CI causes a lot of leaks and ends up in an OOM [@ mozalloc_abort(char const*) | Abort ]

Categories

(Tech Evangelism Graveyard :: Other, defect)

All
macOS
defect
Not set
major

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: whimboo, Unassigned)

References


Details

(Keywords: hang, memory-leak, Whiteboard: [MemShrink:P2])

Crash Data

Attachments

(1 file, 5 obsolete files)

Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:14.0) Gecko/20120502 Firefox/14.0a2

I first saw this issue some weeks ago and talked to Olli about it. Since then it has only recurred sporadically. The last time was at the end of last week, when I didn't have time for a deeper analysis. Today it happened again, and I tried to find steps to reproduce, but so far I haven't been lucky. The following information is everything I have at the moment:

* It happens in my daily profile.
* Resident memory usage grows up to 2GB within an hour or so.
* GC durations go up to around 4s(!) and make the browser unusable.
* It happens when I have the following Jenkins pages for Mozmill CI loaded (they are on the MV VPN):
  http://release3.qa.mtv1.mozilla.com:8080 (daily tests)
  http://release8-osx-107.qa.mtv1.mozilla.com:8080/ (l10n tests)

First I thought it was related to auto-refresh being enabled, but that doesn't look like the reason. Another assumption is that it happens when new job requests get added via the Jenkins API; that's what I will test next. Otherwise I will try to reduce my profile as much as possible so I have a minimized profile at hand.

Based on the cc log I have sent to Olli, there were about 32,544 nsDOMStorageItem objects and over 100,000 nsXPCWrappedJS (nsISupports) objects listed. For the first issue Olli already filed bug 751529; for the latter no action has happened yet. I will leave it up to Olli because he is currently analyzing my cc log. I hope I can come up with better STR soon.
Keywords: qawanted
Henrik, while you investigate STR, can you also check to see if:
A) This is any better or worse in Firefox 15 Nightly
B) This has gotten worse over time (for example, Firefox 12, 11, 10, etc.)
CC times or GC times are 4s? Though I guess both could be bad given how much memory it is using up.
Attached file MemChaser log (obsolete) (deleted) —
It's GC only. But I wanted to add a MemChaser log which can hopefully give some more information about what is getting executed.
There are a number of long CCs (one is 12 seconds!), but the GCs do look a little longer. I'd guess that GC times are long because the page is using an obscene amount of memory. If you could get the contents of about:memory?verbose when you are having this problem, that might help.
To some extent, the long GCs are a symptom of all the xpconnect stuff going on. We're spending ~100ms in root marking, which is very unusual. Another ~90ms is in mark_other. This means that we have to enumerate a bunch of xpconnect gray roots, and we're also spending a lot of time in gray marking. I'm not sure why the sweeping times are so long. That's been a persistent issue for us, though, so it's known.
Attached file MemChaser log (obsolete) (deleted) —
Updated MemChaser log, now with a memory consumption of about 2GB of RAM. The log shows the constant rise of the usage over time. I think that this leak happens when new jobs are added to the queue. I will have to test this now on my local instance. If I can prove it, I will give detailed instructions on how to reproduce.
Attachment #620738 - Attachment is obsolete: true
The same effect can also be seen in Safari. I have had the Jenkins website for our master open for a couple of days now, and Safari claimed about 3.25GB of memory.
For people with VPN access, simply open the following URL and keep the browser open for a couple of days. You will have to catch the time when new job requests get added, that is, when builds have finished. That's early in the day PT, so you should keep the browser open. When you have the Jenkins web page open, please make sure not to change any configuration aspects. Thanks.
Olli, would you mind giving an update based on the latest cc log I have sent to you? You mentioned about 170,000 XPCVariant objects; is that something we should file as a new bug?
Blocks: 702495
Assignee: nobody → english-us
Component: General → English US
Product: Core → Tech Evangelism
QA Contact: general → english-us
Whiteboard: [MemShrink]
Version: 14 Branch → unspecified
Over to tech evang; given comment 7, this doesn't sound like our bug. It would be great if pages which created giant heaps didn't cause GC pauses, but we already have bugs on that.
Attached file Detailed MemChaser log (obsolete) (deleted) —
This is a detailed log from MemChaser which I have wanted to attach for ages. Sorry for the delay. It hopefully contains all the information we need to sort out this issue.
Attachment #622357 - Attachment is obsolete: true
I have had Jenkins open for an hour or so, and Firefox's memory rose from 500MB to 1.1GB. Closing the Jenkins tab froze Firefox for about 1 minute. As the log shows, there are CC cycles of about 27s in duration, but also iGCs of 6s. I wonder if we can do something here rather than leaving it in the evangelism section only.
Keywords: hang
The MemChaser log doesn't give much information. GC and CC logs are needed: https://wiki.mozilla.org/Performance:Leak_Tools#Cycle_collector_heap_dump IIRC you once gave me some logs, but I don't have them on this laptop.
But Justin is right. If the web app leaks, then it leaks, and that will at some point slow down performance in any browser. We can of course try to make CC and GC faster, but the web app needs to be fixed.
Olli, if you have VPN access to MV please open http://10.250.73.243:8080/ and enable 'auto refresh' at the top right corner. Leave the tab open for a while and you will notice the effect really soon.
The non-closing CC times could probably be improved, as the CC is looking at 200k GC objects and not freeing any. DOM merging could potentially help the closing CC, as there are 1.2 million ref counted nodes in the graph, but it depends on what exactly they are. I'm not sure if anything can be done for the GC. The page is just allocating a huge amount of memory and never letting it go. A large percentage of the GCs are nonincremental because the page is allocating so much (and I guess the GC is falling behind).
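(For reference, a quick way to see what dominates a CC graph like this is to tally the object labels in a cc-edges log. Below is a minimal sketch, assuming the plain-text log format where node lines start with an address such as "0x..." followed by a bracketed refcount/GC marker and a label, and edge lines start with ">"; adjust the parsing if your log differs.)

// tally-cc-log.ts -- rough tally of object labels in a cc-edges log (assumed format)
import * as fs from "fs";
import * as readline from "readline";

async function tallyCcLog(path: string): Promise<void> {
  const counts = new Map<string, number>();
  const rl = readline.createInterface({ input: fs.createReadStream(path), crlfDelay: Infinity });

  for await (const line of rl) {
    // Node lines look roughly like "0xADDR [rc=N] Label ..." or "0xADDR [gc] Label ...";
    // edge lines start with ">" and are skipped here.
    if (!line.startsWith("0x")) continue;
    const m = line.match(/^0x[0-9a-fA-F]+ \[[^\]]*\] (.+)$/);
    if (m) counts.set(m[1], (counts.get(m[1]) ?? 0) + 1);
  }

  const top = [...counts.entries()].sort((a, b) => b[1] - a[1]).slice(0, 20);
  for (const [label, n] of top) console.log(`${n}\t${label}`);
}

tallyCcLog(process.argv[2] ?? "cc-edges.log").catch(console.error);

Run against a log like the ones discussed here, this should show whether nsXPCWrappedJS, DOM nodes, or something else makes up the bulk of the graph.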
Olli, I can get a log for you, if you don't regularly VPN in, as I'll be in the MV office.
Downloadable from Dropbox is a GC/CC log of the issue with an iGC of about 200ms and a low CC duration. That should make it easier for Olli to process it. https://www.dropbox.com/s/820qcp5vtactep4/cc_gc_log.zip
Olli, here an updated version of the logs with the XUL cache pref removed. https://www.dropbox.com/s/820qcp5vtactep4/cc_gc_log.zip
Henrik, which version of FF are you using?
Henrik, we think an addon might be involved. Could you try this again with a fresh profile? Also, can you try to get minimal steps to reproduce this problem? I tried it myself, but I'm not getting the same kind of objects that are showing up in your CC log.
We're seeing some weird objects, and we're not sure where they came from. Our theory is that they are weak map related, because we don't log weak maps very well right now.
Right now this log was made with Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:17.0) Gecko/17.0 Firefox/17.0 ID:20120930042009. I can try a fresh profile with only MemChaser installed.
Actually, could you turn off MemChaser as well? I'm afraid it might affect things, since it uses Jetpack. Could you just use the CC times in the Error Console instead?
Here are the logs with a fresh profile: https://www.dropbox.com/s/k5nu8ygv4w61yij/gg_cc_log_memchaser.zip https://www.dropbox.com/s/uikivvn8ive8yxn/gg_cc_log_plain.zip Without MemChaser I'm not able to reproduce this issue.
I haven't managed to reproduce this using a trunk build. The CC graph stays really small (<500) almost all the time; only the automatic page reload increases it temporarily, and CC times are between 6ms and 46ms.
> Without memchaser I'm not able to reproduce this issue.

Updating the bug metadata to reflect that this is caused by MemChaser. This would be a significant issue if the problem turned out to be Jetpack and not simply MemChaser.
Assignee: english-us → nobody
Component: English US → Add-ons
Keywords: qawanted
Summary: Ultra long GC durations with active Jenkins web pages opened → MemChaser (perhaps simply Jetpack?) causes ultra long GC durations with active Jenkins web pages opened
Whiteboard: [MemShrink]
Kris says that MemChaser is using version 1.9 of the add-on SDK, which is pretty recent. So yeah, the real question here is whether this is an add-on SDK problem or something MemChaser-specific.
Whiteboard: [MemShrink] → [MemShrink:P2]
Assignee: nobody → kmaglione+bmo
If what we're worried about is memory usage after http://10.250.73.243:8080/ is closed, it looks like it's mostly heap-unclassified: https://gist.github.com/3830049 I'm getting about 60MB growth in heap-unclassified with MemChaser installed, and a bit of growth elsewhere, as opposed to nothing without it installed. I'll try stripping down MemChaser to narrow down the cause when I get a chance.
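(A rough sketch of how two saved about:memory reports — e.g. with and without the add-on — can be diffed to see which paths grew. This assumes the reports have been decompressed to plain JSON with a top-level "reports" array of { process, path, amount } entries; the exact format may vary between Firefox versions, so treat this as a starting point only.)

// diff-memory-reports.ts -- compare two about:memory JSON dumps (assumed format)
import * as fs from "fs";

type MemReport = { process: string; path: string; amount: number };

function loadReport(file: string): Map<string, number> {
  // Sum amounts per path across processes.
  const data = JSON.parse(fs.readFileSync(file, "utf8"));
  const byPath = new Map<string, number>();
  for (const r of data.reports as MemReport[]) {
    byPath.set(r.path, (byPath.get(r.path) ?? 0) + r.amount);
  }
  return byPath;
}

const before = loadReport(process.argv[2]);  // e.g. without the add-on
const after = loadReport(process.argv[3]);   // e.g. with the add-on

const deltas = [...after.entries()]
  .map(([path, amount]) => [path, amount - (before.get(path) ?? 0)] as const)
  .filter(([, delta]) => delta > 1024 * 1024)          // only growth above 1 MiB
  .sort((a, b) => b[1] - a[1]);

for (const [path, delta] of deltas) {
  console.log(`${(delta / (1024 * 1024)).toFixed(1)} MiB  ${path}`);
}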
This might be similar to bug 740162, where I also had massive amounts of memory being used by Jenkins. I didn't have any add-ons, though.
I have uploaded a 0.5pre XPI of MemChaser which I will test over the next few days to see if it changes the behavior here: https://github.com/downloads/mozilla/memchaser/memchaser-0.5pre-201211061117.xpi
Depends on: 833783
Attached file about:memory?verbose (obsolete) (deleted) —
Looks like we have a lot of orphaned nodes for both Jenkins instances:
* 97,530,664 B (20.59%) ── orphan-nodes
* 110,193,800 B (23.27%) ── orphan-nodes
I have to say this is with a profile with all extensions disabled.
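(For context: "orphan nodes" are DOM nodes that have been removed from the document but are still reachable from live JS, so neither GC nor CC can free them. A hypothetical sketch — not taken from Jenkins — of the kind of auto-refresh pattern that produces them; the element id and endpoint name are made up for illustration.)

// Hypothetical page script: an auto-refresh loop that produces orphan nodes
// by keeping references to elements it has removed from the document.
const seenRows: HTMLElement[] = [];           // long-lived structure in page JS

async function refreshBuildList(): Promise<void> {
  const list = document.getElementById("buildQueue");   // made-up element id
  if (!list) return;

  // The leak: the old rows are detached from the document below, but this
  // array still references them, so JS keeps them (and their subtrees and
  // listeners) alive. about:memory reports such nodes as "orphan-nodes".
  for (const row of Array.from(list.children)) {
    seenRows.push(row as HTMLElement);
  }

  const html = await (await fetch("/ajaxBuildQueue")).text();  // assumed endpoint
  list.innerHTML = html;   // detaches the old rows without releasing them
}

setInterval(refreshBuildList, 5000);          // "ENABLE AUTO REFRESH"-style polling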
Attached file cc-edges.log (obsolete) (deleted) —
cc log from the same state as the about:memory dump.
So this is indeed the Jenkins CI page which is leaking, or at least keeping orphaned elements around. Closing the tab and minimizing the memory a couple of times restores the previous memory usage. You can find the latest GC/CC logs here: http://people.mozilla.com/~hskupin/downloads/

For everyone interested, here is the dump of the conversation between Olli and myself on IRC:

14:29 smaug: whimboo: any chance you could create new GC/CC logs? hopefully immediately when you start to see the leak, so that the logs don't get huge
14:29 smaug will be offline soon, but could perhaps analyze the logs tomorrow
14:30 whimboo: smaug: would your addon help here
14:30 whimboo: ?
14:31 smaug: whimboo: if you know what to look at, then yes
14:31 smaug: you could check if documents are leaded
14:31 smaug: whimboo: it is possible that gc log is need too, but my tool is for cc log only
14:31 whimboo: k
14:31 whimboo: i will check right now
14:31 smaug: my local cc analyzing scripts do just the same things as about:cc
14:32 whimboo: smaug: do you want to put this xpi up somewhere?
14:32 whimboo: i always have trouble to find it
14:34 smaug: whimboo: well, I was going to put it to AMO but that was too difficult
14:34 smaug: perhaps I should get it to tree
14:34 smaug: make it a bit prettier first
14:34 whimboo: that would be great
14:34 smaug: it is very tiny addon
14:35 whimboo: smaug: k. so i will let you have a look at
14:35 whimboo: i will produce the cc now
14:38 whimboo: smaug: cc is about 1MB. that's ok?
14:38 whimboo: gc 314MB
14:38 smaug: 1MB is ok
14:38 smaug: 314GC, huh
14:39 smaug: that is less than tiny
14:39 smaug: er, "less than tiny" doesn't make sense
14:40 whimboo: heh. so i will attach the CC log to the bug
14:41 whimboo: smaug: log is up
14:42 whimboo: smaug: so good thing is with Nightly: 82.10 MB (13.19%) ── heap-unclassified
14:42 whimboo: that's way lesser as with aurora
14:45 smaug: whimboo: that is not too big cc graph
14:45 smaug: 8000 objects
14:45 smaug: sure, too big, but still...
14:45 smaug: but ok, looking
14:45 whimboo: sweet
14:45 whimboo: if it's too large I think i can give you a smaller one later
14:46 whimboo: given that I know how to produce it
14:46 smaug: so we're leaking all the XBL document
14:46 smaug: whimboo: do you have XUL cache disabled?
14:46 smaug: whimboo: it is small enough
14:47 whimboo: ups. that was a copy of an old profile
14:47 whimboo: yes it was disabled
14:47 whimboo: but should we leak in such a case?
14:47 smaug: whimboo: disabling XUL cache is not supported setup
14:47 smaug: yes, we don't care about that case
14:48 smaug: since everything behaves differently anyway
14:48 smaug: it is development only setup
14:48 whimboo: ok, so let me restart then
14:48 smaug: whimboo: we're talking about nglayout.debug.disable_xul_cache here, right :)
14:49 whimboo: i know :)
14:50 whimboo: smaug: looks like it's starting again
14:50 whimboo: 2.03 MB (01.32%) ── orphan-nodes
14:51 whimboo: smaug: shall i wait until it's a big higher?
14:52 smaug: whimboo: you could wait a bit...
14:53 whimboo: smaug: k, how long would you prefer? until it reaches about 10MB or so?
14:54 smaug: whimboo: do you have about:cc installed?
14:54 whimboo: smaug: yes
14:54 whimboo: but not sure what to do with it
14:54 smaug: could you run cc and see the size of the graph
14:55 smaug: graph size ~10000 should be enough
14:55 smaug: whimboo: run CC few times to get stable results
14:55 whimboo: smaug: where do i see that?
14:55 smaug: whimboo: in the addon
14:55 whimboo: smaug: sure.. number of objects?
14:56 whimboo: or roots, edges
14:56 smaug: whimboo: number of objects
14:56 whimboo: k, at 353 right now
14:56 smaug: whimboo: I'd expect something like 10000
14:57 smaug: though, the previous log had 8000
14:57 smaug: whimboo: if you get it to 5000
14:57 whimboo: smaug: orphaned nodes are already at 15MB
14:57 smaug: sure
14:57 smaug: JS keeping nodes alive
14:57 whimboo: so you still want to have the 10.000 limit?
14:57 smaug: but the js is black (certainly alive) so those nodes can't be released
14:58 smaug: whimboo: 5000 could be enough
14:58 whimboo: k lets wait when I reach that
14:59 whimboo: smaug: so when I run cc it always drops down to about 200 again
14:59 whimboo: seems I can't get over 500
14:59 smaug: whimboo: ahaa
15:00 smaug: whimboo: so do you actually get leak
15:00 smaug: I mean if you close the tab
15:00 smaug: now that xul cache is enabled
15:01 whimboo: smaug: shall I close the tab and run cc?
15:01 smaug: whimboo: yes, and check what about:memory says
15:02 whimboo: smaug: hm, i think the addon initializes wrongly
15:02 whimboo: when I tried to run 'Run cycle collector' it did nothing
15:02 whimboo: now after closing the tab I get 220000 objects
15:03 smaug: that is very possibe
15:03 smaug: whimboo: run cc few times
15:03 whimboo: oh, right
15:03 whimboo: yes. it cleans up everything
15:03 whimboo: back at 200 objects
15:03 smaug: so it was garbage that we collected
15:03 whimboo: and about:memory doesn't list the page anymore
15:04 whimboo: that means it's really the website which is leaking memory
15:04 smaug: and memory usage is back to normal levels?
15:04 mdas has joined (mdas@50AE257C.D30B51A1.412CF160.IP)
15:04 whimboo: just a sec. will repeat iwth the other tab
15:04 smaug: well, "leaking", but yes, the website keeps objects alive
15:05 whimboo: smaug: err, now I'm at 1GB
15:06 whimboo: is that because of your addon?
15:06 whimboo: seems like
15:06 whimboo: 391.25 MB (61.12%) -- top(about:cc, id=28)/active/window(about:cc)
15:07 whimboo: smaug: wohoo. back to normal.. 240MB
15:07 smaug: whimboo: well, about:cc may use a lot of memory if the graph is huge
15:08 whimboo: smaug: so if I can help to analyze which js is holding that please tell me the steps
15:08 smaug: thinking...
15:09 smaug: whimboo: perhaps the best is to get cc and gc logs
15:09 smaug: before they are too huge
15:09 smaug: especially gc log is needed
15:09 smaug: since it is probably page keeping some js alive which ends up keeping nodes alive ...
15:10 whimboo: ok, will give those in a bit
15:11 whimboo: smaug: is 50MB for GC fine?
15:12 smaug: whimboo: uh, that is quite big one
15:12 whimboo: gna. let me restart then
15:13 smaug: whimboo: so, I need to go soon, but could you record this conversation to the bug, so that it is clear that it is the page which is leaking
15:13 whimboo: smaug: i cna't get it under 40MB
15:14 smaug: whimboo: put it to some server and I'll download it from there?
15:14 whimboo: sure
15:14 whimboo: thanks
15:14 whimboo: will comemnt in the bug
15:14 smaug: and put cc log too
15:14 smaug: thanks
15:14 whimboo: sure
Component: Add-ons → Other
Summary: MemChaser (perhaps simply Jetpack?) causes ultra long GC durations with active Jenkins web pages opened → Jenkins CI web page causes a lot of leaks when jobs are running
Comment on attachment 706361 [details] cc-edges.log This was collected with XUL cache disabled.
Attachment #706361 - Attachment is obsolete: true
Updated URLs for our own pages:
http://10.250.73.243:8080/
http://mm-ci-master.qa.scl3.mozilla.com:8080/

Open one of them and wait until jobs are running. The number of orphaned nodes will increase constantly and reach 10MB after a couple of minutes.
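(For anyone without access to these instances, a hypothetical helper for a local reproduction attempt: queue a batch of builds against a local Jenkins via its standard POST /job/<name>/build endpoint while the dashboard tab is open. The job name and base URL are made up, and it is assumed no authentication or CSRF crumb is required; newer Jenkins versions may need both. Requires a runtime with a global fetch, e.g. Node 18+.)

// Hypothetical reproduction helper: queue builds so that new job requests
// show up while the Jenkins dashboard tab is open in the browser.
const JENKINS_URL = "http://localhost:8080";
const JOB_NAME = "leak-test-job";             // made-up job name

async function queueBuilds(count: number): Promise<void> {
  for (let i = 0; i < count; i++) {
    // Jenkins' standard build trigger endpoint: POST /job/<name>/build
    const res = await fetch(`${JENKINS_URL}/job/${JOB_NAME}/build`, { method: "POST" });
    console.log(`queued build ${i + 1}/${count}: HTTP ${res.status}`);
    await new Promise((resolve) => setTimeout(resolve, 1000));  // pace the requests
  }
}

queueBuilds(100).catch(console.error);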
No longer depends on: 833783
Depends on: 600307
No longer depends on: 751529
Not sure why I haven't commented on this bug with the location of the CC and GC logs I have uploaded to my people account. Olli, whenever you have time please have a look at those.
http://people.mozilla.com/~hskupin/downloads/cc-edges.88911.log
http://people.mozilla.com/~hskupin/downloads/gc-edges.88911.log
Requesting info-needed from Olli for my latest cc/gc logs for his analysis. I hope you will have a bit of time soon. Thanks.
Flags: needinfo?(bugs)
Olli, can you find a bit of time to have a look at this?
I got an old version of Firefox (15.0) to crash in js::gc::MarkObjectUnbarried after adding hundreds of job requests to the Jenkins instance. I have now updated Firefox to the latest release and will check again after Easter.
Seems like Kris isn't working on this. Back into the pool.
Assignee: kmaglione+bmo → other
I got the crash again today with the latest release of Firefox: bp-6e5c9d5a-08dd-4a6b-95ed-6c6182130404. As the crash info shows, Firefox crashes due to an OOM. Is there something we can do to prevent this from happening?
Crash Signature: [@ mozalloc_abort(char const*) | Abort ]
Keywords: mlk
Summary: Jenkins CI web page causes a lot of leaks when jobs are running → Adding new jobs in a Jenkins CI causes a lot of leaks and ends up in an OOM [@ mozalloc_abort(char const*) | Abort ]
Sorry about not dealing with this. Henrik, could you re-test and report whether there are any changes.
Flags: needinfo?(bugs)
Yes, this is still an issue with Aurora. I know that information about Nightly would be better, but I can't deliver that today. If you need it, I can check tomorrow. Otherwise I would really appreciate it if you could test it yourself.
Flags: needinfo?(bugs)
I left our Jenkins page open for half a day with the latest Aurora (26.0a2), and now I have a memory usage of 1.6GB, of which the referenced web page alone uses 650MB! 480MB of that is in orphaned nodes which cannot be released even via GC. I would love it if we could get some tracking on this bug. It has been waiting for months now. :(

1,004.78 MB (63.90%) -- window-objects
│  ├────642.13 MB (40.83%) -- top(http://mm-ci-master.qa.scl3.mozilla.com:8080/, id=83)
│  │  ├──637.28 MB (40.53%) -- active
│  │  │  ├──636.97 MB (40.51%) -- window(http://mm-ci-master.qa.scl3.mozilla.com:8080/)
│  │  │  │  ├──483.43 MB (30.74%) -- dom
│  │  │  │  │  ├──482.45 MB (30.68%) ── orphan-nodes
│  │  │  │  │  └────0.98 MB (00.06%) ++ (4 tiny)
│  │  │  │  ├──145.40 MB (09.25%) -- js-compartment(http://mm-ci-master.qa.scl3.mozilla.com:8080/)
│  │  │  │  │  ├──130.26 MB (08.28%) -- gc-heap
│  │  │  │  │  │  ├──129.73 MB (08.25%) -- objects
│  │  │  │  │  │  │  ├───79.33 MB (05.04%) ── ordinary
│  │  │  │  │  │  │  ├───34.48 MB (02.19%) ── function
│  │  │  │  │  │  │  └───15.92 MB (01.01%) ── dense-array
│  │  │  │  │  │  └────0.54 MB (00.03%) ++ (3 tiny)
│  │  │  │  │  └───15.13 MB (00.96%) ++ (5 tiny)
│  │  │  │  └────8.15 MB (00.52%) ++ (3 tiny)
│  │  │  └────0.31 MB (00.02%) ++ window(about:blank)
│  │  └────4.85 MB (00.31%) ++ js-zone(0x7fbb3c6f4800)
Yeah, I know. I or someone should find time to find the memory leak in Jenkins. (Of course it would be nice if the Jenkins devs would do it.)
Flags: needinfo?(bugs)
Flags: needinfo?(bugs)
Henrik, have you tried contacting the Jenkins developers? This pretty clearly is a leak in their page.
Sure, I even talked with the lead developer. But sadly they never followed up on it, and on IRC I didn't receive helpful answers. But let's wait here for a moment. We are currently upgrading our local Jenkins instance to the latest LTS (see https://github.com/mozilla/mozmill-ci/pull/260); nearly a year has passed since we did the last upgrade. Let's see if the new version is better. I will report back in the next couple of days.
So we updated our production environment to the latest Jenkins LTS version today, but the issue still persists. With that in mind I will try to reproduce this with a local Jenkins installation with plugins disabled; one of those might also be causing this problem.
Not sure where all this unclassified memory comes from, but I suspect it is from this bug. Has anything regressed here in terms of collecting orphaned nodes? I'm not sure what I should do to get more details here.

2,202.88 MB (100.0%) -- explicit
├──1,035.84 MB (47.02%) ── heap-unclassified
Orphaned nodes held alive by JS should be tracked in about:memory now.
Andrew, what do you mean by "now"? Which version of Firefox? I'm currently running 28.0a2, and the memory situation is still pretty bad, as shown above.
I'm suggesting that the heap-unclassified probably isn't orphan nodes, but it could be. The only way to figure out what heap-unclassified consists of is to run a DMD build. https://wiki.mozilla.org/Performance/MemShrink/DMD#Build
And orphan nodes get reported -- look for "orphan-nodes" in about:memory. It's possible that some are being missed, but in general it's not immediately obvious that high heap-unclassified implies orphan nodes. As mccr8 says, DMD is the way to analyze this.
Blocks: 791695
We need some DMD analysis here, I think. But I could take a look at fresh cc logs - both "concise" and "verbose" (when created using about:memory). If possible, create logs with a clean profile (no addons).
Flags: needinfo?(hskupin)
Someone who can reproduce this needs to do some DMD analysis here.
Flags: needinfo?(bugs)
Early last week we updated our Jenkins installation to the most recent LTS version. I can now have a look at whether anything has changed. So I set up a new profile with only MemChaser installed. I will watch the memory usage throughout the day and report back with logs as requested. I will also try to get a DMD build working.
Flags: needinfo?(hskupin)
Attached file memchaser.log (deleted) —
Ok, so this is still a problem with the latest Jenkins. Memory usage of a fresh profile with only MemChaser installed went up from about 150MB to 1900MB. Firefox was barely responsive at that point. So for now I have the MemChaser logs. I will provide the other requested information once Firefox is again using about 1GB of memory.
Attachment #666566 - Attachment is obsolete: true
Attachment #706358 - Attachment is obsolete: true
(In reply to Olli Pettay [:smaug] from comment #56)
> But I could take a look at fresh cc logs - both "concise" and "verbose"
> (when created using about:memory).

Ok, so I saved a concise log: https://www.dropbox.com/s/2o3xhyttkit5wcw/concise.zip

Sadly all the memory was released right afterward, so for a verbose log I will have to wait again until the memory usage is at a specific level. As it looks, this really is a memory leak on that site.
Here is the verbose cc log: https://www.dropbox.com/s/7y9w1cpiq03kwf0/verbose.zip Olli, can you please have a look at it?
Flags: needinfo?(bugs)
Flags: needinfo?(bugs)
I still see this problem with current Nightly builds. The important fact now is that even when I close the tab, the memory doesn't go down but rises from about 2GB to 5GB! Then Firefox has 100% CPU load, and memory goes up and down, but the application is no longer usable. It completely hangs.
I have seen this as well.
Could you perhaps upload new GC/CC logs?
Product: Tech Evangelism → Tech Evangelism Graveyard
Crash Signature: [@ mozalloc_abort(char const*) | Abort ] → [@ mozalloc_abort(char const*) | Abort ] [@ mozalloc_abort | Abort ]
We haven't made any progress here, so I doubt anything will happen in the future. We are also most likely going to stop using Jenkins within the next 3 months, so after that we won't be able to reproduce this behavior anymore; I'm not even sure we can still reproduce it now. So I'm closing this bug as incomplete for now.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE