Closed Bug 616310 Opened 14 years ago Closed 14 years ago

JM: reduce fragmentation in ExecutableAllocator

Categories: Core :: JavaScript Engine, defect
Platform: x86 macOS
Severity: normal

Tracking

Status: RESOLVED FIXED
blocking2.0 --- beta9+

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 2 obsolete files)

Bug 615199 comment 36 indicates that JM's code allocation has a reasonable amount of fragmentation; e.g. in one example 74.9MB of code took up 88.4MB of space.  And that's on Linux; the situation is probably worse on Windows, because the minimum allocation size there is larger.

I have some ideas to improve this.
Attached patch patch (against TM 58191:f20d22c4df47) (obsolete) (deleted) — Splinter Review
In what follows, I'll talk about two sizes of things: chunks and pages.
- Linux: chunk = 16KB, page = 4KB
- Windows: chunk = 64KB, page = 64KB (effectively, because VirtualAlloc
  always 64KB-aligns, apparently)

Sources of waste:
- "lwaste":  When we do a large allocation (> chunk size) we round the size
  up to the nearest page size.

- When we do a non-large allocation which doesn't fit in the
  small pool, we allocate a new chunk-sized pool.  

  - "swaste":  If the space left in the new pool is greater than that left in
    the existing small pool, we replace the existing small pool with the new
    one.  This wastes the remaining space in the existing small pool.

  - "nwaste":  Otherwise, the excess space in the new pool is wasted.

- "owaste":  If a small pool is only partly filled when we stop allocating,
  there's wasted space at the end.  This is the least important kind, because
  the space could still be used if we start allocating again.
  
Possible changes:
- Keeping N small pools around instead of 1 will help with "swaste" and
  "nwaste".

- Changing the chunk and page size can help with "lwaste".

Both of these potentially increase "owaste".

I implemented N small pools and did some measurements, outlined below.  My
conclusion is that we should use 4 small pools and increase the chunk size
to 64KB on all platforms.  On Linux that reduces the over-allocation factor
from 1.12x to 1.03x.  On Windows it reduces from 1.19x to 1.09x.  In both
cases it saves ~400KB per tab.  It's a shame the Windows "page" size is so
big; that hurts somewhat.
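For reference, the over-allocation factor and per-tab saving quoted above are computed from the measurement totals like so (a trivial sketch with hypothetical function names, using the Linux nPools=1 and improved-configuration numbers from the tables below):

```cpp
#include <cstddef>

// Over-allocation factor: bytes actually allocated / bytes asked for.
double overAllocFactor(size_t allocd, size_t askedFor) {
    return (double)allocd / (double)askedFor;
}

// Per-tab saving when moving from one configuration to another,
// given the number of tabs measured (40 in these runs).
size_t savedPerTab(size_t allocdBefore, size_t allocdAfter, size_t nTabs) {
    return (allocdBefore - allocdAfter) / nTabs;
}
```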

One thing I haven't managed to do is see the change using 'top'.  I'm not sure why, though there is quite some variability -- subsequent runs often vary by tens of MBs.

(All measurements are on Linux64, opening 40 tabs, each containing a random
cad-comic.com comic.)

-----------------------------------------------------------------------------
Linux current (chunk = 16KB, page = 4KB)
-----------------------------------------------------------------------------
nPools=1:
- askedFor: 183,232,046
- alloc'd:  205,996,032 (1.12x)
- lwaste:     5,683,824
- swaste:    12,842,208
- nwaste:     2,889,560
- owaste:     1,348,394

nPools=4:
- askedFor: 183,471,893
- alloc'd:  193,081,344 (1.05x)
- lwaste:     5,653,303
- swaste:     2,406,968
- nwaste:       195,002
- owaste:     1,354,178

nPools=16:
- askedFor: 183,827,960
- alloc'd:  191,299,584 (1.04x)
- lwaste:     5,650,322
- swaste:       398,320
- nwaste:         2,286
- owaste:     1,420,696

-----------------------------------------------------------------------------
Windows current (chunk = 64KB, page = 64KB)
-----------------------------------------------------------------------------
nPools=1:
- askedFor: 183,663,492
- alloc'd:  218,955,776 (1.19x)
- lwaste:    12,207,391
- swaste:    13,741,528
- nwaste:     5,471,857
- owaste:     3,871,508

nPools=4:
- askedFor: 184,612,482
- alloc'd:  201,719,808 (1.09x)
- lwaste:    12,207,391
- swaste:     1,128,904
- nwaste:             0
- owaste:     3,771,031

nPools=16:
- askedFor: 183,506,524
- alloc'd:  199,688,192 (1.09x)
- lwaste:    12,207,391
- swaste:        44,584
- nwaste:             0
- owaste:     3,929,693

-----------------------------------------------------------------------------
Linux improved (chunk = 64KB, page = 4KB)
-----------------------------------------------------------------------------
nPools=4:
- askedFor: 183,216,900
- alloc'd:  188,682,240 (1.03x)
- lwaste:       677,151
- swaste:     1,014,128
- nwaste:             0
- owaste:     3,774,061
Attachment #494946 - Flags: review?(jseward)
Looks good.  One comment:

+        // Try to fit in an existing small allocator
+        for (size_t i = 0; i < m_smallAllocationPools.length(); i++) {
+            ExecutablePool *pool = m_smallAllocationPools[i];
+            if (n < pool->available()) {
+                pool->addRef();
+                return pool;
+            }
+        }

Here you've got a first-fit strategy.  A bit of casual web surfing
suggests that worst-fit (check all chunks, use the one with the
largest free space) results in less fragmentation overall.  Did you
try that?
(In reply to comment #2)
> 
> Here you've got a first-fit strategy.  A bit of casual web surfing
> suggests that worst-fit (check all chunks, use the one with the
> largest free space) results in less fragmentation overall.
> Did you try that?

No, good suggestion.  Thinking about it now, I would have thought best-fit (use the pool with the least free space) would work better: big allocations cause the most problems, so catering for them gives the best results.  This situation differs from the usual one because it's a pool allocator, so memory isn't being freed again... I'll try both.
Best-fit is the easy winner:

first-fit
- askedFor: 183,975,061
- alloc'd:  189,534,208 (1.03x)
- lwaste:       644,671
- swaste:     1,027,288
- nwaste:             0
- owaste:     3,887,188

best-fit
- askedFor: 183,942,189
- alloc'd:  188,354,560 (1.02x)
- lwaste:       644,671
- swaste:        30,992
- nwaste:             0
- owaste:     3,736,708

worst-fit
- askedFor: 183,838,374
- alloc'd:  196,546,560 (1.07x)
- lwaste:       644,671
- swaste:     7,972,376
- nwaste:             0
- owaste:     4,091,139

This doesn't surprise me.  In this allocator, we waste the most space when we throw away a small pool that still has lots of space left in it, which happens when we do a moderately large allocation.  By using best-fit, we find the smallest hole that'll fit our required size, leaving bigger holes open for later.

Whereas in a normal allocator, unallocated space is never discarded, so small holes become a problem, so worst-fit works well because it avoids creating small holes.
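The best-fit search can be sketched roughly as below.  This is a simplified stand-in, not the patch itself: `Pool` is a hypothetical struct that only tracks free space, and the `n <= available()` comparison lets an exact fit succeed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ExecutablePool: tracks only free space.
struct Pool {
    size_t avail;
    size_t available() const { return avail; }
};

// Best-fit: among pools that can hold n bytes, pick the one with the
// least free space, leaving bigger holes open for later allocations.
Pool* bestFit(std::vector<Pool>& pools, size_t n) {
    Pool* best = nullptr;
    for (size_t i = 0; i < pools.size(); i++) {
        if (n <= pools[i].available() &&
            (!best || pools[i].available() < best->available())) {
            best = &pools[i];
        }
    }
    return best;  // nullptr: no small pool fits; allocate a new chunk
}
```

Worst-fit would flip the inner comparison to `>`, preferring the biggest hole, which is exactly what the measurements above show losing here.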
Attached patch patch v2 (against TM 58218:0b53fd11e374) (obsolete) (deleted) — Splinter Review
The same patch, this time using best-fit to choose a small pool.
Attachment #494946 - Attachment is obsolete: true
Attachment #495444 - Flags: review?(jseward)
Attachment #494946 - Flags: review?(jseward)
This patch has a more concise formulation of the best-fit search.  It also changes |n < pool->available()| to |n <= pool->available()| in the best-fit search, which seems reasonable.
Attachment #495444 - Attachment is obsolete: true
Attachment #495446 - Flags: review?(jseward)
Attachment #495444 - Flags: review?(jseward)
Attachment #495446 - Flags: review?(jseward) → review+
blocking2.0: --- → beta9+
http://hg.mozilla.org/tracemonkey/rev/8921e3faccd2
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/8921e3faccd2
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED