Closed Bug 1253131 Opened 9 years ago Closed 8 years ago

Avoid resizing IPC messages by precomputing the size

Categories

(Core :: IPC, defect, P1)

defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
e10s + ---
firefox45 --- wontfix
firefox46 --- wontfix
firefox47 --- wontfix
firefox48 --- wontfix

People

(Reporter: billm, Assigned: mccr8)

References

Details

(Whiteboard: [MemShrink:P2] btpp-backlog)

Attachments

(1 file, 2 obsolete files)

I was doing some testing today of my patch in bug 1235633 and ran into another issue. Let's say we have a message that contains two strings. The first string is 20MB and the other string is 8 bytes. Writing the first string into the message causes the Pickle class to allocate ~20MB:
https://dxr.mozilla.org/mozilla-central/source/ipc/chromium/src/base/pickle.cc#523

Then when we go to write the final 8 bytes, we call the same function:
https://dxr.mozilla.org/mozilla-central/source/ipc/chromium/src/base/pickle.cc#523
This time, capacity_ is ~20MB and needed_size is ~20MB + 8 bytes. So we allocate 40MB.

I know that doubling is usually a good strategy for allocation, but I feel like it doesn't make a ton of sense here. Note that we also want to handle the case where we're writing tons of small items to a Message.

Nick, do you have any ideas about what to do here? I know you've worked on bugs like this in the past.
Flags: needinfo?(n.nethercote)
Whiteboard: [MemShrink]
The JS engine has a hybrid approach for slots, IIRC: double up to a certain size (e.g. 1 MiB) and then increase by 1.125x beyond that. So that's one possibility.

I'd suggest doing some ad hoc profiling (with printfs) to see what kind of patterns occur frequently in practice. That might give you some other ideas.
Flags: needinfo?(n.nethercote)
Can't we have the ipdl generated code inspect the string lengths ahead of time and then call Pickle::Resize() before doing the writes?
(I'm marking backlog but that shouldn't give an impression of priority or anything)
Whiteboard: [MemShrink] → [MemShrink] btpp-backlog
I'll take a look. billm, any suggestions on how to reproduce this occurrence?
Assignee: nobody → n.nethercote
Flags: needinfo?(wmccloskey)
Whiteboard: [MemShrink] btpp-backlog → [MemShrink:P2] btpp-backlog
I did it pretty artificially. At this point:
https://hg.mozilla.org/mozilla-central/file/4657041c6f77/dom/ipc/ContentParent.cpp#l2518
I changed the buildID string to be much bigger. I think I just concatenated it with itself 20 or 30 times. Then I looked for any large mallocs.

Ben is right, we could make a pass over the data to send and compute the exact size. It would be a lot of work though. Every custom ParamTraits would have to implement a new method to compute size, I think. Example:
https://hg.mozilla.org/mozilla-central/file/4657041c6f77/chrome/RegistryMessageUtils.h#l71
Flags: needinfo?(wmccloskey)
Blocks: e10s-oom
I don't know that doing the exact size would be that hard.  We basically add support for the primitives types and then add support for a routine that iterates over children sizes.  Seems pretty straightforward other than having to muck with the ipdl code generator.

I would work on this, but I honestly don't know when I will have time.
That'll work for writes but not reads. Maybe there's something else that could do that. (Or already does, I haven't looked at Pickle in great detail.)
Well, I thought this was about writes based on the links in comment 0.

I'm not sure why we would be writing bytes into a variable length buffer for the reading case.  It seems data should be able to get serialized into structures directly from a fixed sized read buffer.

If the wire protocol does not include length information, then that is insane and we should fix it. IMO.
(In reply to Ben Kelly [:bkelly] from comment #6)
> I don't know that doing the exact size would be that hard.  We basically add
> support for the primitives types and then add support for a routine that
> iterates over children sizes.  Seems pretty straightforward other than
> having to muck with the ipdl code generator.

I don't really see how that would work. We have lots of custom serializers in the tree (search for ParamTraits to see them all). I think we would have to update all of those, which would be a lot of work.
Ok, I forgot about custom serializers.  But still, those extra methods should be basically GetSize() + GetSize() + GetSize().  Tedious, but not difficult.
(In reply to Ben Kelly [:bkelly] from comment #8)
> If the wire protocol does not include length information, then that is
> insane and we should fix it. IMO.

There's a length; see Pickle::Header.

(In reply to Bill McCloskey (:billm) from comment #9)
> I don't really see how that would work. We have lots of custom serializers
> in the tree (search for ParamTraits to see them all). I think we would have
> to update all of those, which would be a lot of work.

Having the length method return a Maybe<uint32_t> and having a fallback to dynamic allocation would allow adding the methods incrementally.
(In reply to Jed Davis [:jld] from comment #11)
> Having the length method return a Maybe<uint32_t> and having a fallback to
> dynamic allocation would allow adding the methods incrementally.

Yeah, it is only a big deal for big messages (at least in terms of memory usage). And telemetry should help us tell which messages are big.
OK, that's a good point. Bug 1260908 should tell us what we need to care about (although it only starts recording at 8K; we might want to tune that).
tracking-e10s: --- → ?
Summary: Messages can end up allocating much more space than needed → IPC messages can end up allocating much more space than needed
Priority: -- → P1
Maybe this is overly complex or wrong, but here's a prototype of part of what you need for bkelly's idea in comment 2.

This defines some template stuff so that you can do |ComputeSizeOrNot::Get(3, "4", 5);|, and it will calculate the size of each argument by calling ParamTraits::Size() on them all and summing it up, or will evaluate to Nothing() without examining any argument if any argument has no Size() defined. This would make it easier to generate the IPDL code to precomputed message sizes.
Assignee: n.nethercote → nobody
Assignee: nobody → continuation
I hacked up a local version of bug 1260908 and ran it on YouTube, and PHttpChannel::Msg_OnTransportAndData seemed like the most common thing, so I'll use that as my initial focus.
(In reply to Andrew McCreight [:mccr8] from comment #14)
> This defines some template stuff so that you can do
> |ComputeSizeOrNot::Get(3, "4", 5);|, and it will calculate the size of each
> argument by calling ParamTraits::Size() on them all and summing it up, or
> will evaluate to Nothing() without examining any argument if any argument
> has no Size() defined. This would make it easier to generate the IPDL code
> to precomputed message sizes.

I think this function should return something even if an argument has no Size() method defined, so Size() is more like a lower bound on how much space a message could take up.  So:

  ComputeSize::Get(3, "4", 5);

would return 8 even if we didn't know how to compute the size of a string.  Doing this makes it more robust.
(In reply to Nathan Froyd [:froydnj] from comment #16)
> Doing this makes it more robust.

My concern with doing this is that underestimating by a little bit will cause clownshoe problems (ie internal fragmentation could end up being around 50%), so we'd be better off with doubling (where we would expect the average internal fragmentation to be around 25%).
Attachment #8737965 - Attachment is obsolete: true
Blocks: 1262918
No longer blocks: e10s-oom
Just looking at PHttpChannelParent::SendOnTransportAndData(), the current heuristic seems to generally work well. This message has only a single large data field, followed by a few smaller ones, so usually it just gets resized up to the right size when we write the string. There's a little bit of slop for some reason, so most of the time we can finish out writing the message.

However, it isn't too uncommon to end up with some messages like this:
  actual size = 31496; actual capacity = 62976; malloc size of = 65536
  actual size = 260036; actual capacity = 520064; malloc size of = 520192

As expected, my prototype for this patch eliminates these.

Of course, a much easier way to fix this (most of the time) would be to reorder the arguments, so that the big string is last! In fact, that's such an easy fix I should probably just do that first, for this and other common messages. "A big chunk of data and a few little fields" seems like a common pattern. In fact, just reordering the arguments to OnTransportAndData() so that the data arg is last eliminates almost all of the IPC::Message doubling I see when loading www.cnn.com.
Andrew, this is exactly what I was thinking we should be doing here. Is there any reason *not* to take this now?
Flags: needinfo?(continuation)
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #20)
> Is there any reason *not* to take this now?

I've been working on this. Note that bug 1263235 will get us much of the benefit of this bug. (Also, at most this will cut memory usage of IPC messages in half, which would help but probably isn't enough.)
Flags: needinfo?(continuation)
(In reply to Andrew McCreight [:mccr8] from comment #21)
> (In reply to Brad Lassey [:blassey] (use needinfo?) from comment #20)
> > Is there any reason *not* to take this now?
> 
> I've been working on this. Note that bug 1263235 will get us much of the
> benefit of this bug. (Also, at most this will cut memory usage of IPC
> messages in half, which would help but probably isn't enough.)

Yup, bug 1263235 will certainly help, but I'm concerned about people adding more messages going forward that don't follow that pattern
This seems to work, at least with some light local testing on PHttpChannelParent::SendOnTransportAndData(). More Size() methods need to be defined in order for it to work for other interesting messages like PBrowserStreamParent::SendWrite and the various SendAsyncMessage. The latter involves CpowEntry, so further IPDL codegen work to support Size() IPDL unions will probably be needed. Assuming those ParamTraits are codegenned.
Attachment #8738260 - Attachment is obsolete: true
(In reply to Andrew McCreight [:mccr8] from comment #23)
> Created attachment 8740162 [details] [diff] [review]
> Prototype of automatically pre-sizing IPC messages.
> 
> This seems to work, at least with some light local testing on
> PHttpChannelParent::SendOnTransportAndData(). More Size() methods need to be
> defined in order for it to work for other interesting messages like
> PBrowserStreamParent::SendWrite and the various SendAsyncMessage. The latter
> involves CpowEntry, so further IPDL codegen work to support Size() IPDL
> unions will probably be needed. Assuming those ParamTraits are codegenned.

IPDL doesn't codegen ParamTraits, but uses method overloading to do something sort of similar.  Codegenning ParamTraits in IPDL would be fantastic, though; it'd get rid of special cases and a somewhat confusing scheme.
Going back to the growth ratio of containers.  For containers of at least size 4096 bytes, I think a smaller growth ratio of 1.3125 or 1.46484375 is better to prevent fragmentation.  1.3125 ratio can be calculated with:

capacity += (capacity>>2) + (capacity>>4)

1.46484375 ratio can be calculated with:

capacity += (capacity>>1) - (capacity>>5) - (capacity>>8)

Growth ratio 1.3125 is true for this equation:

1 + ratio >= ratio^3

1 + 1.3125 >= 2.26

Which means when growing a container, when doing the next allocation, the size of the next allocation is smaller than the sum of the two prior allocations, so the space used by the prior two allocations can be reused for the next allocation.

Here is the sequence:
4096 5376 7056 9261 12154 15951 20934 27475 36060 47328 62118 81529
107006 140444 184332 241935 317538 416768 547008 717948 942306 1236776
1623268 2130539 2796331 3670183 4817114 6322461 8298229 10891425
14294995 18762180 24625361 32320786 42421031 55677602 73076852
95913368 125886295 165225761 216858811 284627188 373573184 490314804
643538180 844643861 1108595067 1455031024 1909728219

So, when growing from capacity 7056 to 9261, 9261 is less than (4096+5376).

Alternatively, for 1.46484375 (1 + 2^-1 - 2^-5 - 2^-8), it meets the equation:

1 + ratio + ratio^2 >= ratio^4 

So, the next allocation is smaller than the prior three allocations.  The sequence:

4096 6000 8790 12877 18863 27632 40478 59295 86859 127235 186379
273016 399927 585831 858151 1257057 1841392 2697353 3951201 5787892
8478359 12419472 18192587 26649298 39037059 57183192 83764443
122701822 179738999 263289550 385678053 564958086 827575322 1212268539
1775783994

Any ratio from 1 < x < Golden Ratio would allow reuse of old allocations when growing a container, but I think the two ratios I gave are worth evaluating.  Any ratio >= the Golden Ratio cannot reuse old allocations when growing a container.

For small allocations, sizes less than 4096 bytes, the allocator already has separate pages for each size class.

Program that generated the sequence:

import textwrap

capacity = 4096
c = list()
while capacity < (1<<31):
        c.append(capacity)
        capacity += (capacity>>2) + (capacity>>4)
        #capacity += (capacity>>1) - (capacity>>5) - (capacity>>8)

for s in textwrap.wrap(' '.join(str(i) for i in c)):
    print s
for i in xrange(3, len(c)-1):
    assert c[i-2] + c[i-1] >= c[i+1]
    #assert c[i-3] + c[i-2] + c[i-1] >= c[i+1]
Thanks for the comment. I filed bug 1263953 about reducing the growth rate, so we can leave this bug to be about exact sizing. Both are probably worth doing. Note that above 1MB it probably doesn't matter too much what the growth is because we're going to start crashing due to a lack of contiguous address space on 32-but systems. :)
32-bit, I mean.
Summary: IPC messages can end up allocating much more space than needed → Avoid resizing IPC messages by precomputing the size
(In reply to Andrew McCreight [:mccr8] from comment #26)
> Note that above 1MB it probably doesn't matter too much what the
> growth is because we're going to start crashing due to a lack of contiguous
> address space on 32-but systems. :)

Would it make sense to split up these big messages to avoid this problem? These enormous messages and huge allocations are giving me stress...
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #28)
> Would it make sense to split up these big messages to avoid this problem?
> These enormous messages and huge allocations are giving me stress...

Yes, where possible. I filed bugs for individual common large messages, also blocking bug 1262918.
I've been working on this, but I've gotten bogged down in template stuff. I talked to Bill about this, and if he can make bug 1262671 work, then this becomes mostly a non-issue, so I'm going to put this aside for now to focus on things like analyzing our telemetry and crash info to see if there might be some other issue besides big messages.
I'll reopen later if this seems worthwhile.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: