addMessage in reducers/messages.js has costly by Map constructor
Categories
(DevTools :: Console, defect, P2)
Tracking
(firefox99 fixed)
Tracking | Status | |
---|---|---|
firefox99 | --- | fixed |
People
(Reporter: Harald, Assigned: nchevobbe)
References
(Blocks 1 open bug)
Details
(Whiteboard: dt-console-perf-2022)
Attachments
(2 files, 1 obsolete file)
In the scripts spends 10 seconds creating Map instances and adding pairs. https://perfht.ml/2n1nRSa I guess the overhead is GC/Nursery, as discussed in bug 1414385, but I just wanted to make sure to have this filed if there is more that can be optimized here.
Reporter | ||
Comment 1•7 years ago
|
||
ni? sfink to confirm if the reason is just nursery overflow.
Assignee | ||
Comment 2•7 years ago
|
||
I think patch from https://bugzilla.mozilla.org/show_bug.cgi?id=1425577 will improve this
Updated•7 years ago
|
Comment 3•7 years ago
|
||
Hm, I'm confused by the profile. In the marker view, I see some long major GCs (3sec). Lots of minor GCs, but not a huge amount of total time. In the call tree view, I don't see the GCs at all. Instead, it's just complaining about non-GC Map population. There's some time in the pre-barrier, but it's not much. I need to talk to a profiler person about that confusion. (If I double-click on the 3sec major GC, it zooms into a time range that doesn't seem to have much GC going on. Perhaps the marker time correspondences are off?) Anyway, the bulk of the time does appear to be Map population. I think it's probably largely from the code pattern newState.messagesById = (new Map(newState.messagesById)).set(newMessage.id, addedMessage); which looks to me to be a faithful implementation of immutable data which seems like it'll be O(n²) time. (If all versions were kept live, it would be O(n²) space as well, but I would expect all of the old versions to be immediately dead, so it's "just" O(n²) allocations that the GC has to clean up, probably in O(n²) time because I think this needs finalizers so the nursery won't save us.) Which suggests to me that comment 2 is correct about the way to "fix" this, though it does bring to mind speculation as to how this pattern could be implemented efficiently. I think I'll fork off another bug: bug 1432565.
Reporter | ||
Comment 4•7 years ago
|
||
> I need to talk to a profiler person about that confusion. (If I double-click on the 3sec major GC, it zooms into a time range that doesn't seem to have much GC going on. Perhaps the marker time correspondences are off?)
Can you share a link of your selection?
Comment 5•7 years ago
|
||
I'm pretty sure it was confusion on my side. The 3sec GCMajor meant that it was a major GC spanning 3sec; reading the popup further, I see that "Total slice times" is substantially less (221ms for a 2.2sec GC in the example below). I'm asking pbone for clarification, but if I'm correct, I will file a bug suggesting that this be described as "221ms major GC spanning 2261ms" or perhaps more tersely "221ms over 2261ms MajorGC". 2.2sec example: https://perf-html.io/public/f85644116457dc4dff2a96e5dd71d06483d96f3a/marker-chart/?hiddenThreads=&range=7.7472_11.2689&thread=0&threadOrder=0-2-3-4-1&v=2
Comment 6•7 years ago
|
||
Steve is right, the GC markers show the elapsed time of the GC event. not the time spent in the GC. Also the slice time doesn't show all the time spent in the GC since some time is off-thread. I find profiles like this very interesting, because (as a fan of FP & immutability) I'd like to be able to handle them well. It's totally fine that there are 1000s of GCMinor events (given the allocation rate). Notice that the GC has tuned the nursery size really low because very few objects get tenured. Where we could improve this profile is in the speed the Nursery can be collected, it looks like poisioning the nursery is taking up a fair amount of time there, Disabling that will give you a more realistic profile. Next to that I think (but I can't see it here) tenuring an object should usually take the most time. it's interesting that although exactly 704 bytes are tenured each collection, the collection times can range from 0.5ms to about 3ms.
Updated•6 years ago
|
Assignee | ||
Updated•6 years ago
|
Reporter | ||
Comment 7•5 years ago
|
||
Related bug 1599604 saved off 30% off long hangs in the Network panel.
Assignee | ||
Updated•3 years ago
|
Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 8•3 years ago
|
||
I replaced the Map
s with plain object and damp reported no improvement.
I think the changes in the webconsole reducers + potential Spidermonkey improvements made this issue irrelevant
Assignee | ||
Comment 9•3 years ago
|
||
Re-opening this as we're still seeing Map constructor taking some time in https://profiler.firefox.com/public/e0md8rf8vhybx5xhvh9ze5w3s0e5njzfg9nbyt0/calltree/?globalTrackOrder=0w2&implementation=js&localTrackOrderByPid=34276-0&thread=0&timelineType=cpu-category&transforms=df-229~df-298~df-839~df-459~df-3090~df-837~df-278&v=6 (+GCs in other places)
So any improvement might not be picked up by DAMP in the current state of the console, but might be more visible once we have Bug 1753177 and we want to shave-off any other "small" issues to make the console fast
Or maybe DAMP don't stress this path enough
Assignee | ||
Updated•3 years ago
|
Comment 10•3 years ago
|
||
Could we possibly revisit the usage of cloneState
:
https://searchfox.org/mozilla-central/rev/072f9e6b7f10a00e12d0a02ac713431d0a7ee368/devtools/client/webconsole/reducers/messages.js#91-106
First it looks weird as it duplicates all the attributes of state
, no matter if we only actually really modify some of them.
Then, no matter what we do (Map versus Object), we are going to ask the engine to duplicate very large lists.
So, while we might find a slightly faster API to clone massive list of objects, this may always be a performance bottleneck.
Could we use mutable objects? At least for all the list of messages, or very long arrays/maps/objects we are using here?
Note that I would be fine if this critical part of the console go off from typical React/Redux principles as they rarely scale well with infinite lists.
Assignee | ||
Comment 11•3 years ago
|
||
(In reply to Alexandre Poirot [:ochameau] from comment #10)
Could we possibly revisit the usage of
cloneState
:
https://searchfox.org/mozilla-central/rev/072f9e6b7f10a00e12d0a02ac713431d0a7ee368/devtools/client/webconsole/reducers/messages.js#91-106First it looks weird as it duplicates all the attributes of
state
, no matter if we only actually really modify some of them.
Then, no matter what we do (Map versus Object), we are going to ask the engine to duplicate very large lists.
So, while we might find a slightly faster API to clone massive list of objects, this may always be a performance bottleneck.
Could we use mutable objects? At least for all the list of messages, or very long arrays/maps/objects we are using here?
react-redux does need different object references to fire a new render, so some of the object creation we do are mandatory.
With that being said, I don't think there's benefit in mutating messagesById
, as it only serve as a storing Map which we get the messages info from. What would be mandatory to clone, I think, is the visibleMessages
array (and I think arrays are much more optimized)
That's something we talked about in the past, but some folks were concerned about not respecting the "Redux-way". I feel like we don't have to be dogmatic if we get improvement in the end without hurting readibility/conventions too much (and we'd document it)
Assignee | ||
Comment 12•3 years ago
|
||
The property is renamed, prefixed with mutable
to better highlight that it is
special; the associated selector is renamed as well.
The only change necessary was adding a messageCount
prop to the ConsoleOutput
component so its lifecycle method could check if the number of messages changed.
Comment 13•3 years ago
|
||
(In reply to Nicolas Chevobbe [:nchevobbe] from comment #11)
What would be mandatory to clone, I think, is the
visibleMessages
array (and I think arrays are much more optimized)
Because of the need to have different references? Could we just wrap visibleMessages
in an object with a generation counter or something?
Assignee | ||
Comment 14•3 years ago
|
||
(In reply to Doug Thayer [:dthayer] (he/him) from comment #13)
(In reply to Nicolas Chevobbe [:nchevobbe] from comment #11)
What would be mandatory to clone, I think, is the
visibleMessages
array (and I think arrays are much more optimized)Because of the need to have different references?
yes, react-redux checks references to trigger the component render cycle, and also, some component lifecycle hook (and some middleware) need to compare the "previous" and the "next" state to do some operation (e.g. scroll to bottom when there was a new evaluation result message added).
If you don't have different references, you won't be able to compare those.
Could we just wrap
visibleMessages
in an object with a generation counter or something?
we could probably get creative if we need to. With the patch attached, cloneState
, which was showing up in profile with longish MapConstructorInit
doesn't show up that much
When profiling Adding 10000 messages (only one run, so data might be inaccurate, but still):
- with the mutable Map,
cloneState
takes ~10ms ( profile ) - it was taking ~100ms with the immutable one (profile )
we can see some functions taking a good share of the time spent in the reducer (e.g. getWarningType
, which is why I filed Bug 1754453).
Assignee | ||
Comment 15•3 years ago
|
||
This property is never consumed (and shouldn't be) by any component,
so we can safely mutate it.
Depends on D138293
Comment 16•3 years ago
|
||
(In reply to Nicolas Chevobbe [:nchevobbe] from comment #14)
Could we just wrap
visibleMessages
in an object with a generation counter or something?we could probably get creative if we need to. With the patch attached,
cloneState
, which was showing up in profile with longishMapConstructorInit
doesn't show up that much
After having reviewed your patches I understand all this a bit more.
It looks like messagesById
and visibleMessages
are possibly and typicaly of the same size. Capped at logLimit = 10000 elements.
At first I thought that messagesById
could be very large and contain all the messages received from the server,
but no, we cap it to logLimit.
messagesById
is basically just an helper map to be used alongside visibleMessages
in order to get the reference to the message for each id stored in visibleMessages
.
Is this all right?
Then, I looked at the retained size of one array of 10k elements via the memory panel.
And the retained size is always 131kB if I store numbers or objects in it.
It means that both arrays probably suffer from the same performance penalty.
I draw two conclusions here:
- We could probably merge
visibleMessages
andmessagesById
.
Redux shallow comparison is going to work as well against number and message objects.
The messages have one unique object. So comparing two messages should work fine.
We end up passing these two arrays to the react components for the sake of being able to eventually have access to the message when required.
This force to:
- expose the two lists via selectors,
- pass down the map and the messageId down to all nested react components, whereas we could pass only the message
Note that if that's any helpful we can add an id
attribute to all message objects.
While doing this wouldn't fix anything regarding performance, it could help bullet point 2 by avoiding to optimize the two arrays.
- These two arrays should not be cloned.
They are of the same size in term of number of elements and size in memory.
Doug comment about having a generation counter is a typical way to handle this.
We use a mutable array internaly to the reducer, but the selector will use the generation counter to update itself correctly.
See the profiler project using this in a couple of places:
https://github.com/firefox-devtools/profiler/blob/7e0be227b7a0aeb5935778af89e1c11be657ccdc/src/reducers/profile-view.js#L461-L474
It is basically about having this in the reducer:
ADD_MESSAGE:
state.mutableList.push(newMessage);
state.listGeneration++;
And this in the selector:
getVisibleMessages = createSelector(
state => state.listGeneration,
state => state.mutableList, // <== this will always be the same ref, so only listGeneration increment will force updating the selector.
(generation, list) {
return list;
});
Bonus comments:
- there might be uncovered or useless code in the
removeMessagesFromState
method:
https://coverage.moz.tools/#revision=latest&path=devtools/client/webconsole/reducers/messages.js&view=file&line=937 - the maintenance of
visibleMessages
is quite something :)
I'm wondering if that could be made simplier if we were storing an ordered array of all messages.
IfvisibleMessages
becomes a list of messages, we not longer need the map.
Instead of the map, we could store the list of messages in the order they should be displayed (if they were all displayed). - Still about the maintenant of
visibleMessages
.
setVisibleMessages
looks almost trivial compared to the equivalent code inaddMessage
setVisibleMessages
:
https://searchfox.org/mozilla-central/rev/9f61d854547cedbde0773b2893e4f925352be3b3/devtools/client/webconsole/reducers/messages.js#698-792
addMessage
:
https://searchfox.org/mozilla-central/rev/9f61d854547cedbde0773b2893e4f925352be3b3/devtools/client/webconsole/reducers/messages.js#109-344
I imagine addMessage is significantly more complex as we try to shift in a new message in the existing list, while setVisibleMessage recompute the whole list.
But beyond maintaining a virtual ordered list... is there anything we can do to simplify addMessage?
This sounds like a viewport, may be, while we are in the topic of viewports... we can simplify this code.
Assignee | ||
Comment 17•3 years ago
|
||
(In reply to Alexandre Poirot [:ochameau] from comment #16)
After having reviewed your patches I understand all this a bit more.
It looks like
messagesById
andvisibleMessages
are possibly and typicaly of the same size. Capped at logLimit = 10000 elements.
At first I thought thatmessagesById
could be very large and contain all the messages received from the server,
but no, we cap it to logLimit.
messagesById
is basically just an helper map to be used alongsidevisibleMessages
in order to get the reference to the message for each id stored invisibleMessages
.
Is this all right?
Not really, I think you overlooked some details :)
messagesById
will hold all the resources we are watching. Each resource (message in console lingo), is assigned an id (https://searchfox.org/mozilla-release/rev/30f8a5a2dd07280e4445f566980522fe876bec70/devtools/client/webconsole/utils/messages.js#85). Its size is capped at the value of devtools.hud.loglimit
(10000 by default).
visibleMessages
is an array of all the message ids that need to be displayed in the console output. i.e. There are multiple reasons messages that are in messagesById
wouldn't be in visibleMessages
: the user filtered the output with text, or turned off some categories, Show content messages
is disabled, the message is inside a collapsed console.group
/console.groupCollapsed
/warning Group, …
So we need to have those 2 informations: the pool of all the messages received by all the watched target, and from that list, all the messages that should be displayed in the console output.
Then, I looked at the retained size of one array of 10k elements via the memory panel.
And the retained size is always 131kB if I store numbers or objects in it.
It means that both arrays probably suffer from the same performance penalty.I draw two conclusions here:
- We could probably merge
visibleMessages
andmessagesById
.
Not really. In one of the very early version of the console, we had a visible
propery directly in the message, in messagesById
, and visibleMessages
wasn't a thing. Managing the global map was more challenging in the end, and having smaller properties that are just holding references to messages is much simpler (we have the same things for handling groups, knowing if the message was "expanded" or not, …)
Redux shallow comparison is going to work as well against number and message objects.
The messages have one unique object. So comparing two messages should work fine.
We end up passing these two arrays to the react components for the sake of being able to eventually have access to the message when required.
This force to:
- expose the two lists via selectors,
- pass down the map and the messageId down to all nested react components, whereas we could pass only the message
We only pass those 2 to the ConsoleOutput, then for "message" components, we pass a getMessage
function I think. This was done at some point due to perf reason, but we might revisit that.
Note that if that's any helpful we can add an
id
attribute to all message objects.
I'm not sure what you mean here, since we already have an id
property on messages?
While doing this wouldn't fix anything regarding performance, it could help bullet point 2 by avoiding to optimize the two arrays.
- These two arrays should not be cloned.
They are of the same size in term of number of elements and size in memory.
Doug comment about having a generation counter is a typical way to handle this.
We use a mutable array internaly to the reducer, but the selector will use the generation counter to update itself correctly.See the profiler project using this in a couple of places:
https://github.com/firefox-devtools/profiler/blob/7e0be227b7a0aeb5935778af89e1c11be657ccdc/src/reducers/profile-view.js#L461-L474
It is basically about having this in the reducer:ADD_MESSAGE: state.mutableList.push(newMessage); state.listGeneration++;
And this in the selector:
getVisibleMessages = createSelector( state => state.listGeneration, state => state.mutableList, // <== this will always be the same ref, so only listGeneration increment will force updating the selector. (generation, list) { return list; });
Looking at the profile with the patch attached to this bug (not mutating the Map), we are now spending very little time in cloneState
, but we could definitely have a follow up to not mutate visibleMessages
and see the impact.
Bonus comments:
- there might be uncovered or useless code in the
removeMessagesFromState
method:
https://coverage.moz.tools/#revision=latest&path=devtools/client/webconsole/reducers/messages.js&view=file&line=937- the maintenance of
visibleMessages
is quite something :)
I'm wondering if that could be made simplier if we were storing an ordered array of all messages.
IfvisibleMessages
becomes a list of messages, we not longer need the map.
Instead of the map, we could store the list of messages in the order they should be displayed (if they were all displayed).
In the patch where I was turning messagesById
into an object, I had an other array to store the ids in the order the messages should be in (as objects can't guarantee order). Here we rely on the order of the message in messagesById
, which is then reflected in visibleMessages
. Since we do need those 2 info, it might be an overkill to have yet another property for the order?
- Still about the maintenant of
visibleMessages
.
setVisibleMessages
looks almost trivial compared to the equivalent code inaddMessage
setVisibleMessages
:
https://searchfox.org/mozilla-central/rev/9f61d854547cedbde0773b2893e4f925352be3b3/devtools/client/webconsole/reducers/messages.js#698-792
addMessage
:
https://searchfox.org/mozilla-central/rev/9f61d854547cedbde0773b2893e4f925352be3b3/devtools/client/webconsole/reducers/messages.js#109-344
I imagine addMessage is significantly more complex as we try to shift in a new message in the existing list, while setVisibleMessage recompute the whole list.
But beyond maintaining a virtual ordered list... is there anything we can do to simplify addMessage?
This sounds like a viewport, may be, while we are in the topic of viewports... we can simplify this code.
There's probably work we can do to make addMessage
simpler, move things around a bit, but we need to be careful.
addMessage
is complex, but adding messages are more complex that one might think:
- some messages we receive are just "indicators" on which we need to update the state (e.g.
console.groupEnd
message need to update the current groups,console.clear
for clearing the output, …) - messages could trigger a reordering or grouping (e.g. receiving a second warning that would be part of a warning group)
- message could come not in the right order: e.g. evaluation result of snippet containing
console.log
calls. The logs need to be displayed before the result, but they might come first as we get them through resources, whereas the result is received from the webconsole front. - eventually, removing old messages if we hit the limit (We might revisit this in https://bugzilla.mozilla.org/show_bug.cgi?id=1754996)
Comment 18•3 years ago
|
||
(In reply to Nicolas Chevobbe [:nchevobbe] from comment #17)
Not really, I think you overlooked some details :)
So we need to have those 2 informations: the pool of all the messages received by all the watched target, and from that list, all the messages that should be displayed in the console output.
Thanks for the detailed description.
But be careful about considering that visibleMessages
is smaller than messagesById
.
When you execute the following snippet on a regular webconsole without any particular filter:
for(var i = 0; i < 10000; i++) console.log(i)
Then:
messagesById.size == 10000
and visibleMessages.length == 9999
(weird that both aren't capped at the same limit)
I draw two conclusions here:
- We could probably merge
visibleMessages
andmessagesById
.Not really. In one of the very early version of the console, we had a
visible
propery directly in the message, inmessagesById
, andvisibleMessages
wasn't a thing. Managing the global map was more challenging in the end, and having smaller properties that are just holding references to messages is much simpler (we have the same things for handling groups, knowing if the message was "expanded" or not, …)
Sorry for not being clearer, but my description was about how the React component was using visibleMessages+messagesById.
If we were to pass only a list of visible messages to React (instead of their IDs + the map of ID to messages),
then we could avoid using messagesById from React.
I do understand that the reducer still needs messagesById, or at least, the list of all the unfiltered messages.
This isn't critical for performance, but this would:
- really restrict the usage of this mutable dataset within the scope of the reducer. Now this is clearer that's actually really not used by the UI.
- we would no longer use it on selectors and so the fact that's mutable is no longer an issue
- simplify the react components by removing the getMessage workaround.
We can followup on that, with your current patch at least we address the perf issue for this first data set.
Then, we could revisit the API between React and its reducer.
Note that if that's any helpful we can add an
id
attribute to all message objects.I'm not sure what you mean here, since we already have an
id
property on messages?
I thought these was resource objects coming from ResourceCommand, which do not have an id
attribute.
I missed that we were mapping them to another type of object.
Looking at the profile with the patch attached to this bug (not mutating the Map), we are now spending very little time in
cloneState
, but we could definitely have a follow up to not mutatevisibleMessages
and see the impact.
Note that the cost of Maps is made super visible thanks to some c++ implementation detail.
This intermediate MapConstructorInit
c++ method makes it crystal clear.
So the cost of messagesById cloning is very visible.
While the cost of array duplication seems to be part of the "self duration" of each function doing a copy.
It should be meld into the self duration of messages
and removeMessagesFromState
functions.
See this profile:
https://share.firefox.dev/3rQ2J3n
This records array cloning:
data:text/html,<script>var o = [], x=null; for(var i = 0; i < 1000000; i++) o.push(i); window.onclick=()=>{x = [...o]; }</script>
You will see that it is much harder to find the culprit here. The self duration of onclick is pretty high.
Compared to this other profile with maps:
data:text/html,<script>var o = new Map(), x=null; for(var i = 0; i < 1000000; i++) o.set(i, i); window.onclick=()=>{x = new Map(o); }</script>
https://share.firefox.dev/3Ly66nh
Where MapConstructorInit is super visible.
But this also confirms that cloning arrays is much faster than cloning maps.
Which makes the possible perf bottleneck of visibleMessages also less important than the one related to messagesById.
Not because the two dataset are of different size, but solely because one is an array and the other a map.
- the maintenance of
visibleMessages
is quite something :)
I'm wondering if that could be made simplier if we were storing an ordered array of all messages.
IfvisibleMessages
becomes a list of messages, we not longer need the map.
Instead of the map, we could store the list of messages in the order they should be displayed (if they were all displayed).In the patch where I was turning
messagesById
into an object, I had an other array to store the ids in the order the messages should be in (as objects can't guarantee order). Here we rely on the order of the message inmessagesById
, which is then reflected invisibleMessages
. Since we do need those 2 info, it might be an overkill to have yet another property for the order?
In this suggestion, we no longer need the Map (ID=>message).
So visibleMessages is a list of messages and messagesById become this sorted array of all the messages.
There's probably work we can do to make
addMessage
simpler, move things around a bit, but we need to be careful.
addMessage
is complex, but adding messages are more complex that one might think:
- some messages we receive are just "indicators" on which we need to update the state (e.g.
console.groupEnd
message need to update the current groups,console.clear
for clearing the output, …)- messages could trigger a reordering or grouping (e.g. receiving a second warning that would be part of a warning group)
- message could come not in the right order: e.g. evaluation result of snippet containing
console.log
calls. The logs need to be displayed before the result, but they might come first as we get them through resources, whereas the result is received from the webconsole front.- eventually, removing old messages if we hit the limit (We might revisit this in https://bugzilla.mozilla.org/show_bug.cgi?id=1754996)
I understand messages management is complex. But do you know why setVisibleMessages is significantly simplier than addMessage?
Phrased differently: why it is significantly more complex to add one new message compared to recompute the whole list of visible messages from scratch ?
Or are these 4 bullet points actions that are applied only once against the state objects and then setVisibleMessages no longer have to care about this transient/indicator messages ?
If so. It might be helpful to process these two types of messages distinctly. It would help review that the algorithm to add a new message to visibleMessages is actually matching the algorithm we have to recompute visibleMessage from scratch.
The perfect setup to be in would be to actually reuse lots of code between the two codepath. We already use getMessageVisibility
in common, but we might grow code-reuse.
Updated•3 years ago
|
Assignee | ||
Comment 19•3 years ago
|
||
Depends on D138293
Comment 20•3 years ago
|
||
Pushed by nchevobbe@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/6714473d1152 [devtools] Mutate messagesById in console redux state. r=ochameau. https://hg.mozilla.org/integration/autoland/rev/55f151a24885 [devtools] Avoid cloning object in console messages reducer. r=ochameau.
Comment 21•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6714473d1152
https://hg.mozilla.org/mozilla-central/rev/55f151a24885
Description
•