Closed Bug 1565405 Opened 5 years ago Closed 5 years ago

Async actor construction failures don't notify anywhere

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla70

Tracking Flags:

Tracking

Status

firefox70

---

fixed

People

(Reporter: mattwoodrow, Assigned: nika)

References

Details

Attachments

(1 file)

Bug 1565405 - Fail async ctor call if `!Manager()->CanSend()`, 5 years ago Nika Layzell [:nika] (ni? for response) (deleted), text/x-phabricator-request		Details

Matt Woodrow (:mattwoodrow)

Reporter

Description

•

5 years ago

As discovered in bug 1562616, returning nullptr from an AllocPActor function (for an async ctor) doesn't cause any failures.

It looks like the generated code in OnMessageReceived just returns a MsgValueError, MaybeHandleError prints that to stderr (not in crash reports), and then the default ProcessingError() impl does nothing with it.

This means that by default, failing to construct an async actor isn't observable in any way.

Bug 1191976 made us crash the receiving process whenever we fail to deserialize a message, which includes the case where we had a reference to the actor that failed to create.

This makes it really easy to crash the parent by using an actor that wasn't actually valid, and the resulting crash report only points to the first use of it.

Should we also crash the parent when the actor fails to create, so that we get better stacks for the actual cause?

Matt Woodrow (:mattwoodrow)

Reporter

Updated

•

5 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1562616

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Updated

•

5 years ago

Flags: needinfo?(nika)

Nika Layzell [:nika] (ni? for response)

Assignee

Comment 1

•

5 years ago

Unfortunately IPC is full of footguns like this, and I'm still trying to figure out how to handle them in the general case.

We really don't want to crash when something fails to deserialize, ideally we'd just kill the content process. Unfortunately, as bug 1191976 found, that causes issues where we get useless content process crash reports with no info about what failed or why. If there was a way to annotate the content process' crash report before crashing it from the parent process with the relevant info, it would make for a much better experience all-around.
Being able to send actor references over IPC is definitely something I'm still struggling with out to make non-crashy. :jld and I have been talking about requiring PFoo* actors to be sent within some sort of FallibleActor<PFoo> wrapper type (which would be implicitly-converted to from PFoo*), requiring you to unwrap past an error type. The error would be present if the actor doesn't exist in the parent process anymore, making this sort of thing not a crash unless it needs to be. Errors like the content process straight-up lying about stuff (like the type of the actor) would probably still be deserialization failures.

This sort of fallible serialization might also be useful for deserializing dead BrowsingContext objects.
Async actor ctors used to crash in that sort of situation (sometimes), but it was unreliable whether they would crash, and they could crash unpredictably (e.g. it was possible to get a racy crash during shutdown). We generally tried to move away from crashing because it produced unexpected results and was often responsible for really frustrating user crashes. Unfortunately it doesn't interact well with constructing managed actors of already-destroyed actors. Perhaps we need to make that fail more aggressively again?
We should probably crash as we send a destroyed IPDL actor - wouldn't be too hard to do. Perhaps that's what we can do in this bug.

Flags: needinfo?(nika)

Nika Layzell [:nika] (ni? for response)

Assignee

Comment 2

•

5 years ago

ni? :mattwoodrow in case he has any opinions here.

Flags: needinfo?(matt.woodrow)

Nika Layzell [:nika] (ni? for response)

Assignee

Comment 3

•

5 years ago

Attached file Bug 1565405 - Fail async ctor call if `!Manager()->CanSend()`, (deleted) — Details

The previous behaviour of failing unconditionally was performed as, during
shutdown, the channel could become unable to send without worker threads having
a chance to react. This change keeps that behaviour, isolating async message
senders from impending IPC shutdown, while performing expected actor teardown if
the manager actor has already been destroyed, and should no longer send messages.

An alternate behaviour here could be to crash if !Manager()->CanSend(). That
behaviour may be preferable if a sufficient number of callsites don't check the
return value of the SendPFooConstructor() method.

Pulsebot

Comment 4

•

5 years ago

Pushed by nlayzell@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/cae8c48da596 Fail async ctor call if `!Manager()->CanSend()`, r=froydnj

Neha Kochar [:neha]

Updated

•

5 years ago

Assignee: nobody → nika

Status: NEW → ASSIGNED

Priority: -- → P2

Neha Kochar [:neha]

Updated

•

5 years ago

Type: enhancement → defect

Matt Woodrow (:mattwoodrow)

Reporter

Comment 5

•

5 years ago

Solutions 1 and 2 both sound great, and also a lot of work.

I don't think the current patch would have helped for bug 1562616 since in that case the constructor case failed on the receiving side. We end up in a weird mismatched state where the sending side thinks it has a fully constructed actor, and the receiving side thinks that it doesn't. Next time we try to send across a reference to this zombie-actor, the receiving side crashes as it doesn't exist.

Flags: needinfo?(matt.woodrow)

Cristina Coroiu [:ccoroiu]

Comment 6

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/cae8c48da596

Status: ASSIGNED → RESOLVED

Closed: 5 years ago

status-firefox70: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla70

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Async actor construction failures don't notify anywhere

Categories

(Core :: IPC, defect, P2)

Tracking

()

People

(Reporter: mattwoodrow, Assigned: nika)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Updated

Updated

Comment 1

Comment 2

Comment 3

Comment 4

Updated

Updated

Comment 5

Comment 6

Attachment

General

Description

File Name

Content Type