Closed Bug 790527 Opened 12 years ago Closed 12 years ago

When rebooting/powering off, go through normal gecko shutdown

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla18

Project Flags:

blocking-basecamp

People

(Reporter: cjones, Assigned: airpingu)

References

Details

(Whiteboard: [LOE:S])

Attachments

(1 file, 6 obsolete files)

WIP Patch 12 years ago Gene Lian (I already quit Mozilla) (deleted), patch	espindola : feedback-	Details \| Diff \| Splinter Review
Patch 12 years ago Gene Lian (I already quit Mozilla) (deleted), patch		Details \| Diff \| Splinter Review
Patch 1.1 12 years ago Gene Lian (I already quit Mozilla) (deleted), patch		Details \| Diff \| Splinter Review
Patch, V2 12 years ago Gene Lian (I already quit Mozilla) (deleted), patch	cjones : superreview+	Details \| Diff \| Splinter Review
Patch, V3 12 years ago Gene Lian (I already quit Mozilla) (deleted), patch	airpingu : superreview+	Details \| Diff \| Splinter Review
Patch, V3.1 12 years ago Gene Lian (I already quit Mozilla) (deleted), patch	benjamin : review+ airpingu : superreview+	Details \| Diff \| Splinter Review
Patch, V3.2 12 years ago Gene Lian (I already quit Mozilla) (deleted), patch	airpingu : review+ airpingu : superreview+	Details \| Diff \| Splinter Review

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Description

•

12 years ago

Currently, the reboot()/poweroff() APIs just directly call the hal interfaces. These interfaces in turn call the bare linux interface. That's really bad, because we don't even sync the filesystems before shutting down. Instead, we should extend the nsIAppStartup:quit interface to take two more shutdown types: REBOOT and POWEROFF. After going through the quit(), we will finish the process by calling the right hal API. We should also verify as part of this work that the hal:: implementation is doing the right thing with filesystem shutdown. Gene, want to take this? blocking+ because of possibility for data loss.

Gene Lian (I already quit Mozilla)

Assignee

Updated

•

12 years ago

Assignee: nobody → clian

Gene Lian (I already quit Mozilla)

Assignee

Updated

•

12 years ago

Blocks: 787436, b2g-gecko-updates

Gene Lian (I already quit Mozilla)

Assignee

Updated

•

12 years ago

Whiteboard: [LOE:M]

Gene Lian (I already quit Mozilla)

Assignee

Updated

•

12 years ago

Whiteboard: [LOE:M] → [LOE:S]

Gene Lian (I already quit Mozilla)

Assignee

Comment 1

•

12 years ago

Gene Lian (I already quit Mozilla)

Assignee

Comment 2

•

12 years ago

Attached patch WIP Patch (obsolete) (deleted) — Details — Splinter Review

Hi :bsmedberg, I'd like to invite you to have a quick review on this patch, where it seems you're the rightest person to review this nsIAppStartup.quit() logic. :) The main purpose is to use .quit(ePowerOff) or .quit(eReboot) to normally close all the windows(apps) before powering off or rebooting the B2G mobile. Note that this is still a WIP but everything works well. Just hoping to have more tests and confirm with you the correctness of the .quit() logic. Please let me know if you have any suggestions or questions. Thanks!

Attachment #661176 - Flags: feedback?(benjamin)

Benjamin Smedberg

Comment 3

•

12 years ago

Comment on attachment 661176 [details] [diff] [review] WIP Patch I'm going to delegate this to respindola: he's working on bug 662444 (shutdown by _exit instead of doing full shutdown) and I believe this should be designed with that in mind: in particular we really don't need or want to do a full shutdown here, just enough to sync the user profile and perhaps make sure that the filesystem is safe; definitely don't want to do XPCOM shutdown.

Attachment #661176 - Flags: feedback?(benjamin) → feedback?(respindola)

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Comment 4

•

12 years ago

My feedback is that once bug 662444 is fixed, this is probably what we will want to do. All exits in a release build will by default do the lest possible work. The problem is that fully fixing 662444 will take a long time. We will be moving back _exit one write at a time, and some are going to be hard to fix (might even be produced by extensions). If this is a r+ or r- probably depends on the timeline you have in mind. If you can live with a simpler shutdown at this point, I would suggest holding this patch until regular shutdown is sufficiently fast.

Benjamin Smedberg

Comment 5

•

12 years ago

This is "B2G now" timeline. As long as we can make this not affect Firefox, we should try to get B2G doing the fastest possible shutdown using this and fast _exit.

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Comment 6

•

12 years ago

(In reply to Benjamin Smedberg [:bsmedberg] from comment #5) > This is "B2G now" timeline. As long as we can make this not affect Firefox, > we should try to get B2G doing the fastest possible shutdown using this and > fast _exit. OK, r- from me then. Finding out what is needed for just a b2g shutdown is probably easier. We can always refactor the code once desktop shutdown is fast.

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Updated

•

12 years ago

Attachment #661176 - Flags: feedback?(respindola) → feedback-

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 7

•

12 years ago

This is 100% orthogonal to the fast-_exit() work.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 8

•

12 years ago

(In reply to Rafael Ávila de Espíndola (:espindola) from comment #6) > (In reply to Benjamin Smedberg [:bsmedberg] from comment #5) > > This is "B2G now" timeline. As long as we can make this not affect Firefox, > > we should try to get B2G doing the fastest possible shutdown using this and > > fast _exit. > > OK, r- from me then. Finding out what is needed for just a b2g shutdown is > probably easier. We can always refactor the code once desktop shutdown is > fast. This is not a very useful f- because it's unclear what information you're requesting. Can you please be more specific?

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Comment 9

•

12 years ago

Sure. Correct me if I am wrong, but I am getting the impression that * appStartup->Quit is being called to produce a regular shutdown (including XPCOM shutdown). * bsmedberg thinks that we should be doing the fastest shutdown we can. The only thing I think I can add to the discussion is that the work on _exit(0) is being slow, so it will take a while for a regular shutdown to avoid all unnecessary work and so appStartup->Quit is probably not what you want.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 10

•

12 years ago

The purpose of this bug is 100% *not* to optimize shutdown time. It 100% doesn't matter here. Please don't add to this work a dependency on the _exit(0) optimization. If the app-exit API will change in support of that orthogonal work (which it really shouldn't, IMHO), then that new API will need to support the two new use cases here (reboot/OS-shutdown).

Benjamin Smedberg

Comment 11

•

12 years ago

What is the purpose of this bug? nsIAppStartup.quit doesn't seem to be necessary if all you're doing it making sure the filesystem syncs, and it's not sufficient if you actually need to shut down gecko; it shuts down the event loop but without additional work it doesn't shut down the profile (for example) or make sure that gecko has stopped writing data to disk. It seems that nsIAppStartup.quit is orthagonal to whatever goal the bug was actually filed for.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 12

•

12 years ago

Let's call the work that needs to be done to shut down gecko "X". Currently, shutdown looks like 1. call nsIAppStartup.quit(how) 2. do X 3. implement "how" from step (1): restart or exit The goal of this bug is to add two new options for (1) and (3): "reboot" and "poweroff". That work has nothing to do with X, for example making it faster. Other code uses appStartup.quit() to quit gecko's process. If that's not the right way, what is?

Benjamin Smedberg

Comment 13

•

12 years ago

The normal gecko shutdown sequence is: * exit the event loop. End up here: http://hg.mozilla.org/mozilla-central/annotate/9d8b3ee45146/toolkit/xre/nsAppRunner.cpp#l3911 * fire all the shutdown notifications in order, https://wiki.mozilla.org/XPCOM_Shutdown ** This technically happens by first calling nsXREDirProvider::DoShutdown and then NS_ShutdownXPCOM from here: http://hg.mozilla.org/mozilla-central/annotate/9d8b3ee45146/toolkit/xre/nsAppRunner.cpp#l1109 Then there's some special-case code for restarting (your #3) We already shortcut this process when windows is shutting down: http://hg.mozilla.org/mozilla-central/diff/b5c4b792f3f2/widget/windows/nsWindow.cpp I think B2G should do the same thing: fire these notifications in order and then call into the HAL, bypassing the rest of XPCOM shutdown.

Gene Lian (I already quit Mozilla)

Assignee

Comment 14

•

12 years ago

Hi guys! Thanks for your suggestions. I'll try to follow up with all these concerns (sorry I've been taking 2 days off recently).

Gene Lian (I already quit Mozilla)

Assignee

Comment 15

•

12 years ago

Attached patch Patch (obsolete) (deleted) — Details — Splinter Review

Hi :bsmedberg, Things are simple now. ;) We don't attempt to use the appStartup.quit(). Instead, in order to to synchronize any unsaved data, we fire the following notifications: NS_NAMED_LITERAL_STRING(context, "shutdown-persist"); obsServ->NotifyObservers(nullptr, "profile-change-net-teardown", context.get()); obsServ->NotifyObservers(nullptr, "profile-change-teardown", context.get()); obsServ->NotifyObservers(nullptr, "profile-before-change", context.get()); before calling into hal::PowerOff()/Reboot(). Could you please take a review on this patch? Changes should be trivial.

Attachment #661176 - Attachment is obsolete: true

Attachment #663366 - Flags: review?(benjamin)

Benjamin Smedberg

Comment 16

•

12 years ago

Comment on attachment 663366 [details] [diff] [review] Patch do hal::Reboot and hal::PowerOff work synchronously or asynchronously? Are they unimplemented on desktop B2G builds, for example? I suspect that we should either make sure that they cannot return or that we put an _exit immediately after calling them. We really don't want this process to continue running.

Benjamin Smedberg

Comment 17

•

12 years ago

Comment on attachment 663366 [details] [diff] [review] Patch I'm going to cancel this request for now, please re-request review if the exit behavior is correct.

Attachment #663366 - Flags: review?(benjamin)

Gene Lian (I already quit Mozilla)

Assignee

Comment 18

•

12 years ago

Attached patch Patch 1.1 (obsolete) (deleted) — Details — Splinter Review

Hi :bsmedberg, Thanks for the review. I'm hoping to make both hal::Reboot() and hal::PowerOff() are synchronous so that we don't need to explicitly call the _exit() to leave the process. Hi :cjones, Changes are trivial now if we don't attempt to call appStartup.quit(). What we only need to do is to fire some "profile-change" notifications to save any unsaved data. This will also trigger http://mxr.mozilla.org/mozilla-central/source/toolkit/components/startup/nsAppStartup.cpp#804 to safely close all the windows, launch the QuitWatchdog, and then exit the appshell. However, some tiny changes in Hal are made to ensure hal::Reboot() and hal::PowerOff() are synchronous to address comment #16. To do so, the prefix "sync" needs to be added in PHal.ipdl. Right? Please correct if I'm wrong.

Attachment #663366 - Attachment is obsolete: true

Attachment #664028 - Flags: superreview?(jones.chris.g)

Attachment #664028 - Flags: review?(benjamin)

Benjamin Smedberg

Comment 19

•

12 years ago

It doesn't look like the current HAL implementations are synchronous. In particular, the B2G-desktop builds use http://mxr.mozilla.org/mozilla-central/source/hal/fallback/FallbackPower.cpp#11 which is a no-op. So either all the HALs need to be noreturn, or the process needs to _exit.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 20

•

12 years ago

Comment on attachment 664028 [details] [diff] [review] Patch 1.1 This is only going to work when reboot/shutdown are initiated from the b2g master process. Please assert that these APIs are only used when XRE_GetProcessType() == GeckoProcessType_Default. I don't think we should change the hal:: API here; in fact, given the constraint above, it's best to remove IPC support for hal::Shutdown()/Reboot(). Let's go ahead and do that. This doesn't add a watchdog for the profile sync, but we can do that in a followup bug. Please file.

Attachment #664028 - Flags: superreview?(jones.chris.g)

Gene Lian (I already quit Mozilla)

Assignee

Updated

•

12 years ago

Depends on: 793970

Gene Lian (I already quit Mozilla)

Assignee

Comment 21

•

12 years ago

Attached patch Patch, V2 (obsolete) (deleted) — Details — Splinter Review

Hi :cjones and :benjamin, The new patch is summarized as follows: 1. Strip out the IPC mechanism for hal::Reboot and hal::PowerOff, which means the APIs can only be called from the main process. 2. For the Reboot/PowerOff in FallbackPower.cpp, we use kill(0, SIGKILL) and _exit(1) to leave the process (this is inspired by http://mxr.mozilla.org/mozilla-central/source/toolkit/components/startup/nsAppStartup.cpp#377). 3. Tentatively, add some MOZ_NOT_REACHED() checks to the points which are not visited for sure.

Attachment #664028 - Attachment is obsolete: true

Attachment #664028 - Flags: review?(benjamin)

Attachment #664379 - Flags: superreview?(jones.chris.g)

Attachment #664379 - Flags: review?(benjamin)

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 22

•

12 years ago

Comment on attachment 664379 [details] [diff] [review] Patch, V2 >diff --git a/dom/power/PowerManagerService.cpp b/dom/power/PowerManagerService.cpp >+void >+PowerManagerService::SyncProfile() >+{ >+ // TODO We need to start a watchdog thread to force gecko to really >+ // power off or reboot if the profile synchronizing hangs (bug 793970). Nit: people generally write "FIXME/bug 793970" for these kinds of comments. >+ nsCOMPtr<nsIObserverService> obsServ = >+ mozilla::services::GetObserverService(); |services::GetObserverService()|. You don't need the "mozilla::". >diff --git a/hal/fallback/FallbackPower.cpp b/hal/fallback/FallbackPower.cpp > void > Reboot() >-{} >+{ >+ kill(0, SIGKILL); >+ _exit(1); >+ MOZ_NOT_REACHED(); This isn't going to compile on all platforms. Let's hold off on adding this until we refactor the watchdog code. We can have the reboot/poweroff fallbacks continue to be no-ops. The MOZ_NOT_REACHED() calls that are in PowerManager after the hal:: calls, are fine for now. >diff --git a/hal/linux/LinuxPower.cpp b/hal/linux/LinuxPower.cpp We can remove the changes here here. sr=me with that.

Attachment #664379 - Flags: superreview?(jones.chris.g) → superreview+

:Benjamin Peterson

Updated

•

12 years ago

Attachment #664379 - Flags: review?(benjamin) → review?(benjamin)

Gene Lian (I already quit Mozilla)

Assignee

Comment 23

•

12 years ago

(In reply to Chris Jones [:cjones] [:warhammer] from comment #22) > Comment on attachment 664379 [details] [diff] [review] > Patch, V2 > The MOZ_NOT_REACHED() calls that are in PowerManager after the hal:: > calls, are fine for now. Not quite sure about this part. My understanding is to only add MOZ_NOT_REACHED() checks in hal::Reboot()/hal::PowerOff() and remove the ones in PowerManagerService::Reboot()/PowerManagerService::PowerOff() since the hal:: has already done the checks.

Gene Lian (I already quit Mozilla)

Assignee

Comment 24

•

12 years ago

Attached patch Patch, V3 (obsolete) (deleted) — Details — Splinter Review

sr=cjones Hi Benjamin, Based on :cjones' comment #22, we're going to design a watchdog with a timer to compulsively power-off/reboot the gecko if it hangs (Bug 793970), so let's have the reboot/poweroff fallbacks continue to be no-ops for now. If you don't mind, may I have your review+ to check this in? Thanks!

Attachment #664379 - Attachment is obsolete: true

Attachment #664379 - Flags: review?(benjamin)

Attachment #664856 - Flags: superreview+

Attachment #664856 - Flags: review?(benjamin)

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 25

•

12 years ago

(In reply to Gene Lian [:gene] from comment #23) > (In reply to Chris Jones [:cjones] [:warhammer] from comment #22) > > Comment on attachment 664379 [details] [diff] [review] > > Patch, V2 > > The MOZ_NOT_REACHED() calls that are in PowerManager after the hal:: > > calls, are fine for now. > > Not quite sure about this part. My understanding is to only add > MOZ_NOT_REACHED() checks in hal::Reboot()/hal::PowerOff() and remove the > ones in PowerManagerService::Reboot()/PowerManagerService::PowerOff() since > the hal:: has already done the checks. The hal:: API is not defined to "noreturn" when calling Reboot()/PowerOff(). We could define it to do that, but I think there might be some value in returning to indicate failure. In the case of this patch, if we only QuitHard() in PowerManager when the hal:: functions *do* return, then we'll change less code. So I prefer that.

Benjamin Smedberg

Comment 26

•

12 years ago

I don't understand comment 24, can you catch me on IRC and help me understand what the watchdog has to do with the _exit call?

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 27

•

12 years ago

What I would like us to do here is the following, taking Reboot() as an arbitrary example PowerManager::Reboot() { // sync profile hal::Reboot(); MOZ_NOT_REACHED(); } on shipping platforms where PowerManager::Reboot() can be called, hal::Reboot() should never return. If it does, there's an extremely serious bug. I would prefer to use MOZ_NOT_REACHED() instead of _exit() so that we have the chance of getting a breakpad report if this case happens.

Gene Lian (I already quit Mozilla)

Assignee

Comment 28

•

12 years ago

Attached patch Patch, V3.1 (obsolete) (deleted) — Details — Splinter Review

Hi Benjamin, I think it's my bad to take the watchdog thing into considerations. Let's focus on the PowerManagerService::Reboot()/PowerOff behaviors within this issue. Please see comment #27. We're hoping to use MOZ_NOT_REACHED() instead of _exit() to specify the unexpected hal returns. sr=cjones

Attachment #664856 - Attachment is obsolete: true

Attachment #664856 - Flags: review?(benjamin)

Attachment #665248 - Flags: superreview+

Attachment #665248 - Flags: review?(benjamin)

Benjamin Smedberg

Comment 29

•

12 years ago

Comment on attachment 665248 [details] [diff] [review] Patch, V3.1 I still don't understand why this won't crash B2G desktop builds, where hal::Reboot is a no-op. Can you just add _exit to those methods instead of making them a no-op?

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 30

•

12 years ago

We don't call this reboot() method on desktop. If we attempt to call it from PowerManager, then it's a logic error and we want to see the assertion failure.

Benjamin Smedberg

Updated

•

12 years ago

Attachment #665248 - Flags: review?(benjamin) → review+

Gene Lian (I already quit Mozilla)

Assignee

Comment 31

•

12 years ago

Attached patch Patch, V3.2 (deleted) — Details — Splinter Review

Rebase. Thanks Cjones and Benjamin for the reviews! r=bsmedberg,sr=cjones

Attachment #665248 - Attachment is obsolete: true

Attachment #665804 - Flags: superreview+

Attachment #665804 - Flags: review+

Gene Lian (I already quit Mozilla)

Assignee

Comment 32

•

12 years ago

Try server: https://tbpl.mozilla.org/?tree=Try&rev=80d14902e98f

Gene Lian (I already quit Mozilla)

Assignee

Updated

•

12 years ago

Keywords: checkin-needed

:Benjamin Peterson

Comment 33

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/d31d2479d685

:Benjamin Peterson

Updated

•

12 years ago

Keywords: checkin-needed

Phil Ringnalda (:philor)

Comment 34

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/d31d2479d685

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla18

You need to log in before you can comment on or make changes to this bug.