Closed Bug 181230 Opened 22 years ago Closed 22 years ago

profile-change-net-teardown must work

Tracking

()

Status:

VERIFIED FIXED

Milestone:

mozilla1.3beta

People

(Reporter: KaiE, Assigned: darin.moz)

References

Details

(Keywords: topembed+)

Attachments

(1 file, 4 obsolete files)

test patch 22 years ago Darin Fisher (deleted), patch		Details \| Diff \| Splinter Review
AIM conversation with kai 22 years ago Darin Fisher (deleted), text/plain		Details
v1 patch 22 years ago Darin Fisher (deleted), patch		Details \| Diff \| Splinter Review
v1.1 patch 22 years ago Darin Fisher (deleted), patch		Details \| Diff \| Splinter Review
v1.2 patch 22 years ago Darin Fisher (deleted), patch	dougt : review+ darin.moz : superreview+ asa : approval1.3a+	Details \| Diff \| Splinter Review

Kai Engert (:KaiE:)

Reporter

Description

•

22 years ago

Working on bugs 177260 and 97622 I ran into an additional problem.
I thought I had a good workaround, but the workaround requires direct
manipulation of the lower NSPR data structures from outside the socket transport
thread, and that is possibly dangerous.

If, at the time we need to shut down PSM/NSS in order to switch profiles, there
are still sockets alive, we have a real problem.

Because the SSL library is layering itself on the file descriptor stack, the
resources that SSL is holding into NSS can not be freed until the socket is closed.


Actual behaviour:
After the profile-change-net-teardown phase is over, there are still open sockets.

Requested behaviour:
During the profile-change-net-teardown phase, all open sockets must be closed
synchronously.

Darin Fisher

Assignee

Updated

•

22 years ago

Severity: normal → major

Status: NEW → ASSIGNED

Keywords: topembed

Priority: -- → P2

Target Milestone: --- → mozilla1.3alpha

Darin Fisher

Assignee

Comment 1

•

22 years ago

Attached patch test patch (obsolete) (deleted) — Details — Splinter Review

this patch ensures that any active connections will not be added as idle
connections after the net-teardown event fires.

Darin Fisher

Assignee

Comment 2

•

22 years ago

Attached file AIM conversation with kai (deleted) — Details

kai says the "test patch" doesn't help.

Kai Engert (:KaiE:)

Reporter

Comment 3

•

22 years ago

Darin, in your test patch you try to modificate the http handler. I would have
expected that the problem lies somewhere else. I see that mail is also not
behaving correctly and stops us from switching profiles, because SSL sockets
have not been closed.

Are you able to tear down all sockets the hard way in the socket transport code,
regardless where they come from?

Darin Fisher

Assignee

Comment 4

•

22 years ago

kai: my assumption was that socket transport shutdown was happening, but that
some connections were becoming IDLE (on a background thread) at the same time
that we were processing the net-teardown event.  i think my patch closes this
hole, but obviously there is something else going on as well.  i'm hoping to
spend time on this today.

Darin Fisher

Assignee

Comment 5

•

22 years ago

Attached patch v1 patch (obsolete) (deleted) — Details — Splinter Review

this patch fixes the "going offline while security dialog is visible" problem.

Attachment #106996 - Attachment is obsolete: true

Darin Fisher

Assignee

Comment 6

•

22 years ago

kai: can you please give this latest patch a try and see if it fixes all of your
testcases?  thx!

Kai Engert (:KaiE:)

Reporter

Comment 7

•

22 years ago

Sorry, this patch does not change the behaviour, I still see the same problems.

I would have assumed there is only one approach than can help us:
When we receive the call to SetOffline, go to the socket transport code,
synchronously iterate over the list of all sockets known by it, call close on
each, and do not return from the notification until that is done.

But that is probably dangerous, because other code has still references to the
socket.

So I suggest, instead of closing the files, which invalidates the file
descriptors and makes other components crash: Use the hack that I proposed in
bug 177260, but which you found to be unsafe when done from outside the socket
thread.

So the new approach might be:

- SetOffline received
- iterate over all sockets
- for each socket, do the following hack:
  - use PR_NewTCPSocket to create a new unconnected socket
  - swap all the contents of the socket filedescriptor structure
    between the old and the hack socket
  - close the fd that is now no longer referenced by others

Darin Fisher

Assignee

Comment 8

•

22 years ago

> I would have assumed there is only one approach than can help us:
> When we receive the call to SetOffline, go to the socket transport code,
> synchronously iterate over the list of all sockets known by it, call close on
> each, and do not return from the notification until that is done.

that is what the code tries to do.  it cancels all of the socket transport
requests, which must be processed on the socket transport thread.  the thread
which shutdown the socket transport service (the main thread) must wait until
the socket transport thread exits (it does a join).  this makes the call to
shutdown the socket transport synchronous.

kai: what i need is a testcase.  i used the testcase of going offline while the
security dialog is showing to arrive at this patch.  is that what you were
testing?  before my patch i noticed PR_Close being called after the security
dialog is closed, but with my patch PR_Close is called right away.  please
describe the steps you used to repro the problem.  thx!

Kai Engert (:KaiE:)

Reporter

Comment 9

•

22 years ago

The testcase is:

- add printf statements as the first line to nsNSSIOLayerClose and
nsSSLIOLayerConnect in file nsNSSIOLayer.cpp

- compile in optimized mode (I tested on Linux)

- open two browser windows

- in one window, open https://meine.db24.de - wait for the security window and
leave it open

- watch the console, you'll see the output from the open function

- in the other window, click the statusbar offline icon


Expected behaviour: The SSL close function should have been reached as soon as
you clicked the offline icon. As a result, the console log should show the
printf output from the close function.

Actual behavior: No output on the console.



Another testcase:
- Configure mail to use IMAP / SSL.
- Configure edit/privacy/certificates to "ask every time"
- Open a mail and a browser window
- in one window, connect to the server
- should it prompt to select a certificate, cancel
- you should see the prompt to enter your mail password
- with the password prompt dialog open, switch to the browser window
- click the offline icon.

Expected: close is reached immediately.
Actual: Nothing happens.

Kai Engert (:KaiE:)

Reporter

Comment 10

•

22 years ago

> i used the testcase of going offline while the
> security dialog is showing to arrive at this patch.  is that what you were
> testing?  before my patch i noticed PR_Close being called after the security
> dialog is closed, but with my patch PR_Close is called right away

Hmm, I can not confirm that, on my Linux system, close is not reached right
away. It is not reached until I confirm the security warning dialog.

Darin Fisher

Assignee

Comment 11

•

22 years ago

Attached patch v1.1 patch (obsolete) (deleted) — Details — Splinter Review

OK, this patch is similar to the other patch in that it only attempts to fix
the problem for HTTPS.

Attachment #107248 - Attachment is obsolete: true

Darin Fisher

Assignee

Comment 12

•

22 years ago

Attached patch v1.2 patch (obsolete) (deleted) — Details — Splinter Review

this one should really do the trick for HTTPS.

Attachment #107379 - Attachment is obsolete: true

Kai Engert (:KaiE:)

Reporter

Comment 13

•

22 years ago

This latest patch does indeed work for the https case. (Although it does not yet
work for mail/ssl, as discussed).

Kai Engert (:KaiE:)

Reporter

Updated

•

22 years ago

Blocks: 182803

Darin Fisher

Assignee

Updated

•

22 years ago

Attachment #107386 - Flags: superreview?(rpotts)

Attachment #107386 - Flags: review?(dougt)

Michael Buckland

Comment 14

•

22 years ago

Discussed in edt.  Plussing because we need this for profile sharing.

Keywords: topembed → topembed+

Kai Engert (:KaiE:)

Reporter

Comment 15

•

22 years ago

Darin, is your intention to implement the general solution (which would include
tearing down mail ssl sockets) within the scope of this bug?

Darin Fisher

Assignee

Comment 16

•

22 years ago

Kai:

yes, it is.  but i want to get this HTTP patch in because i think it is anyways
a good idea, and it might even fix a topcrash.

Kai Engert (:KaiE:)

Reporter

Comment 17

•

22 years ago

Ok, thanks!

rpotts (gone)

Updated

•

22 years ago

Attachment #107386 - Flags: superreview?(rpotts) → superreview+

Doug Turner (:dougt)

Updated

•

22 years ago

Attachment #107386 - Flags: superreview?(rpotts)

Attachment #107386 - Flags: superreview+

Attachment #107386 - Flags: review?(dougt)

Attachment #107386 - Flags: review+

Darin Fisher

Assignee

Comment 18

•

22 years ago

Comment on attachment 107386 [details] [diff] [review]
v1.2 patch

seeking drivers approval to check this into the trunk for moz 1.3 alpha.  this
patch not only helps fix this bug for HTTPS connections, but it might also
solve topcrash bug 179391.  i think this patch has moderate risk.

Attachment #107386 - Flags: superreview?(rpotts)

Attachment #107386 - Flags: superreview+

Attachment #107386 - Flags: approval1.3a?

Asa Dotzler [:asa]

Comment 19

•

22 years ago

Comment on attachment 107386 [details] [diff] [review]
v1.2 patch

a=asa for checkin to 1.3a

Attachment #107386 - Flags: approval1.3a? → approval1.3a+

Darin Fisher

Assignee

Comment 20

•

22 years ago

landed patch on trunk.  next, i'll look into how to fix this in general. 
hopefully this patch will help with the topcrash shutdown bug.

Kai Engert (:KaiE:)

Reporter

Comment 21

•

22 years ago

Nominating to block 1.3b, because not having this fix results in crash bug 182803.

Flags: blocking1.3b?

Kai Engert (:KaiE:)

Reporter

Updated

•

22 years ago

Blocks: 187501

Kai Engert (:KaiE:)

Reporter

Comment 22

•

22 years ago

Comment on attachment 107386 [details] [diff] [review]
v1.2 patch

In the past, when people landed patches on the trunk, but decided to keep the
bug open, I saw they marked the landed patches as obsolete to minimize
potential confusion. I think that's a good idea. Marking obsolete.

Kai Engert (:KaiE:)

Reporter

Updated

•

22 years ago

Attachment #107386 - Attachment is obsolete: true

Darin Fisher

Assignee

Comment 23

•

22 years ago

should be fixed by my patch for bug 176919.

Depends on: 176919

Target Milestone: mozilla1.3alpha → mozilla1.3beta

Kai Engert (:KaiE:)

Reporter

Updated

•

22 years ago

Blocks: 188558

Darin Fisher

Assignee

Comment 24

•

22 years ago

FIXED with patch from bug 176919.

Status: ASSIGNED → RESOLVED

Closed: 22 years ago

Resolution: --- → FIXED

Asa Dotzler [:asa]

Updated

•

22 years ago

Flags: blocking1.3b?

Ere Maijala (slow)

Comment 25

•

22 years ago

*** Bug 188558 has been marked as a duplicate of this bug. ***

benc

Comment 26

•

16 years ago

V/fixed.

This has worked for some time. I haven't tested it with a set of unit cases, since profile changing is available only in seamonkey UI. 

I'd like to do this in the future as an high-level integration test, maybe as an extension of the networking-core functional test I wrote (which pokes a lot of network protocols at once). Bother me if you care.

Status: RESOLVED → VERIFIED

You need to log in before you can comment on or make changes to this bug.