Closed Bug 75947 Opened 24 years ago Closed 23 years ago

Security DLLs are now loaded at startup, hurting startup time

Categories

(Core Graveyard :: Security: UI, defect, P4)

1.0 Branch
defect

Tracking

(Not tracked)

VERIFIED FIXED
psm2.2

People

(Reporter: sfraser_bugs, Assigned: KaiE)

References

Details

(Whiteboard: nsenterprise)

Attachments

(5 files, 6 obsolete files)

(deleted), text/html
Details
(deleted), patch
KaiE
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
javi
: review+
Details | Diff | Splinter Review
(deleted), patch
javi
: review+
Details | Diff | Splinter Review
With the PSM 2 landing we started loading the security DLLs PIPNSS and NSSckbi at startup. They get pulled in by nsGlobalWindow's ctor, which does a GetService of the nsIEntropyCollector. Loading these two DLLs is going to hurt startup time somewhat.
Security:Crypto is the component for PSM issues.
Assignee: mstoltz → ddrinan
Component: Security: General → Security: Crypto
QA Contact: ckritzer → junruh
On my 19.8 second win32 startup (after a reboot) PSM2 accounts for 0.7 seconds, about 3.5% slowdown.
NSS is also eargerly loading a string bundle on startup.
Moving to Component "PSM".
Component: Security: Crypto → Client Library
Product: Browser → PSM
Target Milestone: --- → 2.0
Version: other → 2.0
I also set the target to PSM 2.0 in the hopes that we can make the mozilla 0.9.1 release.
Blocks: 71781
I believe that the proposal on the table is to break entropy collecting into its own DLL which will get loaded at launch time. Then the rest of PSM will get loaded when called for the first time. I'll let ddrinan comment further.
Before PSM 2.0, the entropy collector was loaded as well (meaning psm-glue was loaded). So if we can make it such that only pipnss.dll is loaded, then we should be about the same as PSM 1 days. What's hurting start-up is the initialization of NSS at the same time the entropy collector is loaded. IMO, if we can delay the initialization of NSS until SSL or a cert action is required the start up time would be on par with pre-PSM 2 days. Breaking the entropy collector into its own dll won't do much to solve the problem because you're still loading another module. But making the entropy collector not initialize NSS right away will.
I'd also be happy with just delaying kicking all this off by about 60 seconds with a timer (or something like the category service.) It might be the easier path and I don't think 60 seconds less entropy is going to hurt. No recent comments in this bug, is it going to be fixed this week?
I'm finishing off PSM feature work. I hope to get to this by friday (5/11).
Did it happen on Friday? :-)
Adding thayes to cc-list. After discussing the various proposals with other members of the NSS team, we've come to the conclusion that in order to reduce start up time while *not* reducing the amount of entropy collected, the following criteria must be met: - start entropy collection when browser loads and buffer the entropy. - delay initialization of NSS. - when NSS has be initialized (SSL, SDR, keygen etc), feed it the stored entropy. - contine to feed NSS entropy as it is collected. Since the amount of engineering work is much greater than anticipated, we should talk tomorrow and figure out what the best course of action is. While investigating this bug, we are of the opinion that NSS is not being fed enough entropy by the browser and that other sources of entropy should be added (e.g. mouse moves). I'll open a bug on this. Also, NSS should be initialized once and only once during the lifetime of the browser (with the one exception of profile switching) because each time NSS is initialized, it loses all of its previous entropy and other state information. I'll open a bug on this also.
We've had a number of meetings on this matter. The basic problem is that we need to start collecting entropy as soon as the client launches, and the proposals for delaying or changing the startup sequence would further reduce the amount of entropy we have in the system. As you know, we need to feed a good deal of entropy into the PRNG system in order to make sure that we have enough randomness for operations like SSL, keygen, and the encrypted wallet feature. Without a good source of randomness, a long symetric key of 128-bits might only be worth a few bits. (That would be *REALLY* bad, btw) Some of you may recall a prior incident where the browser did not have enough entropy for SSL connections. Those of you who don't remember the matter can read about it in Steven Levy's new book "Crypto". :-( As a result of these meetings, I don't think we can both collect and use the right amount of entropy *and* change the startup sequence. It's not impossible, but it will require some non-trivial amount of work that we haven't scheduled. Target -> PSM 2.1
Target Milestone: 2.0 → 2.1
Blocks: 77757
Keywords: nsenterprise
remove nsenterprise
Keywords: nsenterprise
Whiteboard: nsenterprise
Changing platform
Hardware: All → Macintosh
OS: Mac System 8.5 → All
Hardware: Macintosh → All
P2
Priority: -- → P2
P4
Priority: P2 → P4
*** Bug 91235 has been marked as a duplicate of this bug. ***
Moving all P3 and P4 bugs targetted to 2.1 to future.
Target Milestone: 2.1 → Future
start time is one of the key performance problems that still need to be addressed. please find a way to bump the priority back up on this one and move the milestone back in or risk the rath of my 5-iron...
Mass assigning QA to ckritzer.
QA Contact: junruh → ckritzer
Depends on: 96058
The fix for bug 76915 (which loads security components from nsAppRunner.cpp) makes this bug a good bit harder to fix.
Now that bug 76915 is fixed, is this bug still relevant?
QA Contact: ckritzer → junruh
Version: 2.0 → 2.1
No, the fix for bug 76915 makes this a lot worse and should probably be removed and redone in a better way (and one that doesn't prevent the entire app from working if PSM is messed up).
Changing my e-mail address.
Removing old e-mail address.
our profiling data showing nss_init() is showing 5% cost of time on startup. we need this bug fixed ASAP.
Target Milestone: Future → 2.2
come talk to brendan and chofmann about this :-)
Can we see the actual profiling data. The bug says the problem is in the size of the library itself, cathleen indicates that it's NSS_Initialize() itself. If we have good accurate profiling info, we can get a better idea on what the problem is and design ways around it. Thanks, bob
One of the things you do right off the bat is read the cert7.db, which if you've got a migrated profile with gobs of 4.x SMIME certs can be quite large (mine is nearly 3Mb). Do you really need to read all these right away just to start collecting entropy?
Both the loading of the dlls and the initialization are problems, and both could be delayed. nss_Init alone was 5.3% of the time (excluding polling) in a jprof profile (and loading the NSS string bundle was a bit more), which is timer based but may miss some of the time spent in loading DLLs. I'll attach a sub-profile showing the time spent within nss_Init. The total number of non-polling timer hits was 13142, this profile covers 694, or 5.3%.
Bingo! Thanks.... The profile makes it very clear the massive amount of NSS startup time is reading all the built-in certs into the temparary DB. This is something we're changing in NSS 3.4, where all the certs aren't unloaded from a token at startup time. So the good news is this shouldn't be a problem for Mach5. The bad news there is not safe fix in the near term. bob
cc lord. I'll let Bob Relyea's comments stands.
Loading the libraries is still an issue, though.
At least one library needs to get loaded immediately in order to start accumulating input for seeding the cryptographic random number generator. This function (accumulating data) could be moved into the core libraries or modules, however nobody has volunteered to do this. If it's not done in the core, a PSM component library will need to be loaded. Is there an advantage to loading a small library instead of a larger one (in terms of start-up time)?
Maybe the DOM code that sends mouse event entropy could start with the 100th event rather than the 1st event (since it only sends one out of every hundred), so we can get a window up without loading the security DLLs?
It should not be necessary to load the security libs to collect entropy. Collection can be decoupled from consumption (that is, from feeding the collected entropy into the PRNG). See my comments dated 2001-05-16 13:45 in bug 80841 to see how PSM 1.x did this.
when is 2.2 milestone for PSM?
The next code name.
I'll attach a patch that removes the hack from nsAppRunner.cpp and replaces it with a hack in the NSS module and provides the architecture for an entropy collector that is not in the NSS module. The gap in this patch is that I'm not sure of the correct algorithm for accumulating the entropy. That needs to be filled in. However, with this patch, there's still something that's loading the library from JS code.
It's probably creating the "secure browser ui" object for the window. This object determines the security status of the window (secure, mixed, insecure), sets the state of the lock icon, and displays warnings about posting forms to insecure sites. It also provides the data that is used for displaying the security status of the window in "Page Info".
We could probably move nsSecureBrowserUIImpl out of libpipnss, couldn't we? mozilla/xpfe/browser seems like a logical place to put it. It doesn't depend on any NSS code, and no PSM code aside from throwing the dialogs for transitions between secure and non-secure sites.
Does that mean we'll have to move the definition of said dialogs (which are probably defined in manager/ssl/public)?
Javi, currently nsSecureBrowserUIImpl uses a direct function call to getNSSDialogs to request the pointer on which to execute the dialogs. But the obtained pointer is already defined as an interface. If we move this function call to an interface, could this be sufficient to move nsSecureBrowserUIImpl out of the NSS component?
1) I answer my own question. It seems to work. I'm attaching a patch to move the file over as bryner suggested. It is not yet a perfect patch, but is sufficient for testing whether moving this over reduces startup time. One thing I have not yet cared for is the string bundle. The moved source file only requests one single string from the bundle, so this string could easily be moved over, too. To test this patch, first apply it, and then "cd" to the mozilla directory, and execute: mv security/manager/ssl/src/nsSecureBrowserUIImpl.* xpfe/browser/src/ dbaron, do you want to test this? Or can I help? 2) I looked at your patch. One thing that you have left out (and that we should re-add if we take your patch) is the error message + aborting code, that exits Mozilla if the security component can not init the certificate databases.
Not starting the entire browser if the certificate database is corrupt is silly. My patch makes it so we can start the browser, but just not use any security code. That is sufficient to prevent the crash. If you'd like to add UI for that as well, go ahead, but please don't make it so that it's impossible to start the browser, and please don't do it by gross violations of modularity.
dbaron: i have a patch somewhere that enables mozilla w/ psm to run w/ a readonly security db because last i checked writability was a nice and fatal requirement :)
kaie, doesn't this create a build-time dependency on the nsINSSDialogs interface?
dbaron: Not starting the browser may seem "silly" to you, but that code was inserted to deal with a top crasher that was reported via Talkback and we got lots of pressure to deal with. Before you go ripping out code, please ask to find out why it was put there in the first place, and when appropriate replace it with code that provides the same functionality.
I know that. It *was* a silly fix for the given problem, and my patch still fixes that problem.
See bug 76915 for background on terminating the browser when the db is corrupt. A profile with a corrupted database is pretty useless. How long will it take for that user to run into serious problems? What will the failure modes of a browser session with no security (hangs, page not loading, what?) how many bugs would this generate? how many calls to a support center? How easy would it be to associate whatever problem is reported to the fact that the db cannot be open? At least now, the failure mode is easily documented, that is to say, supportable. The chosen fix should address this issues.
It will act exactly as the browser does when security isn't installed. That's a legitimate mode of operation that we support.
Mozilla supports embedding customers not including the security libraries. When the security libraries are part of the product (e.g., the Mozilla browser) and something happens that disables them, that's different. The patch to bug 76915 addressed that situation and should stay in.
Of course, your fix to bug 76915 doesn't work for any embedders that do include the security libraries, since it's breaks modularity in a way that the system was not designed to work.
If I start Mozilla and my global history database is corrupt, the program should not abort. If Mozilla finds a corrupted cache file, it should not abort. If it finds a bad BSD mailbox file, it should carry on as best it can. Likewise for the cert db. How much pressure was on someone to fix a topcrash notwithstanding! No design by emergency bug-avoidance hacking, please. /be
Commenting on the patch: Why is there no user feedback as to the fact that security initialization failed? If the code is running, then the product intends for security to be available. Up until now we've *always* opened certificate databases read/write and failure to do so was considered an error. The alert is a way to provide feedback to the user that something is wrong so that a user can tell the help desk technician "The browser says security failed to initialize." This is a big win when debugging problems and we should continue to provide users with such error messages. If embeders want the ability to run with read/only or no cert db's on disk then we should provide a way for the embeders to specify that. (File an RFE and it'll get prioritized.) We shouldn't just continue as if nothing is wrong if NSS failed to initialize in the manner we've been doing since Comm 4.x days. This can lead all products to have a false sense of security, which IMHO would be bad. At the least we need to let the user know something is wrong, so the poblem can be fixed.
I think I now understand what David says. I haven't tried it yet, but looking at his code, the following is what he did: David changed the way the instantiation of PSM objects works. The standard code (used by any other component) just creates an instance. David suggests to use a changed version for our module. He suggests to let instantiation happen conditionally. The first time any PSM object is requested, a test code is executed, and the result of the test is remembered. This test tries to create the nsNSSComponent singleton, which tries to init NSS. If something goes wrong during the first test, the error condition will be remembered. For all requests, by any part of the application, for PSM services, the security module will return a failure. David also made sure that both of our libraries, pki and nss, both are either working or nonworking. This means, if we can not init NSS, the application would run as if no security were linked with the application. This is much more clever solution than what I did in bug 76915. We are free to add a GUI notification. We need to decide whether we still need the low level GUI routine for user feedback, or whether we consider it now unlikely that the failure happens in a very early stage of the application - where XUL is not yet available (as we saw in bug 76915). But if move the entropy collector out of the security library, it indeed seems unlikely that XUL is not available when the first problem shows up. We are free to add an error message to the constructor of NSS-Component - which will be shown once only. Davids patch is part of the work of solving this particular bug. If we want to delay loading of the security component, we no longer know when the first PSM instance will be created. And with his patch he solves that problem.
> This is a big win when debugging problems and we should continue to provide > users with such error messages. Fine, but do it from within your module rather than cluttering up nsAppRunner.cpp, which is designed to start the app (and only Mozilla the browser, not any embedded apps), not to be a dumping ground for any code anybody feels like putting there.
taking
Status: NEW → ASSIGNED
hmm, sorry for the spam, I thought clicking assign would reassign it to me...
Assignee: ddrinan → kaie
Status: ASSIGNED → NEW
I did! bug 105167 2001-10-17 kaie@netscape.com NEW Client L All junruh@netscape.comMozilla w/ NSS won't start w/ Readonly profile There's even a patch.
I will work on a patch that puts all pieces together. The entropy collection will be moved out of the security components completely. David suggested to put it into the dom directory, so be it. To gather the entropy I will use the following strategy (unless somebody objects). The entropy collector will be used immediately. It will use an internal cyclic cache of entropy collected so far. I will extend the interface of EntropyCollector with the new function: RegisterEntropyConsumer. As soon as the security component is loaded, it will run its init code. During that code, the entropy collector service is accessed, and the security component registers itself as a consumer. For now, to simplify things, we will not support multiple consumers, but only one. Once a consumer is registered, other registrations will be rejected. (The security component will unregister itself when it is unloaded, to allow for later re-registration, in case this will ever be necessary.) The method RegisterEntropyConsumer will pass over the contract id that should be used to access the consumer. The collector will use GetService and cache the pointer to the consumer. Immediately the collector will flush its cache, by invoking a method on the consumer to hand over all entropy collected so far (that is still in the cache). Question: How many bytes should the entropy collector use for its cache? This should be the number of bytes wanted by the security component immediately, and will define the additional memory usage before the security component is loaded. Once the collector has a registered consumer, it will no longer use a cache, but free the memory, and pass over all collected entropy immediately.
Status: NEW → ASSIGNED
These comments are in response to the question asked just above. PSM 1.0 had a very good design for this. I suggest you follow that design where it makes sense. 4 KB entropy buffer. If more than 4KB of data gets put into it, then it wraps around to the beginning of the buffer again. It doesn't just overwrite the old data with the new. It combines them like so: static unsigned char buf[4096]; static unsigned int in; static unsigned long totlen; update(unsigned char * src, unsigned int len) { totlen += len; while (len > 0) { unsigned int old = buf[in]; buf[in] = ((old << 1) | (old >> 7)) ^ *src++; in = (in + 1) & 4095; } } When the consumer starts, it is given MIN(in, totlen) bytes from buf. The buffer then starts over, e.g. in = totlen = 0; > Once the collector has a registered consumer, it will no longer use a cache, > but free the memory, and pass over all collected entropy immediately. In PSM 1.x, even after the consumer is started, input data continues to be buffered. Data is only given to the consumer when PSM is going to do something that might need new random values (like start a new SSL connection, or sign or encrypt a new message). This improves efficiency, fewer function calls per byte consumed on average.
Attachment #55670 - Attachment is obsolete: true
Attachment #56710 - Attachment is obsolete: true
Attached patch Suggested fix (obsolete) (deleted) — Splinter Review
This patch should fix this bug. It uses dbaron's strategy to delay loading at the module level. I extended the buffering entropy collector, using Nelson's strategy. In one point I did not yet follow Nelson's suggestion. My understanding is: With Nelson's strategy (to buffer always), the security component must be actively requesting entropy. Currently, PSM doesn't do that. It just forwards entropy to NSS using PK11_RandomUpdate as it becomes available, but never actively requests entropy for consumption. Therefore we disable buffering (and switch to forwarding) as soon as PSM is loaded. I added a warning message to the constructor of the NSS component. Once this gets loaded, and initializing NSS fails, a warning will be shown to the user. We might want to enhace this warning text later. As a base I used the text we already had in AppRunner, and added a statement that it is recommended to quit the browser and fix the problem. But if we fail, the application will continue to run. The warning message is: "Could not initialize the browser's security component. The most likely cause is problems with files in your browser's profile directory. Please check that this directory has no read/write restrictions and your hard disk is not full or close to full. It is recommended that you exit the browser and fix the problem. If you continue to use this browser session, you will see incorrect browser behaviour when accessing security features." I removed the additional files for the AppRunner warning text. I moved over nsSecureBrowserUI to the dom module, the same place where the buffering entropy collector lives. I split two interface files. Only two small interfaces had to move to dom, and mozilla/dom/src now depends on uriloader. I tested that the buffering and forwarding code works. I tested that building works with --disable-crypto, too. There is one drawback, that I think is acceptable temporarily, but should be fixed soon (that's a new bug): PSM uses XUL overlays that are hooked into the preferences. These are loaded and displayed to the user, even if PSM can not load. Basically we have now broken functionality. The controls are shown, but they are shown incorrectly, not filled with content, and behave incorrectly. I guess we can not prevent that those preferences are shown. If we can, please tell me how to :-) If we can't, we could do the following: Let's disable all the controls initially in XUL. We might extend the onLoad handlers for every overlay. There we try to access the PSM service. If that doesn't work, we keep the controls disabled. If it works, we enable the controls to the initially wanted state. The good thing is that I didn't see any crashes. When PSM is not available, security is just not available. I only saw JavaScript exceptions caused by the preferences JS code. Note that files nsSecureBrowserUI.h and .cpp are included twice in the patch for easier reviewing. Once completely at the new location. Another time at the old location showing only what was changed in the files. Who can review this patch?
adding patch, review keywords
Keywords: patch, review
I found another issue that applying this patch will create. The constructor of nsNSSComponent registers content listeners for 4 mime types in order to support downloading and importing certificates from websites. If a user tries to download a certificate (for example in preparation to accept a new Certification Authority) before PSM has initialized, the mime type will be unknown and the user will be prompted to save the certificate to disk (which is not what is wanted). If I understand this correctly, to make it work in that case, the downloading of certificates has to be re-implemented using nsIContentHandler and registered with NS_CONTENT_HANDLER_CONTRACTID_PREFIX"mimetype" contract ids (See also bug 108600).
I actually need to make this bug dependent on bug 108600. On my system, if I apply this patch, Mozilla crashes when I try to download a certificate from a website, prior to loading PSM. My patch in bug 108600 fixes the problem.
Depends on: 108600
Comment on attachment 57094 [details] [diff] [review] Suggested fix Make sure licenses for new files are correct. Get someone on dom team to review dom changes. Other than that: r=javi for security changes.
I *really* don't like the idea of having all this non-DOM related code in the DOM library, I'm happy to provide whatever hooks it takes to make this work as we want it to work, but moving this code into the DOM seems just wrong. Would it not make more sense to create a little static library in the security code and make the DOM link statically with that library?
jst told me Mozilla does already use this strategy in mozilla/content/shared, which builds a static lib, and gkcontent.dll and gklayout.dll links with the static shared lib. I will look at the example and create a new patch.
Depends on: 109777
Javi asked me to test whether obtaining a certificate with this patch still works. While testing, I found bug 109770, which blocks this one.
Depends on: 109770
When considering the library structure, remember that PSM is (currently) an optional (extension) module. Currently, NOTHING in the PSM area is built if BUILD_PSM2 is not set. That would include any static library and header files that you put there. There is no way to avoid putting some amount of code (or interfaces) in the base modules. If it's not in the DOM code, then fine, but it needs to live someone that is always included in the build.
jst: I looked at the mentioned example shared library. It does not implement any interfaces, but we need to do that in our code. I think we would have to add some code to your module anyway, at least to nsDOMFactory.cpp. Your wish would make your and our code more complicated, and we don't have a good example in the Mozilla code already lying around, but would have to invent something new. But basically I dislike the idea of inventing new hooks. We already have a method to use hooks, that's XPCom. But I would like to avoid adding yet another XPCom module. In my opinion, having many XPCom modules has a negative impact on the overall application overhead and should be avoided, too. If we are required to load one security related XPCom module during startup, what about the following suggestion: Currently, security does already use two different modules. What about moving all stuff that is required initially to the module PKI? That would mean, one small security module will be loaded during startup. But it will not be the the NSS module, it will be the smaller PKI module. dbaron: Do you think it would fix this bug if we only load the PKI library immediately, and move the code from the patch to PKI instead to DOM? To mention some numbers: Currently PKI module has about 10% of all security code. Not counting NSS, which reduces this number even more.
Do not add any implementations to the pki module that will force it to *always* be loaded. The purpose of the pki module is to provide default UI for security related tasks and nothing else. Embedders will implement the subset of UI operations they care about and I'm afraid that adding interfaces to pki that force it to get loaded every time will cause undesired conflicts with the implementations of embedders.
Thanks, Javi. I think that means we have to go with a 3rd XPCom module for security, that will (for now) only contain the code that we intended to move to dom. This means, the new small security library will always be loaded during startup, the other two libraries on demand. If you think, this is not acceptable, but a solution is sought where no security library is loaded at all during startup, please object now.
So we're going to create a brand new XPCOM interface to do that? XPCOM is one of the reasons why the application has performance problems. We're fixing this bug for performance reason as the request of the performance group. Could we revisit the options of having the code in the DOM code? Just "seems wrong" may not be sufficient reason to go to all this trouble. jst: can you elaborate why this is so bad to have this bit of code live in the DOM? Performance folks: if we're going to fix this for performance reasons, shouldn't we avoid adding bloat based on a code location issue. What we're trying to do is to get executable code executed at the proper location in the life of the application. In my humble opinion, achieving this at the lowest engineering cost, and in the most performance/footprint efficient manner may very well supercede objections at to where the text code lives. The end user doesn't care about the latter. My 2c.
We do need to make sure to preserve modularity where it makes sense. However, I would say since the dom module already provides the nsIEntropyCollector interface, it would make sense for it to provide a "stub" implementation.
Please see also discussion in bug 105167, which discusses whether automatic read/only mode for NSS should be used in case of unavailable read/write mode. This can influence the patch we create in this bug. It would simplify it a little, as the trick to "disable PSM from loading if NSS can not initialize" would become obsolete. This trick would not work for the "switch profile during runtime" case, where NSS needs to be re-initialized. However, all the other above questions still apply.
I will re-implement this using a new 3rd XPCom module.
Fixing this bug will even have a greater impact than I thought. I saw that currently, if one has at least two profiles, NSS is actually initialized twice. Init - profile selection box shows up - shutdown - init again. Once we fix this bug, both inits will be avoided during startup. However, as NSS init takes some time, the same destruct and init sequence might happen again at later times. I wonder whether we should add code that makes sure NSS is at most initialized once per session (at least not destructed until the profile is changed). To make sure the NSS service does not get destructed, because currently nobody holds a reference to it, should we make NSS increment its own reference count? As a sidenote, I wonder whether this applies to other global services as well, being created and destroyed multiple times, hurting performance.
Blocks: 99566
I had a chat with Conrad Carlen. He said during profile creation, unloading/loading of NSS will happen even more times, around 4 times. Anyway, even if it is only two times it is two much I think. I will use the trick, to use NS_ADDREF_THIS in the constructor of nsNSSComponent. This instance can live for the whole session. In "turbo mode", it can live between profile switches, too, and will care on its own for re-initializing of NSS (as implemented in bug 99566). This imposes the risk that we leak on shutdown. Actually we are already doing this, caused by code in nsNSSIOLayer.cpp (InitNSSMethods). I will remove that other leaking reference and just make sure that NSS has already been initialized. But we should avoid the leak and make sure we actually clean up on shutdown. That includes making sure that NSS_Shutdown is actually called when we exit the application. We don't do that currently. To accomplish this, my idea is to register the nsNSSComponent with the observer service and watch for an application shut down event. We we see that event, we call NS_RELEASE_THIS. Conrad suggested the event NS_XPCOM_SHUTDOWN_OBSERVER_ID should be appropriate.
>Anyway, even if it is only two times it is two much I think. I will use the trick, to use NS_ADDREF_THIS in the constructor of nsNSSComponent. > >This instance can live for the whole session. Why not just make it a service?
Hmm, what exactly do you suggest? I thought it is already a "service" because it is only constructed using do_GetService, and designed to be a singleton. However, the first reference to this service only lives for a short time during startup. And when this reference is freed, the instance is destroyed, NSS unloaded. Is there a possibility to declare something as a service, that won't be freed once it is loaded, which I'm not aware of?
> He said during profile creation, unloading/loading of NSS will happen even > more times, around 4 times. The reason this happens is that the creation of the component fails so it's never assigned to the service mgr which would maintain a ref to it.
Blocks: 105167
Attached patch Updated patch / security directory only (obsolete) (deleted) — Splinter Review
This is a reimplementation of the previous patch using a new 3rd XPCom module that lives inside mozilla/security. I was working on several related bugs together this week and I realized they are correlated and the patches would have had conflicts. Therefore, in this patch are included the fixes for bug 99566 and bug 105167, too. Because of 105167, we now always support some kind of read-only mode. Not only we have been asked to support that - there are good reasons to do it because of profile switching. The "NSS read/write works" environment can cange between different profiles. As NSS/PSM init no longer depends on the profile directory or good certificate databases, I removed the trick with IsPSMWorking from the factory construction code. We still need to make sure that PSM behaves nicely while in NSS fallback mode (this includes, as described above, XUL preference overlays enhancements), but that's a different bug.
Attachment #57094 - Attachment is obsolete: true
Comment on attachment 58021 [details] [diff] [review] Updated patch / security directory only r=javi (With caveat of removing the debug statements from Makefile.in and removing the DLL variables in the IDL makefile.win.)
Attachment #58021 - Flags: review+
No longer depends on: 96058
I was going to ask about making a plugin to handle the certificate mime types. If the plugin were called the plugin could init psm on demand. But then i remembered, our arch doesn't trigger plugins for links, only inline content.
timeless: the right bug for discussion would be 109777. I got new ideas from bbaetz, he suggested to use nsIDocumentLoaderFactory. I will look into it and post my results in bug 109777.
Attached patch Fixed patch (obsolete) (deleted) — Splinter Review
After testing some changes were necessary, here is the complete list: - makefile changes as requested by Javi - updated REQUIRES list in Windows makefile - add missing NS_INIT_ISUPPORTS() to nsEntropyCollector::ctor - move mutex deletion from beginning to end of nsNSSComponent::dtor - return success from nsNSSComponent::Initialize, even when working in NSS fallback mode - When I removed the special factory macro, I ignored the fact that NSS must still be loaded before any other instance in the NSS module is created. I re-added the adjusted factory macro - but now triggering init only, no check being made, and to module nss only, not to module pki.
Attachment #58021 - Attachment is obsolete: true
+ else if (nsCRT::strcmp(aTopic, NS_XPCOM_SHUTDOWN_OBSERVER_ID) == 0 + || nsCRT::strcmp(aTopic, "quit-application") == 0) { + PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("nsNSSComponent: XPCom shutdown observed\n")); Are you sure you want to observe "quit-application" for this as well? I think that notification gets sent when the user exits from the file menu. With turbo mode, this doesn't mean that the app is really shutting down. I think observing xpcom shutdown is enough.
Comment on attachment 58081 [details] [diff] [review] Fixed patch r=javi
Attachment #58081 - Flags: review+
Conrad: Ok, if you think "xpcom-shutdown" is enough, I'll remove the "quit-application" observation code before I check it in. I registered for that additional event, because I saw problems while testing, not receiving any notifications. But I meanwhile learned that there is a known issue - see bug 110226.
Note that in my previous attachment, about 50% of the patch is caused by two files who move to a new directory. The files are nsSecureBrowserUIImpl.h and .cpp. When reviewing, you can ignore those 4 hunks in the patch, and look at this smaller file instead. It shows you the changes that I made to those files.
In light of bug 110226, I'll have to take that back :-/ If that isn't fixable right away, just check that "quit-application" doesn't come at an unexpected time with turbo mode.
conrad was right the first time - use "xpcom-shutdown" Now the trick is that you don't want to ever remove yourself from the xpcom-shutdown notification - just let yourself be removed by the observer service when the product shuts down. That bug about observer service corruption happens when an observer is removed during notification. ignore "quit-application" - its a bit of a hack, anyway.
Problem is, with bug 110226, if the observer *before* this one removes itself, we wont get notified. We'd have to yank calls to RemoveObserver() in response to xpxom-shutdown all over the place.
Attachment #58081 - Attachment is obsolete: true
Attachment #58122 - Attachment is obsolete: true
Comment on attachment 58153 [details] [diff] [review] To make reviewing easier / Changes of moved files only ok, it all looks good. sr=alecf
I want to make one additional change, which is necessary. In my patch, in method GetNSSDialogs, I use weak typing, by using nsISupports** as an argument. I need to change that to be strongly typed to nsISecurityWarningDialogs**.
Comment on attachment 58152 [details] [diff] [review] Updated Patch / as discussed with alecf Updating this patch to indicate that it has r=javi and sr=alecf.
Attachment #58152 - Flags: superreview+
Attachment #58152 - Flags: review+
Comment on attachment 59366 [details] [diff] [review] The additional little change I need to make sr=alecf
Attachment #59366 - Flags: superreview+
Comment on attachment 59366 [details] [diff] [review] The additional little change I need to make r=javi
Attachment #59366 - Flags: review+
Sorry for the patchwork. Conrad helped me do more testing. We found two issues, which the attached patch corrects. 1) I did not expect that NSS could be initialized even before a profile is selected. This happens in the "commercial application activation" phase. Therefore I'm adding safety checks to the profile event observation code. 2) Our special factory code needs to be more clever. If the nsNSSComponent itself is the first object requested, the "haveLoaded" flag needs to be set immediately, in order to prevent a recursion. This can occur because the nsNSSComponent class triggers the factory code in its init phase.
Comment on attachment 59439 [details] [diff] [review] Another required incremental patch Seems that adding the parameter to the NS_NSS_GENERIC_FACTORY_CONSTRUCTOR is unnecessary since all instances of the macro pass in PR_FALSE anyway. I don't see how that parameter will ever be true, since teh value is just passed through to sub callers. How does this solve the problem?
Javi, the patch contains the following line which has the parameter set to true: +NS_NSS_GENERIC_FACTORY_CONSTRUCTOR_INIT(PR_TRUE, nsNSSComponent, Init)
Comment on attachment 59439 [details] [diff] [review] Another required incremental patch OK. r=javi
Attachment #59439 - Flags: review+
Comment on attachment 59439 [details] [diff] [review] Another required incremental patch sr=alecf but I really think you ought to re-think NS_NSS_GENERIC_FACTORY_CONSTRUCTOR_INIT and consider using something like NS_IMPL_NSGETMODULE_WITH_CTOR_DTOR I really don't understand why this has to be so complicated. where are you passing in PR_TRUE to this?
Attachment #59439 - Flags: superreview+
If we can find a simpler way, sure. Let me summarize what we want: ----------------------------------------------------- Class nsNSSComponent, which implements several interfaces, does the core initialization. We must not instantiate any other class from this module, until the nsNSSComponent service has been created, because of the required (conditional) NSS init. Therefore, we need to hook into instance creation. We do that by using a modified version of NS_GENERIC_FACTORY_CONSTRUCTOR. We modified it to include a call to EnsureNSSInitialized. If anything else but nsNSSComponent is requested, the function create its first, before the requested object is created. Before the previous patch, we used the standard factory constructor the nsNSSComponent. But, this lead to the following: - nsNSSComponent is created - before the service had the chance to return from its init phase and become globally registered, the class itself needs more instances from our module (for content listener registration) - in order to create the other instances, our special factory code is reached - this sees that the indicator variable "haveLoaded" is not yet set, and tries to create it again => recursion The previous fixes that, it makes sure our haveLoaded indicator variable is already set during nsNSSComponent creation, so that later factory code invocations will not try to do it again. ----------------------------------------------------- That said: I wasn't aware of the existence of NS_IMPL_NSGETMODULE_WITH_CTOR_DTOR. It seems you can execute code when the module is loaded. In theory, we could use this to do our initialization. But without rewriting our code, the following pseudo were required within that module constructor: do_GetService(nsNSSComponent) Without knowing better, I'd guess it is not possible to create an instance from the load-in-progress module at this point in time. If you say, this is possible, I can make a simple change to get rid of the modified macros and can test whether it works. If you say, it is NOT possible, we would have to rewrite the way the nsNSSComponent does its initialization. But note the NSS init is not trivial. We need to obtain misc information and query other services in order to do the right thing, which all must be allowed from within the module constructor.
Blocks: 101263
all patches checked in, fixed
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
this is awesome!!! tinderbox is showing: -78.9% on leak ( 1.30MB down to 274KB ) -12.4% on heap size ( 9.75MB down to 8.54MB ) -5% on startup ( 6.85s down to 6.59s ) Thanks to kaie, dbaron and everyone else who helped!!!
The leak fix might well be from waterson's checkin for bug 45353, you should ask him if it's about that magnitude. If PSM is really leaking that much then it's something we've got to fix -- it will eventually start up, this only delays it.
No, there was another tinderbox building without --enable-crypto (on Ports) that showed a leak drop of 28K due to waterson's checkin (and no bloat (MH) gain). That implies this is almost entirely this checkin, which is what was expected from our previous measurements of NSS memory usage.
Verified.
Status: RESOLVED → VERIFIED
Product: PSM → Core
Version: psm2.1 → 1.0 Branch
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: