Closed Bug 298044 Opened 19 years ago Closed 19 years ago

Dynamically load important dependent libs for embedders so that they don't have to setup the environment

Categories

(Core Graveyard :: Embedding: GRE Core, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.8beta3

People

(Reporter: benjamin, Assigned: benjamin)

References

Details

Attachments

(5 files, 2 obsolete files)

We've had for a long time the problem that on unix/mac you have to set
(DY)?LD_LIBRARY_PATH *before* you launch your process if you intend to embed
XPCOM/gecko. This is bad, and I finally have a workable solution:

When the embedder finds a compatible glue and calls XPCOMGlueStartup(), the glue
will first look for a dependent-libs.list and manually load each of these libs
in order, before it loads the xpcom sharedlib itself.

I intend as part of this patch to drop all the dependentLibs stuff that is in
nsIGenericFactory, as it has never worked the way it was intended and will no
longer be needed.
on linux, i thought that we fixed this problem with dependent libraries in our
compreg.dat file.
That only fixes the problem if 1) compreg.dat doesn't go away and 2) you have
LD_LIBRARY_PATH set up the first time you autoregister, both of which are faulty
assumptions in most embedding context and in toolkit applications.
dependent-libs.list sounds like a good solution to me.  it also seems a bit more
optimal since we will only need to process it once.
Blocks: 298047
Just a note that this patch does not add the dependentlibs.list file itself: I
need to do that from the xpcom/stub makefile or somewhere else useful, but it
depends on libxul/notlibxul/static/not-static so it's not a simple file-copy
operation.
Comment on attachment 188442 [details] [diff] [review]
Load dependencies explicitly from dependendlibs.list instead of relying on LD_LIBRARY_PATH, rev. 1

Needs a bit more work.
Attachment #188442 - Flags: review?(darin)
You sure you don't want to find some way to just compile prlink.c into the
standalone xpcom glue?
I looked at that possibility but trying to remove all the depenencies that
prlink has on other NSPR stuff (locks, string handling, etc) make the task
gargantuan compared to this.
Attachment #188442 - Attachment is obsolete: true
Attachment #188447 - Flags: review?(darin)
OK, it sounds like you've made the right choice then w.r.t. duplicating cross-
platform load library code.  It would probably be good to find some way to
simply not compile embedding support (xpcomglue) on platforms that lack an
implementation of this GRE loading stuff.  Otherwise, we will just have a bunch
of broken Firefox ports when Firefox doesn't even depend on this stuff.  Or do
you have some other plan for getting this ported everywhere?
The dynamic linking code is not hard to write: I figured that breaking various
ports would get them to write the correct code, plus I plan to notify various
port owners before landing this. I can send out that email now if the general
approach in this patch looks correct.
Comment on attachment 188447 [details] [diff] [review]
Load dependencies explicitly from dependendlibs.list instead of relying on LD_LIBRARY_PATH, rev. 1.1

>Index: xpcom/glue/standalone/nsGlueLinking.h

>+#ifndef nsGlueLinking_h__
>+#define nsGlueLinking_h__
>+
>+#include "nsXPCOMPrivate.h"
>+#include <stdio.h>
>+
>+#define XPCOM_DEPENDENT_LIBS_LIST "dependentlibs.list"
>+
>+NS_HIDDEN_(GetFrozenFunctionsFunc)
>+XPCOMGlueLoad(const char *xpcomFile);
>+
>+NS_HIDDEN_(void)
>+XPCOMGlueUnload();
>+
>+typedef void (*DependentLibsCallback)(const char *aDependentLib);
>+
>+NS_HIDDEN_(void)
>+XPCOMGlueLoadDependentLibs(const char *xpcomDir, DependentLibsCallback cb);
>+
>+#endif // nsGlueLinking_h__

It looks like you can remove the #include <stdio.h> from this
header file.  Also, it is usually a good idea to provide a
|void*| data parameter along with any callback function.  That
way future implementations could pass additional state along
to the callback function.  This isn't a requirement here since
this API could be changed in the future if needed, but I just
thought I'd mention it anyways.


>Index: xpcom/glue/standalone/nsGlueLinkingDlopen.cpp

>+static void
>+ReadDependentCB(const char *aDependentLib)
>+{
>+    void *libHandle = dlopen(aDependentLib, RTLD_GLOBAL | RTLD_LAZY);
>+    if (!libHandle)
>+        return;
>+
>+    AppendDependentLib(libHandle);
>+}

Maybe this function should output a warning in debug builds
when dlopen fails.


>+
>+GetFrozenFunctionsFunc
>+XPCOMGlueLoad(const char *xpcomFile)
>+{
>+    char xpcomDir[MAXPATHLEN];
>+    if (realpath(xpcomFile, xpcomDir)) {

According to the man pages on my linux system (based on redhat 9),
the second parameter to realpath should be PATH_MAX bytes long.
Is MAXPATHLEN equal to PATH_MAX?  The man page mentions that on
solaris the value should be MAXPATHLEN instead of PATH_MAX FWIW.


>+        char *lastSlash = strrchr(xpcomDir, '/');
>+        if (lastSlash) {
>+            *lastSlash = '\0';
>+
>+            XPCOMGlueLoadDependentLibs(xpcomDir, ReadDependentCB);
>+        }
>+    }
>+
>+    void *libHandle = RTLD_DEFAULT;
>+
>+    if (xpcomFile[0] != '.' || xpcomFile[1] != '\0') {
>+        libHandle = dlopen(xpcomFile, RTLD_GLOBAL | RTLD_LAZY);
>+        if (libHandle) {
>+            AppendDependentLib(libHandle);
>+        }

this looks a lot like ReadDependentCB.	should that function
maybe be called here?  though you would then need a way to
find out if dlopen succeeded.  maybe ReadDependentCB should be
called LoadDependentLib and return a status code.


>+        else {
>+            libHandle = RTLD_DEFAULT;
>+        }
>+    }
>+
>+    // XXXbsmedberg: some OSes prefix C symbols with "_", how do we know?
>+    GetFrozenFunctionsFunc sym =
>+        (GetFrozenFunctionsFunc) dlsym(libHandle, "NS_GetFrozenFunctions");

According to NSPR's prlink.c:

  /*
   * On these platforms, symbols have a leading '_'.
   */
  #if defined(SUNOS4) || defined(DARWIN) || defined(NEXTSTEP) \
      || defined(WIN16) || defined(XP_OS2) \
      || ((defined(OPENBSD) || defined(NETBSD)) && !defined(__ELF__))
  #define NEED_LEADING_UNDERSCORE
  #endif

Perhaps you should just duplicate the logic from prlink.c


>Index: xpcom/glue/standalone/nsGlueLinkingMac.cpp

Our low-level code tends to use the suffix "Mac" to mean OS9
and "OSX" to mean OSX.	Should we do the same here?



>Index: xpcom/glue/standalone/nsGlueLinkingWin.cpp

>+// like strpbrk but finds the *last* char, not the first
>+static char*
>+ns_strrpbrk(char *string, const char *strCharSet)
>+{
>+    char *found = NULL;
>+    for (; *string; ++string) {
>+        for(const char *search = strCharSet; *search; ++search) {

nit: space between "for" and "("

>+            if (*search == *string) {
>+                found = string;
>+            }

add a comment here explaining that you are going to keep 
searching until the end of the string.	(it might make
sense to compute the length of the input string up front
and then search the string in reverse order, but this is
probably fine.)


>Index: xpcom/glue/standalone/nsXPCOMGlue.cpp

>+void
>+XPCOMGlueLoadDependentLibs(const char *xpcomDir, DependentLibsCallback cb)
>+{
>+    char buffer[MAXPATHLEN];
>+    sprintf(buffer, "%s" XPCOM_FILE_PATH_SEPARATOR XPCOM_DEPENDENT_LIBS_LIST,
>+            xpcomDir);
> 
>+    FILE *flist = fopen(buffer, "r");
>+    if (!flist)
>+        return;
> 
>+    while (fgets(buffer, sizeof(buffer), flist)) {
>+        char *nl = strpbrk(buffer, "\n\r");

hrm... i thought that the win32 version of fgets converted
\r\n to \n, no?  you didn't pass the "b" flag to fopen, so
i think calling strpbrk here is unnecessary.  you should be
able to just test the last char of buffer for \n.


>+        if (nl)
>+            *nl = '\0';
> 
>+        char buffer2[MAXPATHLEN];
>+        sprintf(buffer2, "%s" XPCOM_FILE_PATH_SEPARATOR "%s",
>+                xpcomDir, buffer);

use snprintf.
Attachment #188447 - Flags: review?(darin) → review-
(In reply to comment #11)
> The dynamic linking code is not hard to write: I figured that breaking various
> ports would get them to write the correct code, plus I plan to notify various
> port owners before landing this. I can send out that email now if the general
> approach in this patch looks correct.

Yes, I like your solution to this problem.  Notify ports owners sooner rather
than later.  How about implementing a NULL version of these functions that
simply do nothing?  That way ports can use those versions to keep Firefox useful
until the ports authors have a chance to implement these functions.
I didn't display load warnings when dlopen failed because I'm pretty certain
there were cases where I expected dlopen to fail but the general load to
succeed. I'm hard-pressed to think of one right now.
Oh, and I fixed the "for(" in my tree (trying to keep multiple platforms in sync).
If you're applying this patch, I forgot to remove the #include
"nsINativeComponentLoader.h" from nsNativeComponentLoader.h
Priority: -- → P1
Target Milestone: --- → mozilla1.8beta3
Attached patch OS/2 glue linking code (deleted) — Splinter Review
(In reply to comment #15)
> I didn't display load warnings when dlopen failed because I'm pretty certain
> there were cases where I expected dlopen to fail but the general load to
> succeed. I'm hard-pressed to think of one right now.

Perhaps add a comment explaining this?
Attachment #188459 - Flags: review?(darin) → review+
Attachment #188459 - Flags: approval1.8b4?
Attachment #188459 - Flags: approval1.8b4? → approval1.8b4+
I checked the main patch in, and it caused some problems on a couple of -ports
where RTLD_DEFAULT is not defined (non-GNU toolchains, apparently?). In any
case, RTLD_DEFAULT is "(void*) 0" so I feel safe in using nsnull directly.
Attachment #189346 - Flags: review?(darin)
I (RH7.3 gcc2.96) don't have RTLD_DEFAULT either.
Attachment #189346 - Flags: review?(darin) → review+
Attachment #189411 - Flags: review?(darin)
According to bug 301043 comment #4, this has busted suite windows installer
builds. How can we get that fixed?
Blocks: 301043
Attached patch Patch for AIX (deleted) — Splinter Review
AIX supports dlopen/dlclose, so all that is needed to add support is to toggle
it in the makefile.
Attachment #189971 - Flags: review?(benjamin)
Attachment #189971 - Flags: review?(benjamin)
Attachment #189971 - Flags: review+
Attachment #189971 - Flags: approval1.8b4+
+ifeq (,$(filter-out WINNT WINCE,$(OS_ARCH)))
+DEPENDENT_LIBS_LIST += js$(MOZ_BITS)$(VERSION_NUMBER)$(DLL_SUFFIX)
+else
+DEPENDENT_LIBS_LIST += $(LIB_PREFIX)mozjs$(DLL_SUFFIX)

why this, in an xpcom directory?
AIX fix checked in. Thanks for the review.

Checking in Makefile.in;
/cvsroot/mozilla/xpcom/glue/standalone/Makefile.in,v  <--  Makefile.in
new revision: 1.22; previous revision: 1.21
done
Comment on attachment 189411 [details] [diff] [review]
Generate dependentlibs.list, rev. 1

r=darin
Attachment #189411 - Flags: review?(darin) → review+
The dependency chain is xpcom.dll -> xul.dll -> js3240.dll
ok hm, could that be generated from the libxul makefile then?
Checked in the dependentlibs.list patch.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Depends on: 444500
QA Contact: gre
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: