Closed Bug 35509 Opened 25 years ago Closed 16 years ago

Hooking up automatic 4.x address book import in the mozilla code base

Categories

(MailNews Core :: Address Book, defect, P3)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: mscott, Unassigned)

References

Details

(Whiteboard: workaround comment 52)

Attachments

(2 files, 2 obsolete files)

This would be a great starter bug for someone looking to get their feet wet in
mozilla.

In 4.x we used a proprietary database for your address book. So there's no way
to upgrade a user's 4.x address book using mozilla. You need the netscape
commercial build.

Here's John Friend's nifty posting to the newsgroup summarizing an approach you
could take in the mozilla build:
Here's an idea for mozilla.  I wonder if a very clever mozilla installer could
run the

"-export [LDIF_file_name]"

command line switch on the Windows version of 4.x to export its address book to
LDIF before it installs mozilla and then after installing mozilla automatically
suck that in.  All of this, without ever having to read the Neologic databases
directly in mozilla, using the export capabilities already built into the copy of
4.x already on the user's hard disk.

Some more comments of my own:
So during the migration process of a 4.x profile, you want to call -export using
the netscape 4.x client. You want to export the address book to an ldif file
called abook.mab and you want to put this file in the same directory as the 5.0
profile we are migrating to.

Then you are done! mozilla will automatically detect that abook.mab is an ldif
file and it will magically convert to the new address book format and that will
be the new personal address book for the new profile.
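
A minimal sketch of the export step described above, in C++. The function name
buildExportCommand and both parameters are hypothetical; only the -export switch
and the abook.mab file name come from this comment. It just builds the command
line and leaves running it to the migration code:

```cpp
#include <string>

// Builds the command line that asks an installed 4.x client to export its
// address book as LDIF into the new profile directory.
// netscapeExe and newProfileDir are hypothetical inputs supplied by the caller.
std::string buildExportCommand(const std::string& netscapeExe,
                               const std::string& newProfileDir) {
    // The target file name abook.mab follows the suggestion in this comment.
    const std::string target = newProfileDir + "/abook.mab";
    // Quote both paths so that directories containing spaces survive.
    return "\"" + netscapeExe + "\" -export \"" + target + "\"";
}
```

The migrator would pass the resulting string to whatever process-launching call
it already uses.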
re-assigning to help wanted
Assignee: mscott → nobody
Component: Back End → Address Book
Keywords: helpwanted
Summary: [HELP WANTED] Hooking up automatic 4.x address book import in the mozilla code base → Hooking up automatic 4.x address book import in the mozilla code base
*** Bug 33330 has been marked as a duplicate of this bug. ***
QA Contact: lchiang → pmock
Assign it to myself..
QA Contact: pmock → fenella
*** Bug 84267 has been marked as a duplicate of this bug. ***
*** Bug 86701 has been marked as a duplicate of this bug. ***
I know there are lots of bugs to be getting on with, but I really do believe
this bug is important because it makes things difficult (or complicated) for
Communicator 4.x users.

Many users will discard a new product if they cannot import their previous
settings and info. To assist the transition from older versions of
Communicator to Mozilla, implementing this feature would be very much welcomed.
QA Contact: fenella → nbaca
Can someone who works on the Mozilla profile migration/installer do this?
*** Bug 111159 has been marked as a duplicate of this bug. ***
*** Bug 115812 has been marked as a duplicate of this bug. ***
*** Bug 116713 has been marked as a duplicate of this bug. ***
*** Bug 116849 has been marked as a duplicate of this bug. ***
*** Bug 132525 has been marked as a duplicate of this bug. ***
BTW, this bug should be All/All. Linux users should also be able to import
Netscape 4.x addressbook..
*** Bug 145938 has been marked as a duplicate of this bug. ***
*** Bug 150644 has been marked as a duplicate of this bug. ***
*** Bug 144393 has been marked as a duplicate of this bug. ***
*** Bug 153372 has been marked as a duplicate of this bug. ***
*** Bug 154051 has been marked as a duplicate of this bug. ***
the suggestion from comment 0 requires Netscape 4 still to be present...


Wouldn't it be better if Mozilla was directly able to import the address book?
*** Bug 145381 has been marked as a duplicate of this bug. ***
*** Bug 145432 has been marked as a duplicate of this bug. ***
*** Bug 41590 has been marked as a duplicate of this bug. ***
Until this task is completed, the warning messages and documentation should be
changed.  Mozilla just warns you in stderr that "the addressbook migrator is
only in the commercial builds".  Initially, I didn't know what that meant, and
couldn't find anything about commercial builds in Bugzilla.  There are many bug
reports on importing address books, so it was a while before I found this one,
which explained why it didn't work.  The Mozilla help text on importing address
books currently says you can import Communicator 4.x (pab.na2) formats, but you
obviously can't, at least not directly.  The only option on the Import screen is
"Text file (LDIF, .tab, .csv, .txt)".

Finally, I hope that someone contacted Vialogix (the new name for Neologix) to
discuss this issue, rather than just assuming they wouldn't allow it.  Or was it
Netscape Corporation that decided Mozilla couldn't convert 4.x address books, to
help retain their market share?
*** Bug 164263 has been marked as a duplicate of this bug. ***
*** Bug 174208 has been marked as a duplicate of this bug. ***
*** Bug 178270 has been marked as a duplicate of this bug. ***
In Mozilla 1.2.1 (Solaris/sparc build), with a fresh profile, I see the
following message in the console when selecting File > Send Link...

'the addressbook migrator is only in the commercial builds'. Otherwise, nothing
happens.

To what extent is that message related to this bug?

(Hardware/OS changed to ALL/ALL per Frederic's comment)
OS: Windows NT → All
Hardware: PC → All
*** Bug 183919 has been marked as a duplicate of this bug. ***
*** Bug 188098 has been marked as a duplicate of this bug. ***
*** Bug 166250 has been marked as a duplicate of this bug. ***
*** Bug 160489 has been marked as a duplicate of this bug. ***
*** Bug 154509 has been marked as a duplicate of this bug. ***
*** Bug 149204 has been marked as a duplicate of this bug. ***
*** Bug 194763 has been marked as a duplicate of this bug. ***
*** Bug 225228 has been marked as a duplicate of this bug. ***
Flags: blocking1.6?
*** Bug 139899 has been marked as a duplicate of this bug. ***
Flags: blocking1.6? → blocking1.6-
Reverse Engineering of NA2 File Format Part 1 - Introduction

The NA2 file format (used by Netscape 4.79 and others) is a big allocable space
for data. Very little of the structure seems fixed, except for the very
beginning of the file, and maybe a few spots associated with indexing and
allocation. I believe allocation of space happens in 8 byte chunks, due to the
persistence of 8 byte data arrangements. Fortunately, we have no desire to
write to this format, which would be miserable.

The most common basic data structures present are a Big-Endian long integer, and
a null terminated string. I have noticed that the designers of this structure
have specified the length of most variable length fields I have encountered.

I am going to approach the structure of the file from the front end (even though
I am picking it apart from the other direction) because the front is the end
from which to write an import utility.
Reverse Engineering of NA2 File Format Part 2 - File Header

The first part I will discuss is the header of the file.

In 0136-0139 hex there is a big-endian long which specifies the offset of a data
structure I will call the Other Information List. This offset (0136 hex) may
vary, but I doubt it. If it does, the thing I am looking at starts on the 12th
byte after the first appearance of the word null. If it does vary in position,
then it is 36 bytes into a structure which starts at 0100 hex in my example, and
the starting offset would be given by the big endian long in bytes 0008-000b.

In 01fb-01ff hex there is a big-endian long which specifies the offset of a data
structure I will call the Email Address List. This offset may also vary. I am
looking at the 24th-27th bytes after the second occurrence of the word null (not
#null). If this offset varies, then it probably varies with the offset described
above; there is nothing that looks like it might be the start of a record
between the previous offset and this one - more on that later.

I suspect that there are more addresses encoded in this header portion of the
file, including some for mailing list structures and indices. We might be
interested in the former, but not the latter.
Reverse Engineering of NA2 File Format Part 3 - Extracting E-mail Addresses

Find the start of the Email Address List:

In 01fb-01ff hex there is a big-endian long which specifies the offset of a data
structure I will call the Email Address List. For example only, in my file the
bytes in 01fb-01ff hex are 00 00 0d 54 hex, which specify the offset 0000:0d54,
where we will find the Email Address List.
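
As a sketch of how those four bytes turn into an offset (the helper name
decodeBE32 is made up for illustration), the most significant byte comes first:

```cpp
#include <cstdint>

// Decodes a 4-byte big-endian integer from a raw byte buffer.
// The bytes are taken as unsigned so values of 0x80 and above stay positive.
std::uint32_t decodeBE32(const unsigned char* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}
```

With the example bytes 00 00 0d 54 this gives 0x0d54.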

At the example offset 0000:0d54 I have the data:
d0 01 00 00 00 01 00 02
I believe this little piece is a header to a record in the database. We'll run
into a lot more like this. It starts with one big byte (a record type?) on an
offset ending with a 0 or 8 hex, then a little byte, then usually the sequence
00 00 00 01.

Find Email Addresses:

Now we have a list of the offsets at which we will find email address
information. My example continues like this:
00 00 0e 60 00 00 00 02
The first 4 bytes are another big-endian long which is the offset of the first
email address.
The next 4 bytes (00 00 00 02) are the big-endian long representation of the
number 2. This might be a primary key.
The list continues in the next 8 bytes. For example only:
00 00 44 77 00 00 00 03
The first 4 bytes are the offset for the next email address. The second 4 bytes
are the big-endian number 3.
The list ends when the offset for the next email address is 00 00 00 00. I don't
know what happens when the list exceeds the space allocated. I will try that in
a bit.
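
A rough sketch of walking such a list, assuming the whole file has already been
read into memory. The names be32 and parseAddressList are hypothetical, and the
layout (8-byte header, then 8-byte entries, zero offset as terminator) is only
what I have observed so far:

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Reads a 4-byte big-endian integer at pos; returns 0 if pos is out of range.
static std::uint32_t be32(const std::vector<unsigned char>& buf, std::size_t pos) {
    if (pos + 4 > buf.size()) return 0;
    return (std::uint32_t(buf[pos]) << 24) | (std::uint32_t(buf[pos + 1]) << 16) |
           (std::uint32_t(buf[pos + 2]) << 8) | std::uint32_t(buf[pos + 3]);
}

// Returns (record offset, number) pairs from the list starting at listOffset.
std::vector<std::pair<std::uint32_t, std::uint32_t>>
parseAddressList(const std::vector<unsigned char>& buf, std::size_t listOffset) {
    std::vector<std::pair<std::uint32_t, std::uint32_t>> entries;
    // Skip the 8-byte record header, then walk the 8-byte entries.
    for (std::size_t pos = listOffset + 8; pos + 8 <= buf.size(); pos += 8) {
        const std::uint32_t recordOffset = be32(buf, pos);
        if (recordOffset == 0) break;  // a zero offset marks the end of the list
        entries.emplace_back(recordOffset, be32(buf, pos + 4));
    }
    return entries;
}
```

Fed the example bytes above, it would return the pairs (0x0e60, 2) and
(0x4477, 3).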

Read out an email address:

The offsets listed in the Email Address List point to another data structure.
It starts something like this:
c0 02 00 00 00 01
Then there is a 4 byte big-endian long which is the offset of the display name!
Then there is a 4 byte big-endian long which is the length of the display name!
Then there is a 4 byte big-endian long which is the offset of the nickname!
Then there is a 4 byte big-endian long which is the length of the nickname!
Then there are two bytes (mine are 00 02)
Then there is a 4 byte big-endian long which is the offset of the email address!
Then there is a 4 byte big-endian long which is the length of the email address!

The display name, nickname, and email address are encoded in null-terminated
strings. However I would suggest reading them using the length provided.
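
To make the layout concrete, here is a sketch that reads one such record from an
in-memory copy of the file. Na2Contact, be32, stringAt and parseContact are
names invented for this example; the field positions (6, 14 and 24 bytes into
the record) follow the description above, and dropping the trailing NUL is my
own choice:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Na2Contact {
    std::string displayName, nickname, email;
};

// Reads a 4-byte big-endian integer at pos; returns 0 if pos is out of range.
static std::uint32_t be32(const std::vector<unsigned char>& buf, std::size_t pos) {
    if (pos + 4 > buf.size()) return 0;
    return (std::uint32_t(buf[pos]) << 24) | (std::uint32_t(buf[pos + 1]) << 16) |
           (std::uint32_t(buf[pos + 2]) << 8) | std::uint32_t(buf[pos + 3]);
}

// fieldPos holds a 4-byte string offset followed by a 4-byte string length.
static std::string stringAt(const std::vector<unsigned char>& buf, std::size_t fieldPos) {
    const std::size_t offset = be32(buf, fieldPos);
    std::size_t length = be32(buf, fieldPos + 4);
    if (offset >= buf.size()) return std::string();
    if (length > buf.size() - offset) length = buf.size() - offset;  // clamp to the file
    std::string s(reinterpret_cast<const char*>(buf.data()) + offset, length);
    while (!s.empty() && s.back() == '\0') s.pop_back();  // drop the terminating NUL
    return s;
}

Na2Contact parseContact(const std::vector<unsigned char>& buf, std::size_t recordOffset) {
    Na2Contact c;
    c.displayName = stringAt(buf, recordOffset + 6);   // after the 6-byte header
    c.nickname = stringAt(buf, recordOffset + 14);
    c.email = stringAt(buf, recordOffset + 24);        // after the two extra bytes
    return c;
}
```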

So now we can read names and email addresses out of NA2 addressbooks!

This structure that had the location and lengths of the email addresses
continues (though we lose interest) with:
8 bytes of 00s
A 4 byte gobbledygook big-endian big number (probably important)
A 4 byte big-endian little number
A 4 byte big-endian long - the offset of Mystery Structure 1
A 4 byte big-endian long - the length of Mystery Structure 1
Reverse Engineering NA2 File Format Part 4 - "iNoD"s, Address books with more
than 32 email addresses.

In the previous section I posed:
"what happens when the [Email Address] list exceeds the space allocated"?

Here is the answer (after entering a seemingly endless stream of fake email
addresses). When the number of email addresses exceeds the capacity of the
email address list (32 in my case), Netscape creates an "iNoD" - maybe it means
index node. The header of the file points to the new "iNoD" - a list of address
lists, and the new "iNoD" points to the previously existing address list and to
new ones created to fill the expanding needs of the database.

An "iNoD" looks like this:
54 81 00 00 00 01 00 02 (might change ...)
69 4e 6f 44 (ASCII "iNoD")
4 byte big-endian long, total number of records the iNoD points to,

Then it has a bunch of these
4 byte big-endian long, address of first address list (or perhaps another iNod
if the iNod overflows?)
4 byte big-endian long, number of addresses stored in that address list. (This
is probably used for efficiency in writing to the address book.)

Once again it seems to end when the address specified for the next address list
is 00 00 00 00.

There does not appear to be any data anywhere indicating the lengths of the
email address lists or the "iNod"s. Both the email address list and the iNoD
have room for 32 entries - so this must be the case.
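
A sketch of telling an "iNoD" apart from a plain address list and reading its
entries from an in-memory copy of the file. InodEntry, be32, isInod and
parseInod are hypothetical names, and the code assumes the tag bytes spell the
ASCII text iNoD and that the entries start 16 bytes into the structure, as
described above:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct InodEntry {
    std::uint32_t offset;  // where the address list (or nested iNoD) lives
    std::uint32_t count;   // how many addresses that list holds
};

// Reads a 4-byte big-endian integer at pos; returns 0 if pos is out of range.
static std::uint32_t be32(const std::vector<unsigned char>& buf, std::size_t pos) {
    if (pos + 4 > buf.size()) return 0;
    return (std::uint32_t(buf[pos]) << 24) | (std::uint32_t(buf[pos + 1]) << 16) |
           (std::uint32_t(buf[pos + 2]) << 8) | std::uint32_t(buf[pos + 3]);
}

// True if the 4 bytes after the 8-byte header are the ASCII tag "iNoD".
bool isInod(const std::vector<unsigned char>& buf, std::size_t offset) {
    static const unsigned char tag[4] = {'i', 'N', 'o', 'D'};
    if (offset + 12 > buf.size()) return false;
    for (std::size_t i = 0; i < 4; ++i)
        if (buf[offset + 8 + i] != tag[i]) return false;
    return true;
}

// Reads up to 32 (offset, count) entries, stopping at the first zero offset.
std::vector<InodEntry> parseInod(const std::vector<unsigned char>& buf, std::size_t offset) {
    std::vector<InodEntry> entries;
    for (std::size_t i = 0; i < 32; ++i) {
        const std::size_t pos = offset + 16 + 8 * i;  // header + tag + total
        const std::uint32_t child = be32(buf, pos);
        if (child == 0) break;
        entries.push_back({child, be32(buf, pos + 4)});
    }
    return entries;
}
```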

Coming up next ... Algorithm for extracting Display Names, Nicknames, and email
addresses.
Reverse Engineering NA2 File Format Part 5 - Algorithm for extracting names and
email addresses.

First off we need to be able to convert those big-endian longs that show up all
over the place into longs for the current system. We are going to deal with
these annoying big-endian longs so many times that we might as well make a way
to read and convert them easily:

#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>

// Read a 4-byte big-endian long at the current position of the file.
uint32_t readBElong(std::istream& file) {
    // unsigned char, so that bytes of 0x80 and above are not sign-extended
    unsigned char BElong[4] = {0, 0, 0, 0};
    file.read(reinterpret_cast<char*>(BElong), 4);
    return BElong[3] + 256u * (BElong[2] + 256u * (BElong[1] + 256u * BElong[0]));
}

Now, given an offset in an address book file reference (ABfile), we need to be
able to read out the email address, etc.

// Read a string whose 4-byte offset is stored at fieldPos and whose 4-byte
// length is stored right after it, at fieldPos + 4.
std::string readString(std::istream& ABfile, uint32_t fieldPos) {
    ABfile.seekg(fieldPos);
    uint32_t stringOffset = readBElong(ABfile);
    uint32_t stringLength = readBElong(ABfile);
    std::string value(stringLength, '\0');
    ABfile.seekg(stringOffset);
    ABfile.read(&value[0], stringLength);
    return value;
}

void importAddress(uint32_t offset, std::istream& ABfile) {
    // Display name: offset at +6, length at +10
    std::string displayName = readString(ABfile, offset + 6);
    // Nickname: offset at +14, length at +18
    std::string nickname = readString(ABfile, offset + 14);
    // Email address: offset at +24, length at +28
    std::string emailAddress = readString(ABfile, offset + 24);
    // "Export" here just means printing one tab-separated line
    std::cout << displayName << '\t' << nickname << '\t' << emailAddress << '\n';
}

Now let's deal with those email address lists (not mailing lists)

void importAddressList(uint32_t offset, std::istream& ABfile) {
    for (int i = 1; i <= 32; ++i) {
        ABfile.seekg(offset + 8 * i);
        uint32_t addressOffset = readBElong(ABfile);
        if (addressOffset != 0) {
            importAddress(addressOffset, ABfile);
        }
    }
}

And those "iNoD"s

void importInod(uint32_t offset, std::istream& ABfile) {
    ABfile.seekg(offset + 8);
    if (readBElong(ABfile) != 0x694e6f44) {   // ASCII "iNoD" = 69 4e 6f 44
        // No tag here, so this is a plain email address list
        importAddressList(offset, ABfile);
    } else {
        for (int i = 1; i <= 32; ++i) {
            ABfile.seekg(offset + 8 + 8 * i);
            uint32_t childOffset = readBElong(ABfile);
            if (childOffset != 0) {
                importInod(childOffset, ABfile);
            }
        }
    }
}

And finally the main procedure:

void importAddressbookEmails(const char* addressBookFileName) {
    std::ifstream ABfile(addressBookFileName, std::ios::in | std::ios::binary);

    ABfile.seekg(0x01fb);
    importInod(readBElong(ABfile), ABfile);
}

Anyone want to do this?

I may get around to reverse engineering the other information and mailing lists.
If I don't then here's what I know - that hex offset I talked about at the start
(0136) contains the offset for "iNoD"s or lists of addresses for Mystery
Structure 2. Mystery Structure 2 contains:
a. addresses to records with addresses and lengths of strings
b. some sort of checksum of the string.
Neither the entry nor Mystery Structure 2 specifies which string is which part
(organization / home phone number / etc). Their order varies in Mystery
Structure 2. I am certain Mystery Structure 1 is related to all this, as it is
very small for address book entries with no other information.
I haven't looked at mailing lists at all.
Help Please,

I need the following files to continue to reverse engineer the na2 file format:
An na2 address book containing some mailing lists (my Netscape crashes when I
try to make one).

I need the following to begin reverse engineering of the nab file format:
An nab address book containing more than 32 email addresses, some mailing lists,
and a contact filled in with the field names (except email address which should
be filled in with email@address.com). So First Name should be First Name, Notes
should be Notes, URL should be URL, etc...

Your assistance in these matters is greatly appreciated.

Thanks in advance for your time and help,

Cedric
This is a C++ demonstration of a method for extracting display names, email
addresses, and nicknames from an na2 address book.

To compile this program using gcc type:
g++ na2dump.cc -o na2dump
To run it, supply as the first command line argument the path to an na2 address
book, as in:
./na2dump ~/.netscape/pab.na2
Reverse Engineering NA2 File Format Part 6 - The other information.

I have reverse engineered the email address lists and the other associated
information. I can't do the mailing lists without some help (see previous comment).

The data in the na2 file is divided in 2 parts - the names and email addresses,
and tables of strings. We pick up now where part 4 left off, in the structures
of the email addresses.

Mystery Structure 2 is a table of strings associated with an email address. Its
entries are each 8 bytes long. The first 4 bytes identify which piece of
information the string is. For example, the Notes field is coded 4e 00 00 00.
The second 4 bytes are a LITTLE endian identifier for the string. For example
only, one of mine was c3 6a 55 00.

Now for the structure of the lists of strings:
A 4 byte big-endian long at address 01d8 (the address I reported previously is
incorrect) is the address of either a table of locations of strings, or is the
address to an iNod of a table of locations of strings. If it is an iNoD, its
structure is the same as the structure for iNoDs to tables of locations of email
addresses. Otherwise it looks something like this:

90 81 00 00 00 01 00, followed by 1 small byte
Then there are up to 32 addresses to string records and their identifiers. They
are a total of 8 bytes long:
4 bytes big-endian long address of string record
4 bytes BIG-endian long identifier of string record, following my example, we
could have the entry:
00 00 0b 80 00 55 6a c3
The first 4 bytes are the big-endian address of the string record, and the
second 4 bytes are the BIG endian string identifier (00 55 6a c3 here, was c3 6a
55 00 in the table following the email address, etc.).
Once again a record with address 0 is to be ignored.
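
A small sketch of the byte-order mismatch described above (decodeLE32 and
decodeBE32 are illustrative helper names): decoding the identifier as
little-endian in the per-address table and as big-endian in the string table
yields the same number, so the two can be compared directly.

```cpp
#include <cstdint>

// Decodes 4 bytes with the least significant byte first.
std::uint32_t decodeLE32(const unsigned char* p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Decodes 4 bytes with the most significant byte first.
std::uint32_t decodeBE32(const unsigned char* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}
```

For the example, c3 6a 55 00 read little-endian and 00 55 6a c3 read big-endian
both give 0x00556ac3.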

The string records have a structure like this:
c0 02 00 00 00 01 (These bytes vary some - there are always 6 though)
4 Bytes address of string
4 bytes length of string
Then more stuff, usually:
00 00 00 01 00 00 00 00 00 00

Reading these strings and connecting them to the email addresses can at best
have algorithmic complexity of O(n*log(n)), and in the most memory efficient
method has complexity of O(n^2).
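
One way to get the O(n*log(n)) behaviour, sketched under the assumption that all
string records have already been read into an ordered map keyed by identifier
(filling the map is n insertions of O(log(n)) each, and every lookup is
O(log(n))). FieldRef and resolveFields are made-up names, and the field code is
treated as an opaque 4-byte value:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One 8-byte entry from the per-address table: which field, and which string.
struct FieldRef {
    std::uint32_t fieldCode;
    std::uint32_t stringId;
};

// Resolves each field reference to its string with one map lookup per entry.
// References whose identifier is not in the map are skipped.
std::vector<std::pair<std::uint32_t, std::string>>
resolveFields(const std::map<std::uint32_t, std::string>& stringsById,
              const std::vector<FieldRef>& refs) {
    std::vector<std::pair<std::uint32_t, std::string>> resolved;
    for (const FieldRef& ref : refs) {
        const auto it = stringsById.find(ref.stringId);
        if (it != stringsById.end()) {
            resolved.emplace_back(ref.fieldCode, it->second);
        }
    }
    return resolved;
}
```

The memory-frugal alternative is to rescan the string tables in the file for
every reference, which is where the O(n^2) comes from.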
Attached file: na2 format reading example (obsolete) (deleted)
Attached is a C++ program which converts an na2 address book to an LDIF file.
It serves as an example of how to read from an na2 file.

It implements the following:
    Reads email addresses, and all contact fields from contacts in an na2 file 
and outputs in LDIF format.
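
For readers who have not seen LDIF, here is a minimal sketch of writing one
contact. This is not the code from the attachment: toLdifEntry is a
hypothetical helper, the xmozillanickname attribute is my assumption about how
the nickname is labelled, and it only handles plain ASCII values (anything else
would need base64 encoding):

```cpp
#include <sstream>
#include <string>

// Formats one contact as an LDIF entry, terminated by a blank line.
std::string toLdifEntry(const std::string& displayName,
                        const std::string& nickname,
                        const std::string& email) {
    std::ostringstream out;
    out << "dn: cn=" << displayName << ",mail=" << email << "\n";
    out << "objectclass: top\n";
    out << "objectclass: person\n";
    out << "cn: " << displayName << "\n";
    out << "mail: " << email << "\n";
    if (!nickname.empty()) {
        out << "xmozillanickname: " << nickname << "\n";  // assumed attribute name
    }
    out << "\n";  // a blank line separates entries
    return out.str();
}
```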

It does not implement:
    Reading and exporting mailing lists. If you would like a version that does,
please provide an na2 file here which contains at least 2 mailing lists.

To compile:
g++ na2toldif.cc -o na2toldif

To run:
./na2toldif [path to an na2 file]
Examples
./na2toldif ~/.netscape/pab.na2
Write it to an ldif file:
./na2toldif ~/.netscape/pab.na2 > pab.ldif

What we need to do to swat this bug, and what you can do to help:
    1. Implement mailing list import:
	Please provide sample address books from Netscape 4.x containing
mailing lists. (I can't make them because the Netscape I have crashes whenever
I try to.)

    2. Reverse engineer the .nab format:
	Please provide sample address books from whenever Netscape was using
.nab files. I don't have a version that does this.

    3. Move code into the Mozilla tree
	I don't have a clue what to do ... I'll need a lot of help and advice
once we get to this point.
Attachment #137075 - Attachment is obsolete: true
>    3. Move code into the Mozilla tree

the license of the attachment here prevents that. mozilla code needs to be
MPL/GPL/LGPL tri-licensed.
Attached is an example of how to read from the Netscape na2 address book
format, in the form of an na2 to ldif converter.

Features:
    Released under the triple licence (MPL/GPL/LGPL) required for incorporation
into Mozilla.
    Reads email addresses, associated strings, and mailing lists.
    Now reads the header correctly.

Compilation:
g++ na2toldif.cpp -o na2toldif

Usage:
./na2toldif [path to na2 file] > [ldif file to save as]
Example:
./na2toldif ~/.netscape/pab.na2 > Netscape4.ldif

Notes:
The previous discussion here about the format of the file header, and the
initial offsets for email addresses, etc. is incorrect. These issues and the
format of the mailing lists are documented in the provided program.

Thanks to Christian Biesinger for pointing out the licence issue.

What do we do now?

--Cedric
Attachment #137100 - Attachment is obsolete: true
*** Bug 228291 has been marked as a duplicate of this bug. ***
Not sure, if this is still needed. Adding Navigator 4 address book containing 4
entries plus a list, containing 3 of the 4.
*** Bug 269850 has been marked as a duplicate of this bug. ***
Product: Browser → Seamonkey
Just one comment: why isn't this in the Installer component?

Installer Component: Import Wizard

Wouldn't it be more thorough just to ensure that Thunderbird supports Netscape
4.x using the import wizard in the manner described in comment 1, and then port
the Import wizard to the Mozilla Suite?

Why not make a separate executable utility that does the job? We link to it
from the release notes as a temporary workaround.

This doesn't seem to be impacting that many people anyways, because they've 
figured out that they can export from Netscape 4.x as LDIF and import into 
whatever mail client they desire.

Shouldn't that workaround also be in the release notes?
--Sam
Component: Address Book → MailNews: Address Book
Product: Mozilla Application Suite → Core
QA Contact: nbaca → addressbook
*** Bug 151974 has been marked as a duplicate of this bug. ***
*** Bug 348541 has been marked as a duplicate of this bug. ***
(In reply to comment #54)
> *** Bug 348541 has been marked as a duplicate of this bug. ***
> This is obviously a huge flaw and as a commercial user, I can't use this software. I was scolded for reporting two bugs on one post "no address import from  Netscape 4 and contact card screen sizing." This is one problem. Your address book doesn't work.
Instead of acting like bug police, please address your flaw.
Thanks
(In reply to comment #55)
> (In reply to comment #54)
> > *** Bug 348541 has been marked as a duplicate of this bug. ***
> > This is obviously a huge flaw and as a commercial user, I can't use this software. I was scolded for reporting two bugs on one post "no address import from  Netscape 4 and contact card screen sizing." This is one problem. Your address book doesn't work.
> Instead of acting like bug police, please address your flaw.
> Thanks

Further note: The ldif transfer successfully imports the Netscape 4 address book, however in a different format, i.e. groups are alphabetized with names.
Thanks for your help. Still need relief with bug #63941
> 
per bienvenu "Definitely wontfix - we would only do this if we had some large group of users asking us to do it, and that hasn't happened. And if it happens, we can always change our mind"

=> WONTFIX as several others concur
Status: NEW → RESOLVED
Closed: 16 years ago
Keywords: helpwanted
Resolution: --- → WONTFIX
Whiteboard: workaround comment 52
xref https://litmus.mozilla.org/show_test.cgi?id=5449
 Import Address Book from Another E-Mail Client
Product: Core → MailNews Core