Closed Bug 177886 (rdf-bookmarks) Opened 22 years ago Closed 18 years ago

Store bookmarks in a RDF format

Categories

(SeaMonkey :: Bookmarks & History, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: kairo, Unassigned)

Details

As bug xbel was mis-used for discussion of changing our default bookmarks
format, here's a separate bug for that.

Copying some comments of that bug:

------- Additional Comment #8 From Robert Kaiser  2000-10-10 15:12 -------

And why not simply save it to an RDF file? It is already an RDF datasource
internally...
Mozilla could provide an export feature to this in "Manage Bookmarks..." which
writes Netscape style or XBEL or even a standards compliant (X)HTML file...


------- Additional Comment #9 From Ben Bucksch 2000-10-10 16:01 -------

KaiRo has a point. We don't need to use an exchange format internally, just for
ex-/imports. Changing SUMMARY from "Mozilla should use XBEL" to "Mozilla should
support XBEL".

------- Additional Comment #12 From Ben Bucksch  2000-10-11 13:08 -------

> XML is the logical, standards compliant language to use for storing bookmarks. 

Wrong. It is not the logical format. The RDF format we use internally,
serialized as XML, is the logical format.
If we use RDF internally during runtime, why not store it internally as RDF?
For "standards-compliant" (XBEL is no standard) data exchange, you can offer
options to export and import to and from XBEL and Netscape's HTML bookmark
format. In the UI, this means *no* difference to your proposal.

------- Additional Comment #16 From Ben Bucksch  2000-10-11 15:45 -------

I think, we're all on the same track now. I don't think, this bug is hard to
fix.
1. Remove the RDF->Netscape-Bookmark-HTML converter and just serialize the RDF
into XML and save that and the other way around (load) (you might find examples
for that in the localstore.rdf writer/loader).
2. Write XSLT "stylesheets" to transform Mozilla-Bookmark-RDF-XML<->XBEL and
Mozilla-Bookmark-RDF-XML<->Netscape-Bookmark-HTML (and possibly
Mozilla-Bookmark-RDF-XML<->some-sane-HTML :) ).
3. Hook up the XSLT-based converters in the UI. We will probably need XSLT
enabled in Mozilla for that.

------- Additional Comment #22 From Fabian Guisset  2001-02-01 05:11 -------

Since we already use RDF for the bookmarks management (i.e. in the chrome, we
use RDF trees, etc), I see no reason why the bookmarks.html file shouldn't
become a bookmarks.rdf file. We would not need to change the chrome that way. It
would also speed up a little the bookmarks since we wouldn't have to convert
from html to rdf. Then we can support XBEL as import/export language, as
Timeless suggested.
This is just my opinion.

------- Additional Comment #25 From Ben Goodger  2001-02-14 17:51 -------

My plan is to convert the bookmarks service to read and produce serialized RDF, 
and limit use of the HTML->RDF parser to import and export activities. This 
should allow for quicker development of new features, and provide a richer API 
for third parties manufacturing bookmark management utilities. 

I propose something like a XUL template in an XHTML document to produce a live 
view of the user's bookmarks that can be loaded into a browser content area.
It seems Bugzilla doesn't like my use of the bug's alias in the "bug xxxx" test.

OK, it's bug 55057 - hope it links it now :)

BTW, all comments in that bug were in favor of that change, nobody said anything
against this...
Please, please, please store as XBEL, and allow setting of where the XBEL file is. If mozilla supported using the KDE bookmarks file, I'd use it much more often. As it is though, I have to put up with an XSLT over my bookmarks. I don't actually get to use the bookmarks from the bookmarks menu though.  KDE has used XBEL for quite a while, and it's well-integrated into the system (meaning the bookmarks are not just used by Konqueror).  Please let me use my KDE bookmarks from Mozilla! Please let me bookmark locations in Mozilla and have them available from KDE!  (sorry, this will come across really strongly, but I believe that since you're still in the planning stage, an impassioned plea might actually sway you to what I believe is the right course of action :)  
Gah, sorry about the wrapping there... dunno what happened... 
 
Alias: rdf-bookmarks
[about dumping HTML as storage format:]
There is one big reason that should concern the standards-aware Mozilla community:
the bookmarks HTML format doesn't comply with any HTML standard, IIRC. And that's
a main reason to drop it. That argument would lead us to storing bookmarks as
XBEL, of course, as this is a standard.

Why I (and others) propose RDF is that it may save performance just to
dump/serialize the RDF datasource into an RDF file, and esp. to read it in from
there.

Of course, we need an import/export function (and that doesn't really depend on
how we actually store bookmarks), for supporting multiple formats such as XBEL,
Netscape-HTML, and even IE Favorites (yes, it would be a solution for all those
IE Favorites import problems).
Mike:
Just to state it correctly here, you're the _only_ one so far who proposed this
to be WONTFIX, also in the thread you mention.

And for storing data (and not web content), we shouldn't have ever been turning
to using HTML. That's NO data storing language!

So, whatever it turns out, it should be a data format, and most people agree
that it should be XML. If it ends up being RDF or XBEL or whatever, it can't be
HTML, if you really think about it (and, no, using the wrong format for years is
no argument here - why should I ever change away from any format that's being
used for ages? why should I turn from IE to any other browser when I've been
using it for years?)
No, I'm the only one who happens to know enough about RDF AND XML formats
technically and is _bothered_ to give a reason defending the use of HTML in
that thread. No one has given a good enough reason to store the thing in RDF. If
you're going to propose the use of RDF or some other attempt to 'XML'fy the
bookmarks, why don't you defend it in the thread? Why is HTML the "wrong" format?
Quite frankly, there will be a lot more pissed users when they discover that
their bookmarks have been turned into some machine language, unreadable to humans,
that they can't open than there will ever be people defending the use of RDF.
It's a storage format; no one is supposed to parse the thing except Mozilla, and
when it does, it's presented as RDF. If you had proposed this a few years or
months back I would have agreed with you, but learning enough about RDF and
'XML formats' over the years told me otherwise. Get over it...
Re: Comment #7 From Mike Lee  2002-11-04 15:59

> No, I'm the only one who happens to know enough about RDF AND XML formats
> technically and is _bothered_ to give a reason defending the use of HTML in
> that thread.

Well it appears that you're wrong ;-)

> No one has given a good enough reason to store the thing in RDF,

AFAIK, there's one very good reason, performance. As we handle it as RDF
internally anyway (AFAIK), there's no point in converting it back and forth all
the time.

> If you're going to propose the use of RDF or some other attempt to 'XML'fy the
> bookmarks, why don't you defend it in the thread? Why is HTML the "wrong"
> format?

- HTML was never intended for this kind of data storage. It was always intended
for creating links ("hyperlinks") between semantically similar pages. Netscape
happened to think that one could see a bookmarks file as an unordered
list of such links, but that doesn't really justify using HTML today.

- There is no version of HTML that has the elements used in Mozilla's bookmark
files. Have you actually seen the source of those files? It doesn't quite feel
like HTML! Ex.:

<H3 ADD_DATE="961102203" ID="NC:BookmarksRoot#$b742f58">Developer Information</H3>

- The advantage of using HTML for the bookmarks file is that I can open it in
any HTML reader and it will display just fine. Neat thing, but I can as well
export my bookmarks file to real HTML and it'll work as well.
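
To make the point about the <H3> example concrete, here is a minimal Python sketch that feeds that exact line to a generic HTML parser and checks its attributes against HTML 4. The class name BookmarkHeadingParser and the HTML4_H3_ATTRS set are made up for this illustration; the set is a hand-copied subset of the attributes HTML 4 allows on H3 (event handlers omitted), not an authoritative list.

```python
import re
from html.parser import HTMLParser


class BookmarkHeadingParser(HTMLParser):
    """Collect <H3> folder headings and their attributes (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.folders = []    # list of (title, attrs) tuples
        self._attrs = None   # attributes of the <H3> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names.
        if tag == "h3":
            self._attrs = dict(attrs)
            self._text = []

    def handle_data(self, data):
        if self._attrs is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self._attrs is not None:
            self.folders.append(("".join(self._text), self._attrs))
            self._attrs = None


line = '<H3 ADD_DATE="961102203" ID="NC:BookmarksRoot#$b742f58">Developer Information</H3>'

parser = BookmarkHeadingParser()
parser.feed(line)
parser.close()
title, attrs = parser.folders[0]

# Assumption: subset of HTML 4 attributes for H3 (core, i18n, align).
HTML4_H3_ATTRS = {"id", "class", "style", "title", "lang", "dir", "align"}
nonstandard = sorted(set(attrs) - HTML4_H3_ATTRS)

# HTML 4 ID tokens: a letter, then letters, digits, "-", "_", ":" or ".".
ID_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9\-_:.]*")
id_is_valid = ID_TOKEN.fullmatch(attrs["id"]) is not None

print(title, nonstandard, id_is_valid)
```

A tolerant parser reads the line without complaint, which is why the file displays in any browser, but ADD_DATE is not an HTML 4 attribute and the ID value contains "#" and "$", which HTML 4 ID tokens do not permit.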

> Quite frankly, there will be a lot more pissed users when they discover that
> their bookmarks have been turned into some machine language, unreadable to humans,
> that they can't open than there will ever be people defending the use of RDF.

I've never seen "I can human-read my bookmarks file in Mozilla" as an argument
for Mozilla and against Opera, Internet Explorer, etc. Do you really think that
many users will actually go to their profile directory (most people don't even
know where it is, just lurk in #mozilla for a week and watch people ask) to
be able to read their bookmarks?

> It's a storage format; no one is supposed to parse the thing except Mozilla

Exactly!
I must have been sleeping when I was taught data structures. Can you tell me how
the hell the performance will benefit from parsing a larger, more complex file
into memory? Not being sarcastic or anything, I think I'm really missing that point.
It's been bugging me since someone first mentioned it. 

About your HTML criticism: RDF was never designed as a bookmark data storage
format either. Have you ever seen innerHTML in an HTML spec? You can export RDF
if you want to.

If no one is ever going to touch the file, why convert it to RDF? It's a waste of
engineering effort with little to no return. Seriously, unless there is a real
'performance' benefit to using RDF, there is no reason to save it as an RDF file
that gives no benefit at all. At the very least, people can open the HTML version
using any browser as well as the many bookmark tools available.
Mike:
Your only argument seems to be that the file is larger and that can't help
performance, others think that it's wrong (and some of those others are really
Mozilla and data storage experts, look into the pasted paragraphs of comment #0).

But your argument doesn't count much as long as you can't prove it. So I propose
you make up a patch to store RDF, profile it, and give us some real numbers
about loading/saving times and file sizes, else I don't think we can much
believe you. And I simply don't believe people that are declaring themselves
experts and can't prove it other than repeating their arguments.

I may not be an expert either, but there are lots of other people who make the
same arguments as me, some of them being real experts who can prove that fact.
I quite respect Sören and I would like to see him provide an explanation of why
the RDF file will perform better. I was looking at the source code of the
bookmarks after reading his comment and I can't see how the performance would
increase. I'd be interested to hear Ben Goodger's opinion (on performance, which
wasn't talked about in the comment you quoted) and Chris Waterson's (sorry if I
shouldn't have cc'ed you).

Heh, you want me to make a patch to store it in RDF and prove it? I think it
should be the other way around; I'm AGAINST changing code. If you want to
change the code, YOU should be the one to come up with the numbers. Not to
mention I never declared myself an expert, I just happen to know some
programming principles.

By the way, I provided a lot of arguments. I even provided a solution of using
another RDF file if you want to be more flexible. See, we're on a thin line here:
the only argument that RDF has going for it is this "performance", which you
never backed up in the newsgroup. 

I have no intention of making enemies. See, outside of the Mozilla project itself,
mozblog is probably the biggest consumer of RDF out of all the add-ons. There is
no reason why I would be against further development of it unless I saw
something wrong with it. In this case, I think it's wrong. If RDF does in fact
give better performance, then I rest my case, and you can hack away at this new
RDF bookmark format.
Mike:
My real argument for this isn't performance - though I think it would be nice if
it was a performance improvement as well.
My real arguments are as follows:
1) Despite what you said above, HTML _is_ the wrong format, because:
  a) a "standards-compliant browser" should _never_ write out a document that
looks to be some official standard but doesn't validate against any of the
versions of this very standard. And I didn't find any official HTML standards
paper that makes the Mozilla bookmarks file a valid HTML document.
  b) HTML is a web document format, "the publishing language of the World Wide
Web" (see http://www.w3.org/TR/1998/REC-html40-19980424/), used to "represent a
hypertext document for transmision over the network" (from the first ever HTML
draft,
http://www.w3.org/History/19921103-hypertext/hypertext/WWW/MarkUp/MarkUp.html),
not a format to store a bookmarks tree along with bookmarks data, such as last
visited date, add date, icons etc.
  c) Our current HTML format is very hard to extend, and therefore it's
unnecessarily complicated to develop new features. See Ben Goodger's words from
2001-02-14 in comment #0
  d) It's not necessary to convert data to a completely different model just to
store it for internal use. It's good if we can convert that, but that's
import/export functionality, not storage functionality.
2) Many people would love to see some XML format because that's very very easy
to use/convert by other applications. You can even just use XSLT and create a
different XML format, even [X]HTML from it. (see some comments in bug 55057)
3) XBEL might have some restrictions when it comes to the argument in 1)c) - I
don't know it well enough, though, to know how extensible the standard is there; we
still might need to convert the logic of our internally stored data for it.
4) We already store the data in an RDF model in memory, so we need no conversion
of the logic to just serialize that RDF. Additionally, we can extend the RDF as
we want and we'd have a well-defined API, which would make it much easier for
3rd party developers to work with.

If file size is a concern, we might eventually think about compressing the file
using an internal gzip or something (as we already have at least the un-gzip
part in Mozilla).
The argument that 3rd party tools might rely on the current HTML format is no
argument for me, as we'd still provide exporting to that format, and new 3rd
party tools that use our XML-based, possibly RDF, format would evolve quite fast.
Additionally, it would be really easy to convert to the Netscape-HTML format as
well as compliant HTML, XBEL or anything else with our built-in XSLT engine...
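
As a rough illustration of "just serialize that RDF", the following Python sketch writes a small bookmark folder as RDF/XML and reads it back. Everything in it is an assumption made for the example: the bm: vocabulary and its example.org namespace are hypothetical, the dict layout is invented, and none of it reflects Mozilla's actual datasource code or vocabulary.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BM_NS = "http://example.org/bookmarks#"  # hypothetical vocabulary, not Mozilla's

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("bm", BM_NS)


def q(ns, name):
    """Build a namespace-qualified name in ElementTree's {uri}local notation."""
    return f"{{{ns}}}{name}"


def serialize(folder):
    """Write a folder as an rdf:Seq plus one rdf:Description per bookmark."""
    root = ET.Element(q(RDF_NS, "RDF"))
    seq = ET.SubElement(root, q(RDF_NS, "Seq"), {q(RDF_NS, "about"): folder["id"]})
    for item in folder["items"]:
        ET.SubElement(seq, q(RDF_NS, "li"), {q(RDF_NS, "resource"): item["url"]})
        desc = ET.SubElement(
            root, q(RDF_NS, "Description"), {q(RDF_NS, "about"): item["url"]}
        )
        ET.SubElement(desc, q(BM_NS, "name")).text = item["name"]
    return ET.tostring(root, encoding="unicode")


def load(xml_text):
    """Rebuild the folder dict from the RDF/XML produced by serialize()."""
    root = ET.fromstring(xml_text)
    names = {
        d.get(q(RDF_NS, "about")): d.findtext(q(BM_NS, "name"))
        for d in root.findall(q(RDF_NS, "Description"))
    }
    seq = root.find(q(RDF_NS, "Seq"))
    items = []
    for li in seq.findall(q(RDF_NS, "li")):
        url = li.get(q(RDF_NS, "resource"))
        items.append({"url": url, "name": names[url]})
    return {"id": seq.get(q(RDF_NS, "about")), "items": items}


folder = {
    "id": "urn:example:bookmarks-root",
    "items": [
        {"url": "http://www.mozilla.org/", "name": "Mozilla"},
        {"url": "http://www.w3.org/RDF/", "name": "RDF"},
    ],
}
xml_text = serialize(folder)
restored = load(xml_text)
print(xml_text)
```

The round trip needs no change of data model, which is the argument made in 1)d) and 4). The printed output also shows how verbose the serialization is compared with one line per bookmark.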
I already answered most of these questions; I'll do it here again for clarity.

1a. Define standard, there is simply no standard for bookmark storage.
1b. Look at the bookmark file doctype, it certainly doesn't declare itself as an
HTML format.
1c. Extension could be made through the use of an RDF resource subject, as I
described in the newsgroup.
1d. Do you mean we need to convert the history database into RDF as well since
we use RDF model?
2. I thought we're not supposed to act on the document directly and should go
through Mozilla. So why does it matter what format it is 'stored' in?
3. Not sure why you're talking about XBEL in a bug about RDF.
4. What logic? It's writing the entries out one by one, just as you walk the tree.
We already have a well-defined API that is used by 3rd party developers like me.
5. So size is definitely a negative point right now.

We're talking about the storage of bookmarks here. Anything to do with usage and
format conversion doesn't apply here, because that happens when the bookmarks are
read into memory and exposed through an RDF API. We can do all the conversion we
want there. That leaves us with a few things: flexibility, "standard", and
performance.

Flexibility: I already mentioned in the newsgroup how to extend the bookmarks
_without_ bloating them. Can you imagine the size of the bookmarks file if people
keep adding additional data fields to it? My solution allows the extension to be
independent of the bookmarks yet still allows you to access it as if it's one
datasource. 

Standard: how I love this topic. This is something people, including myself, keep
getting caught on. The fact is, the Netscape bookmark file format is the closest
thing to a standard bookmark format you can possibly get. It is certainly the most
widely understood bookmark format. It doesn't declare itself as valid HTML or
even just an HTML format. Quote: "<!DOCTYPE NETSCAPE-Bookmark-file-1>"; it's a
Netscape bookmark file. I must admit I only just realised that myself, but it sure
is a good one. It uses HTML tags, but I don't recall that using angle brackets
means it must be valid XML or HTML (is the angle bracket trademarked?). 

Just as Mozilla stores mail in mbox format, history in Mork format, prefs in
JavaScript format, calendar in vCalendar format, and cookies in tab-delimited
format, Mozilla stores bookmarks in Netscape bookmark format. Note that half of
this stuff is not a 'standard' way of storing the respective data because there is
NO standard, just like for bookmarks. Quite frankly, the Netscape bookmark format
is the de facto standard.

This leaves performance as the only advantage of using RDF. Performance is
pretty much still up in the air until someone much more knowledgeable about the
bookmark backend comes and makes a comment.
Mike, I'll ignore all your further comments (and I think you can ignore mine as
well in the meantime), until I hear some words from people who have real
knowledge of Mozilla's bookmarks system and/or can provide real data about
what's going on. We're just repeating arguments, with no new findings, other than
that you're the only one supporting your arguments, and I can state that some
others think the same as me or similarly.

Just some last comments:
> 1a. Define standard, there is simply no standard for bookmark storage.
I just said we write "NETSCAPE-Bookmark-file-1" into a .html file (which is
believed to be HTML - as in "a W3C standards-compliant file" - by almost anyone I
speak to), and "NETSCAPE-Bookmark-file-1" is a proprietary format, not an open
standard. An Open Source browser should not write such a format. We're NOT Netscape.
> 1d. Do you mean we need to convert the history database into RDF as well since
> we use RDF model?
It doesn't have to be RDF, it should be an open, preferably XML format.
The current history format is bad as well, it just doesn't matter that much
because I don't think people would want to synchronize history with other clients
or such things. Many more people want to have access to bookmarks than to history.
> 2. I thought we're not supposed to act on the document directly and should go
> through Mozilla. So why does it matter what format it is 'stored' in?
I never assumed 3rd party applications should always contact a running Mozilla
(it might quite often not be running) when wanting to access bookmarks.
> 4. What logic?
At least for me, there's a big logic difference between RDF triples and those
"stuffed" Netscape-"HTML"-format lines in some <DL> "tree"(!?!) structure and
strange <H3> tags.

But anyway, too much said. We're repeating arguments, as I said. And no bug
report or newsgroup is meant for endlessly repeating arguments.
Summary: Store bookmarks in RDF format → Store bookmarks in a RDF format
I can speak knowledgeably about both bookmarks as well as RDF in Mozilla.



Using the (historical) HTML format has various advantages for the browser over
using RDF.  Instead of focusing on "why HTML?" I will go over "why not RDF?"

Basically, RDF files are significantly larger on-disk and take longer to parse.


As an example, a *small* "bookmarks.html" file (containing twenty-nine bookmarks
and one folder) is 6,829 bytes. Serialized to a "bookmarks.rdf", it uses 22,221 bytes.

That's a 225% increase in on-disk size (put the other way, the HTML file is 69.3% smaller).

In terms of parsing, Mozilla can parse the HTML file in 18,802 microseconds.
To parse the equivalent RDF file takes 37,352 microseconds.

That's a 98.7% increase in parsing time (the HTML file parses in about half the time).

The numbers speak for themselves.
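
For anyone who wants to check the percentages, here is a small Python sketch that works them out from the four measurements quoted above. The function names are made up for the example. It computes each difference two ways, relative to the HTML figure (the increase when going to RDF) and relative to the RDF figure (the saving when staying with HTML), since the two are easy to mix up.

```python
# Measurements quoted in the comment above.
html_bytes, rdf_bytes = 6_829, 22_221
html_us, rdf_us = 18_802, 37_352


def increase_pct(old, new):
    """Relative increase going from old to new, as a percentage of old."""
    return (new - old) / old * 100


def saving_pct(old, new):
    """The same difference expressed as a percentage of the larger value, new."""
    return (new - old) / new * 100


size_increase = round(increase_pct(html_bytes, rdf_bytes), 1)
size_saving = round(saving_pct(html_bytes, rdf_bytes), 1)
time_increase = round(increase_pct(html_us, rdf_us), 1)
time_saving = round(saving_pct(html_us, rdf_us), 1)

print(f"size: +{size_increase}% for RDF, or {size_saving}% smaller for HTML")
print(f"time: +{time_increase}% for RDF, or {time_saving}% faster for HTML")
```

The RDF file is about 3.25 times the size of the HTML file and takes about twice as long to parse; the figures near 69% and 50% are the savings measured against the RDF numbers.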


Hardware used:
[
Machine:     Power Macintosh, dual-CPU 800 MHz PowerPC G4
RAM:         512 MB
OS:          Mac OS X 10.2.1
Hard drive:  80 GB Seagate Barracuda ATA IV, 7200 RPM, 9.5 ms seek
]


Could the RDF parser be improved?  Sure, why not.  Could the HTML parser be
similarly improved?  Sure, why not.


IMHO,  Mozilla *should* support at least exporting bookmarks as RDF for those
individuals inclined to want to externally process their bookmarks via an RDF
format. (Indeed, I've written exactly that code to help generate the numbers
above. Perhaps, with good fortune, it will make its way into the tree.)
How about it, Robert? I think Sören is a bit too busy to reply to my question and
Waterson turned off Bugzilla mails, doh. Proposing WONTFIX and opening a new bug on
exporting RDF instead. I am very passionate about RDF, but storing everything in
RDF just doesn't make sense. It's a format for exchanging data, not storing it.
Mike, *exporting* to RDF doesn't make sense. Have you met a browser that imports
RDF? Most likely not.
Sören: while it is true that no browser currently supports importing RDF, various
people have asked over time to be able to export their bookmarks to RDF for
external processing (server apps, etc).
Sören, I said export because one of the arguments people have for storing
bookmarks in RDF is so other apps could use them. I guess I added that comment
because I don't want this bug to 'morph' into an RDF exporting bug. By the way,
thanks for the measurements, Robert. (Just realised there are two Roberts; my
previous comment was referring to Robert Kaiser, sorry.) 
Hardware: PC → All
I've opened bug 177886 for tracking exporting of bookmarks to RDF.
Robert, do you really mean bug 177886 (that's this bug...)? I couldn't find the 
export to rdf bug.
Oops, I meant bug 180423
I think everyone has missed the point of the original bug (55057). The user wanted XBEL because people hope to make that a cross-platform, cross-browser standard for bookmark storage, not for performance gains. Compare to ACAP (http://asg.web.cmu.edu/acap/index.html). What users want is ONE place to store bookmarks that ALL browsers use. I use konqueror, mozilla, IE (sometimes), and phoenix. I therefore have four different sets of bookmarks. Import/Export is fine, but I hate that I can't just bookmark something and then have it be there when I open a different browser. Thus, cross-platform/browser standards (which  I realise XBEL is not, but standards are made by being supported). Maybe you will tell me to just pick one browser, but sometimes I need to use windows (to play a game) and I like konqi for when I'm upgrading/tweaking phoenix and I've nuked my install -- still need a browser, but phoenix isn't there anymore. I don't think I'm alone in using several browsers. Yes, sharing across machines would require leaving the file on a server and having the client connect to it, but on my machine where I do 99% of my work, having a single bookmark file would be a godsend. 
Ok, really sorry about the lack of wrapping. Should I use hard returns? 
(In reply to comment #23)

I fully agree. Maybe that's a task for freedesktop.org? :-)
Maybe, we should make this bug dependent on bug #11050?

http://bugzilla.mozilla.org/show_bug.cgi?id=11050#c15

Storing Mozilla's user data in a standardized way into a local SQL database
should provide a MUCH better way to share bookmarks data and other data. Even
the disk cache could be shared between KDE, Mozilla, ...
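
To sketch what a local SQL store for bookmarks might look like, here is a minimal Python example using SQLite. The table layout, column names and helper functions are hypothetical and chosen only for illustration; they do not correspond to any schema Mozilla or KDE actually uses.

```python
import sqlite3

# Hypothetical schema for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS bookmarks (
    id     INTEGER PRIMARY KEY,
    parent INTEGER REFERENCES bookmarks(id),  -- NULL for the root folder
    title  TEXT NOT NULL,
    url    TEXT,                              -- NULL for folders
    added  INTEGER                            -- seconds since the Unix epoch
)
"""


def open_store(path=":memory:"):
    """Open (or create) a bookmark database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_item(conn, title, parent=None, url=None, added=None):
    """Insert a folder (url=None) or a bookmark and return its row id."""
    cur = conn.execute(
        "INSERT INTO bookmarks (parent, title, url, added) VALUES (?, ?, ?, ?)",
        (parent, title, url, added),
    )
    conn.commit()
    return cur.lastrowid


def children(conn, folder_id):
    """Return (title, url) pairs for the direct children of a folder."""
    rows = conn.execute(
        "SELECT title, url FROM bookmarks WHERE parent = ? ORDER BY id",
        (folder_id,),
    )
    return rows.fetchall()


conn = open_store()
root = add_item(conn, "Bookmarks")
dev = add_item(conn, "Developer Information", parent=root, added=961102203)
add_item(conn, "Mozilla", parent=dev, url="http://www.mozilla.org/")
add_item(conn, "Bugzilla", parent=dev, url="http://bugzilla.mozilla.org/")

print(children(conn, dev))
```

With a file path instead of ":memory:", any program that links an SQLite library could open the same file, which is the kind of sharing the comment above has in mind. Concurrent access from several applications would need more care than this sketch shows.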
Product: Browser → Seamonkey
Assignee: bugs → nobody
QA Contact: claudius → bookmarks
Places (and therefore SQLite) is the new storage medium.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → WONTFIX