Closed Bug 12579 Opened 25 years ago Closed 25 years ago

Implement jar: protocol (tracking bug)

Categories

(Core :: Networking, defect, P3)

All
Other
defect

VERIFIED FIXED

People

(Reporter: warrensomebody, Assigned: warrensomebody)

References

Details

(Keywords: perf, Whiteboard: [pdt+])

We need jar: protocol support for several reasons:
- performance (accessing lots of small files from a jar should be faster)
- downloadable chrome
- general product packaging

There are a couple of pieces to this:
- implementing jar: URLs (page 16 of the JavaHelp spec <http://java.sun.com/products/javahelp/spec-1.0.pdf>)
- implementing the jar protocol handler (to read and write entries -- might look a lot like the file: protocol)
- upgrading the current tree to make use of jar files, fixing xul, makefiles, etc. (need a separate bug for tracking this)
Target Milestone: M11
Blocks: 12833
Whiteboard: [Perf]
Putting on [Perf] radar.
Blocks: 12838
No longer blocks: 12833
*** Bug 4707 has been marked as a duplicate of this bug. ***
Target Milestone: M11 → M13
Assignee: valeski → mstoltz
At this point, I'm writing the protocol handler and Gayatri is writing the jar:URL parser.
Here are some thoughts on how I think some of this should look...

URL parsing -- The jar: URLs should probably have 2 instance variables that point to other URLs, i.e.:

  class nsJARURL : public nsIURL {
    ...
    nsCOMPtr<nsIURI> mJARPath;
    nsCOMPtr<nsIURI> mRelativePath;
  };

where the syntax is "jar:<mJARPath>!/<mRelativePath>". In other words -- we shouldn't need to write a complicated parser here, just something that looks for "jar:" and "!/" and delegates the rest to nsStdURL.

JAR protocol handler -- I'm about to check in changes to nsFileTransport that abstract out the part that really opens/closes/reads/writes files from the state machine that deals with asynchronous activity, suspend/resume/cancel, etc. You should be able to use this to implement the JAR protocol. Basically, there will be a new interface, nsIFileSystem, that you'll have to map to libjar. Once you do that, I think it will just work. I'll let you know when it goes in (today hopefully).

Please come by if either of you have any questions. I don't want us to work in a vacuum on this stuff. We should try to keep in close contact to make the quickest progress.

Thanks for doing this!

Warren
Just checking, but is <mJARPath> a URL in its own right? i.e. "jar:http://foo.com/my.jar!/some/stuff.html"? The original Sun spec we saw on this could handle archives embedded in other archives. We may not want to figure that out now, but let's use a non-greedy search for !/ so we can support more later.

Moving resources into archives may be a space win (especially on FAT drives, even without compression), but it's not going to be a performance win if for each requested resource we open the archive, parse the directory, and *then* serve the data. We need some way to keep these archives open, and I don't know how we can do that using stateless URL syntax.

Maybe this could be part of the magic of the "chrome:" protocol. Rather than have to change all existing chrome: URLs, we just add to the magic: where chrome: already adds "default" or local directories, it could open archives and eventually close them when the chrome system shuts down. Not sure how that plays with the jar: protocol idea, though.
Adding this functionality to chrome: is a less general solution, since other code besides UI could make use of a jar: protocol. In particular, I need it to implement signed scripts. Since in the first iteration, using libjar as is, we need to download a jar file to disk before extracting files from it, there should be a way to keep ahold of that file on disk for subsequent calls to that URL. This will improve performance.
Sorry, I didn't mean we should give up on the "jar:" protocol -- I really want that to try some cool ideas I have floating around. I just meant that a "jar:" protocol *by itself* is not going to solve the perceived chrome performance problem which I assumed was the impetus for this bug given the [perf] status. Maybe part of that solution would be for libjar (at a level below the protocol handler) to keep a refcounted table of open archives, and return references to them. That way, for example, simultaneous browser windows wouldn't cause the main resource .jar to be opened multiple times with the associated memory cost to hold the directory structure. Then the chrome/XUL system could open and keep a reference to its main archives (using nsIZip directly), and then the jar: protocol will find the archives already opened, saving much time. For other uses jar: still works, just isn't as optimized.
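To make the refcounted-table idea above concrete, here is a minimal sketch in standard C++. It is not libjar or nsIZip code: OpenArchive, ArchiveTable, Acquire and OpenCount are hypothetical names invented for illustration, and std::shared_ptr stands in for XPCOM reference counting. It is also not thread-safe.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an opened archive. A real implementation would
// wrap libjar / nsIZip and hold the parsed directory structure.
struct OpenArchive {
    explicit OpenArchive(const std::string& aPath) : path(aPath) {}
    std::string path;
};

// Hypothetical table of open archives, keyed by archive path.
// The table holds weak references only, so an archive is closed as soon as
// its last user drops it; the stale map entry is overwritten on the next open.
class ArchiveTable {
public:
    // Returns the already-open archive for aPath, or opens it once.
    std::shared_ptr<OpenArchive> Acquire(const std::string& aPath) {
        auto it = mOpen.find(aPath);
        if (it != mOpen.end()) {
            if (auto existing = it->second.lock()) {
                return existing;  // shared: no second open, no second directory
            }
        }
        ++mOpenCount;  // counts real opens, for illustration only
        auto archive = std::make_shared<OpenArchive>(aPath);
        mOpen[aPath] = archive;
        return archive;
    }

    int OpenCount() const { return mOpenCount; }

private:
    std::map<std::string, std::weak_ptr<OpenArchive>> mOpen;
    int mOpenCount = 0;
};
```

Under these assumptions, the chrome/XUL system would call Acquire once at startup and hold the returned reference; later jar: lookups for the same archive then find it already open.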
> Just checking, but is <mJARPath> a URL in its own right? i.e.
> "jar:http://foo.com/my.jar!/some/stuff.html"?

Yes, and I assume you can have cases like this too:

  jar:jar:http://foo.com/my.jar!/other.jar!/stuff.html

This would imply to me that we have to look for the "!/" starting from the end and working backwards.

Also note that we'll most likely need to pull the jar file down to the local drive so that we can access it with our libjar code. We can short-circuit this for the case where the "mJARPath" part is a file: URL, but in general this means some sort of cache management code to clean up jar files after they're no longer needed (whenever that is). Whoever does this should sync up with Scott Furman to determine whether any of his network cache services will be useful in implementing this.

Re. chrome URLs: I have this idea in the back of my mind that they can go away, being replaced by a more general search-path-based approach, probably subsumed by some change to resource: URLs. E.g. chrome://foo/bar.xul might become something like resource://ChromeDirs/foo/bar.xul. Obviously, jar files are one place on the search path you might want to look, so this resource: URL could again get translated into jar:file://<exe-dir>/chrome.jar!/foo/bar.xul.
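The backwards search for "!/" can be sketched as follows. This is only an illustration using std::string, not the nsJARURL implementation: JarUrlParts and SplitJarUrl are hypothetical names, and the real class would hand both pieces to nsStdURL rather than keep raw strings.

```cpp
#include <string>

// Hypothetical value type mirroring the two parts discussed in this bug.
struct JarUrlParts {
    std::string jarPath;    // URL of the archive; may itself be a jar: URL
    std::string entryPath;  // path of the entry inside that archive
};

// Splits "jar:<jarPath>!/<entryPath>" at the LAST "!/", so that any nested
// archive reference stays inside jarPath. Returns false (leaving aOut
// untouched) if aUrl is not a well-formed jar: URL.
bool SplitJarUrl(const std::string& aUrl, JarUrlParts& aOut) {
    const std::string scheme = "jar:";
    const std::string separator = "!/";

    if (aUrl.compare(0, scheme.size(), scheme) != 0) {
        return false;  // does not start with "jar:"
    }
    const std::string::size_type sep = aUrl.rfind(separator);
    if (sep == std::string::npos || sep <= scheme.size()) {
        return false;  // no "!/", or nothing between "jar:" and "!/"
    }
    aOut.jarPath = aUrl.substr(scheme.size(), sep - scheme.size());
    aOut.entryPath = aUrl.substr(sep + separator.size());
    return true;
}
```

Calling SplitJarUrl again on the resulting jarPath peels off the next layer, so the nested example splits first into the inner jar: URL plus stuff.html, and then into http://foo.com/my.jar plus other.jar.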
Status: NEW → ASSIGNED
Blocks: 16654
Summary: Implement jar: protocol → [dogfood]Implement jar: protocol
Whiteboard: [Perf] → [Perf][PDT+]
Added dogfood to label along with PDT+ annotation. We noted that it was lower level infrastructure, and that it was listed as a high priority bug in a status report. If we are confused about this being dogfood, please email the PDT alias. Thanks.
I do not believe this is required for dogfood at all. Chrome will work fine without needing to be in jars. We may want it for beta, but we certainly don't need it for dogfood. I'd go so far as to say it may not even be needed in the shipping product for chrome... we should do it if we think it's a performance win, but if we get fast enough without it, then I wouldn't advocate doing the extra work just for chrome. If there are other reasons why we need it, then that's cool. I just wanted to give the chrome perspective.
Note that jar: URLs are needed for smart update. The resource/chrome URL idea is separate. However, I think that this sort of architectural detail should be ironed out now before we get too far down the road and entrenched with chrome URLs. I intend to investigate to see if there's something better/more general that we can do here.
Apparently, the large number of small files causes massive disk bloat on the Mac because the large filesystem block size is inefficient for small files. It's been reported that a Mac Mozilla install requires 220M on disk. So, even for chrome, I would hazard that jar support is a requirement in the shipping product. I don't see why it should be considered necessary for dogfood, however.
I agree that it's not dogfood, but I do think it's a porkjockeys/architectural/beta feature.
Yeah, sounds like we need it for the shipping product to reduce bloat at the very least.
Summary: [dogfood]Implement jar: protocol → Implement jar: protocol
I second that. We really don't want to ship a product with hundreds of .xul files, that's sloppy. I am removing the dogfood label, since this is not crucial for dogfood.
Whiteboard: [Perf][PDT+] → [Perf] needed for Beta
jar: URLs are not required for XPInstall (smartupdate). We need ZIP archive support, but we have enough in nsIZip already. Clearing the PDT status to match the clearing of Dogfood.
Blocks: 16950
Assignee: mstoltz → gayatrib
Status: ASSIGNED → NEW
Reassigning to myself.
Status: NEW → ASSIGNED
Blocks: 17432
Blocks: 17907
Blocks: 18433
Depends on: 18434, 18435
Blocks: 18471
Blocks: 18951
Blocks: 20203
Blocks: 21564
Bulk move of all Necko (to be deleted component) bugs to new Networking component.
Keywords: perf
Bulk add of "perf" to new keyword field. This will replace the [PERF] we were using in the Status Summary field.
Gayatri...is this feature stable enough to declare it FIXED? What else needs to be done?
Target Milestone: M13 → M14
Moving milestone to M14. I need to get some testing done.
Dan Veditz needs to let me know some information regarding cancelling a jar installation while extraction is going on. So I will file that as a separate bug and try to close the jar protocol asap.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
This is still missing the cancel method implementation. Opening that as a separate bug. The rest of the functionality is fine. To test it:
1) Please note that when you use local jar files, you need to type file:/// and not file:// in the browser window. An example local jar url would be something like:
   jar:file:///h:/testHtml.zip!/hello1.html
2) Example urls to test http and ftp downloads are:
   jar:http://www.relisoft.com/source.zip!/readme.txt
   jar:ftp://ftp.mall.net/wham131.zip!/wham.txt
   These are small files and can be easily downloaded and tested.
No longer blocks: 18433
Depends on: 18433, 24338
No longer depends on: 18433
Blocks: 18433
This really isn't done until all the dependencies are done. Reopening.
Status: RESOLVED → REOPENED
Depends on: 24765
No longer depends on: 24765
Depends on: 24765
Depends on: 24764
Clearing FIXED resolution due to reopen.
Resolution: FIXED → ---
On behalf of PorkJockeys: putting on beta1 radar, per beta criteria priority #2 - performance (esp. file I/O on Mac) is not within beta metrics. removing extraneous tags, cc waterson
Keywords: beta1
Whiteboard: [Perf] needed for Beta
what parts of jar protocol are working, and which ones not? does brutal sharing help?
Per warren: "Look at the dependencies, or more importantly, the dependencies of http://bugzilla.mozilla.org/showdependencytree.cgi?id=18433."
Per phil: "Warren, I thought part of the performance win was that we wouldn't need to open and close a zillion little xul, js, and dtd files in order to paint the chrome. But now that we have brutal sharing, we probably only do that once, which means less performance benefit to jar files. Or am I off base?"
I don't know about a performance benefit, but a Nav-only install on a FAT drive takes over 50MB because of all the small files.
Mac (HFS) has the same problem. Additionally, this feature has regressed: jar:http no longer works. I see 'Shortcut+' in the debug output, and then nothing happens, it fails silently. jar:file still works on Linux.
Whiteboard: [pdt+]
Clarification: as of today, jar:http fails silently on all platforms. jar:file works on all platforms.
Works for me. I tried this: jar:http://www.boulderdesign.com/Bin.zip!/chrome/global/content/default/about.html The initial try takes a long time to download (because it's big), the second one is really fast because it comes out of the jar cache. Mitch: Are you sure you were giving a valid jar: URL?
Warren, I tried this. It does not work for me as of now. I am trying to debug into it. What happens is that it goes into http::OpenInputStream(), then comes into nsJARDownloadObserver::OnStopRequest(), and returns from there. Then the browser url becomes the http site without the jar part and that's about it. Nothing happens after that. I am trying to debug this--just keeping you posted. I tried both with your url and also with: jar:http://www.relisoft.com/source.zip!/readme.txt (this is only a 28kb file--so the download should be pretty fast)
Works for me (jar:http://www.relisoft.com/source.zip!/readme.txt), on both my machines. What are you doing differently? Are you up to date?
This has become a tracking bug, the fix was already checked in. Moving to Warren. Selmer from Gayatri's machine.
Assignee: gayatrib → warren
Status: REOPENED → NEW
Summary: Implement jar: protocol → Implement jar: protocol (tracking bug)
I believe the actual jar: protocol is done enough for beta1, and keeping this as a tracking bug has become confusing so I'm going to close it. The remaining tasks are to use the cache manager to manage downloaded jar files (bug 24765) which depends on stream-as-file (bug 21250), and also implementing an open jar file cache for performance (bug 24764).
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
tracking bug, marking verified
Status: RESOLVED → VERIFIED
No longer blocks: 17432
No longer blocks: 17907
No longer blocks: 18471
No longer blocks: 18951
No longer blocks: 20203
No longer blocks: 21564