Closed Bug 66334 Opened 24 years ago Closed 24 years ago

view source needs to be a protocol handler

Categories

(Core Graveyard :: Embedding: APIs, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9

People

(Reporter: jud, Assigned: chak)

References

Details

Attachments

(9 files)

Currently view-source is implemented by putting the docshell into a view-source mode and some other trickery. we should have a view source protocol handler ("view-source") which fetches url data (hitting the cache if possible) and returns the content type as text/plain (thus it will not be treated as html). This will remove the view-source hacks currently exposed, and then anyone can "view source."
Jud, One benefit of the current "trickery" is that it can embellish the source with color/highlighting and other nice things. That got turned off or removed at some point, but it used to render the source much prettier. I'm not sure you could do that in a protocol handler (it used to be in the parser because it required some knowledge of html source coding conventions). How about a generalized mechanism that let's one override the content-type? Then we could offer features that let the user force a given URL to be displayed as plain text (or plain text as html, whatever), in cases where the server is misconfigured/confused?
> view source protocol handler ("view-source") Is that an officially approved scheme? If not, please use the private namespace. > which fetches url data (hitting the cache if possible) and For the record: Please pay attention not to worse bug 55583 (i.e. don't make it harder to fix that bug). > How about a generalized mechanism that let's one override the content-type? Nice idea.
OS: Windows 2000 → All
bill: colorization would likely be lost unless we passed some type of cookie around in the content type. maybe view-source would be "text/plain/source" or something. the parser (or whoever's doing the colorizing) could key off of that instead. Ben: pick a scheme out of a hat. IMO, if we do colorizing (lower priority) based on content type, and make view-source a protocol handler, 55583 gets fixed for free.
> text/plain/source Use paramemters: type/plain formatting="source" (or similar, not sure about the exact syntax)
you're right. sorry: "text/plain; fmt=source"
Not quite. Read RFC 2646. If we need a proprietary attribute for this value, we should probably prefix it with something like "moz-". (I don't know of any convention for naming vendor-specific values to media type attributes.)
cc to self
Is there some reason why we couldn't just emit text/html to support colorized text? It seems as if it should be up to the view-source protocol handler, whether it just spit out the source as 'text/plain' or ran it through the parser to gather more contextual info and generated text/html to 'pretty print' the source... -- rick
Hardware: PC → All
hmmm, wouldn't that modify the source itself (whitespace, etc) so a "save as" wouldn't work well?
I interpreted Jud's initial comment to mean that he wanted to replace *all* the special view-source "trickery" with a new protocol handler. I just wanted to point out that not all the trickery (code for which might still be there, for all I know) could go into the protocol handler because it would have to know about HTML syntax, etc. But now it seems that some "trickery" will remain outside the protocol handler (if necessary, to do any special formatting). So the issue boils down to: 1. Do we move the code that directs the parser/layout to do "view-source" (whatever that might entail) from the docshell to a new protocol handler? 2. If so, how do we implement that protocol handler (i.e., how does that protocol handler then drive the parser/layout to render the source)? Is that a reasonable analysis? The reason for asking ourselves the first question has not been stated explicitly. I suspect it has something to do with the fact that nsIDocShell isn't publicly "exposed" and therefore embedding clients wouldn't have access to "view-source" if it were provided via that interface. Is that correct? Using a protocol handler instead of the docshell view mode sounds reasonable. But I wonder if it won't lead to some unpleasant complications. For example, what if one is viewing text/plain and chooses to "view-source" on that? It seems like the original content type would still be pertinent, no? It looks as if the doc shell already supports "view-source:" as a prefix on the URL (http://lxr.mozilla.org/seamonkey/source/docshell/base/nsDocShell.cpp#3090). Why not just leave that alone and pretend we've already implemented the new protocol handler? Lastly, I don't quite see how resolution of this impacts bug 55583. That bug is about the fact that we don't always keep the page source around (and even if we did, there's no way to ask for that saved version to be loaded). Jud, are you implying that the "view-source:" protocol handler would take advantage of some new and otherwise unavailable Necko mechanism to get at the always saved source? If not, then I think we'd still be stuck in the same predicament. Namely, that sometimes viewing the page source hits the server again.
bill: 1. yes. 2. all it does is pass the source directly into the parser as text/plain, it doesn't do any colorization (we don't support this anyway), or source manipulation. it's is simply _view-source_. Your assessment of why we'd want to do this is correct (among other things). text/plain view source == text/plain view source. they're the same thing. it impacts 55583 in that if we never modify the source, then it's obviously the *plain* source, and 55583 is nullified. as a protocol handler, it could leverage the copy that necko cached. this would imply we always had a cache hit for view-source which may not be valid. I guess that would indicate that the parser should be able to re-generate (or keep a copy around) the original source file. hmmm, maybe the view-source handler (or someone capturing view-source: urls) just tells the parser to cough up the original document. maybe each docshell maintains a copy of the original, top level, content it's displaying, and it calls docshell::loadstream() with that content when we want to view source?
valeski, the cache does not always hold a copy. That's the problem in that bug (and similar SaveAs etc. bugs).
I assume that save-as is an orthogonal issue... I don't think that you would want to save the 'view-source:http://foo...' URL. Rather, you would want to save the 'http://foo...' URL instead.
save-as view-source:url sounds reasonable to me, if that means that a normal save-as url results in saving rendered source. maybe we need to resurrect the wysiwyg protocol. I really should learn how to convert my modification of the dom dumper into a js protocol handler.
See also bug 15372.
Hmmm... catching up on this bug but hasn't pollmann already written viewsource? :)
Who, me? The only protocol I've written is wyciwyg, and that's not checked in yet... :)
doh! wrong number.
Blocks: 70229
Marking future for the moment.
Target Milestone: --- → Future
-> chak who's working on this. I'm resetting the milestone to -- so chak can assign a new one.
Assignee: adamlock → chak
Target Milestone: Future → ---
Blocks: 72836
Setting target milestone
Target Milestone: --- → mozilla0.9
I've been looking at fixing this bug and noticed that some of the comments in this bug are out of sync with today's state of things as far as colorization is concerned. We do have code to display colorized source, so it does not make sense to move it to the realm of a view-source protocol handler. Here's the general approach i'm taking: 0. Remove the code which checks for the "view-source:" prefix on URLs in nsDocShell.cpp and puts itself into a view-source mode. We'll use a protocol handler instead. 1. Write a new viewsource protocol handler which internally uses http to fetch the page source 2. Initially the view-source proto handler was giving out a mime type of the form: "<orig_mime_type>; x-disp-fmt=view-source" where <orig_mim_type> = the original mime type the HTTP channel gave us, so this could be text/html,text/xml or whatever. The intent here was not to lose the original mime type. Doing this i discovered that we need to change code in a lot of places to handle document/viewer creation etc. for this "new" mime type. (I could use some help here on how to fit this new mime type into the existing scheme of things without changing too much of what we already have) 3. Instead of emitting a modified mime type (of the form "text/html; x-disp-fmt=view-source") the view source protocol handler just returns the original mime type as it got from the http channel. Since the docshell needs to differentiate between the text/html which came from a view-source protocol handler Vs the regular http handler i do the following (this is mainly needed to determine whether to create a view or a view-src viewer): // in nsDocShell.cpp... nsresult nsDocShell::NewContentViewerObj(const char* aContentType, nsIRequest *request, nsILoadGroup* aLoadGroup, nsIStreamListener** aContentHandler, nsIContentViewer** aViewer) { // We need to create a view-source type viewer if we can get // nsIViewSourceChannel from the request nsCOMPtr<nsIViewSourceChannel> viewSrcChannel = do_QueryInterface(request); if(viewSrcChannel) mViewMode = viewSource; // Once we have the proper viewMode everything we currently have will // just work // Nothing changes from here on down in the existing code char id[256]; PR_snprintf(id, sizeof(id), NS_DOCUMENT_LOADER_FACTORY_CONTRACTID_PREFIX "%s;1?type=%s", (const char*)((viewSource == mViewMode) ? "view-source" : "view"), aContentType); ... ... ... ... } Doing it this way resulted in minimal changes to the existing code. Comments?
the mime type emmitted from the view source channel should be "text/plain; <orig-type>" (rather than "<orig-type>; <some-view-source-type>") then it will work (modulo colorization of course). Text colorazation doesn't work today, so lets not try and address it here (someone file a separate bug if they want). The docshell shouldn't have any knowledge about view-source (and won't need it when the content-types are switched); view-source is just a protocol handler. if you change the mime-type around as I suggest, life will be good. You can even leave out the "; ..." part for now for simplicity, I don't think it would be propagated properly anyway, given the current usage of GetContentType() calls.
> Text colorazation doesn't work today The mozilla build i have shows colorized HTML when i do a View/PageSource. Is there more to it (w.r.t colorization) than what we currently have? > If you change the mime-type around as I suggest, life will be good. It does work when i emit text/plain as you suggest. However, whatever colorization we had earlier (before this mime type switch) is lost.
*** Bug 74855 has been marked as a duplicate of this bug. ***
*** Bug 60224 has been marked as a duplicate of this bug. ***
my NS build (week old) doesn't support colorization. if we have colorization, it's spotty, and we'll break it w/ this change. that's fine w/ me because we're changing the architecture for view-source; things break w/ arch changes.
View Source colorization has been supported for a long long time now. It's a pref, and it was off by default until some very recent builds. Try setting: user_pref("browser.view_source.syntax_highlight", true); to see this. In newer builds the pref is "view_source.syntax_highlight", is on by default, and has UI. Colorization is done by inserting <span> nodes into the document in the parser and then styling the spans. So I would guess the content-type for view source needs to be text/html for the colorization to work...
yes. the current colorization model requires content viewer interjection. we can manipulate the viewer used in the docshell to get this affect. However, the view-source protocol handler approach doesn't currently support colorization its primary goals are: 1. create view-source content that is *exactly* what the server sent back (there are bugs in the current model which manipluates the actual content by inserting various things (as you point out)). 2. remove docshell knowledge of view-source special casing (view-source "mode"). Colorization is a feature that will indeed break w/ this new model. It's give and take. We either fix the two things above w/ the new model (and break colorization), or we leave them for the sake of the defaulted off colorization. IMO, we're fixing real bugs in favor of a defaulted off "feature." regarding the last code change Chak posted in the docshell. QI'ing for the sake of determining "what" something is (and then doing nothing w/ it) is generally overkill (and introduces stub/meaningless interfaces). You should be able to make that last code fragment work by using the "<orig-type>; view-source" content type model, then GetcontentType() on the channel and look for the "view-source" string in the content type. Just a cleaner way to make the decision.
Before we go ahead and break colorization (which is on by default as of a few days ago, by the way) with this architecture change, we should consider whether there is a viable way to still do colorization once this change is made. It does not have to be the current method, but a method should still exist. The two things this architecture change fixes should indeed be fixed, but view source colorization is one of Netscape/Mozilla's best features...
>regarding the last code change Chak posted in the docshell. QI'ing for the sake >of determining "what" something is (and then doing nothing w/ it) is generally >overkill (and introduces stub/meaningless interfaces). You should be able to >make that last code fragment work by using the "<orig-type>; view-source" >content type model, then GetcontentType() on the channel and look for the >"view-source" string in the content type. Just a cleaner way to make the >decision. That's exactly what i did first (please see my posting from yesterday where i mention this and ask for help) but that check needs to be done in more than one place - i'm not really sure where all since the contenttype seems to be checked for at several locations. I know for sure it needs to be done in nsDocShell::NewContentViewerObj() and here's what i originally had: nsDocShell::NewContentViewerObj() { const char *pContentType = aContentType; // Check ContentType to see if it's a view-source type // If it's a view-source: ContentType will be of the form // // text/html; x-view-type=view-source // nsCAutoString strContentType; strContentType.Append(pContentType); char *cPtr = (char *) PL_strcasestr( strContentType.get(), "; x-view-type=view-source" ); if(cPtr) { mViewMode = viewSource; *cPtr = '\0'; pContentType = strContentType.get(); } ... ... }
after talking w/ chak about this. we'll go the protocol handler route and emit the "<orig-type>; view-source-foo-moniker" and create a content viewer for the load. this will maintain colorization, and leave all the content manipulation bugs. The primary gain from the view-source handler will be the removal of the "view-source:" check hack at the beginning of CreateFixupURI. it will also allow for arbitrary view-source handler replacement. if we want to address the content manipulation bugs by emittingn text/plain, we can easily throw that switch (we could make it a pref right from the outset).
Judson, that sounds like a good compromise. We could make the current view source colorization pref be the pref for this, perhaps? A question. I am not sure what it means to make view source be a protocol handler. Does this mean that it will be possible to, say, make view source open in external apps more easily? If so, then the colorization point is semi-moot -- people who want colorization can make view source open in an external app of choice that does highlighting.
we'd probably want to use a new pref name. one of the things view-source being a protocol handler gives us is mime type manipluation capability which allows content handling decisions to be made more easily/correctly. In an embedding situation, an external app could say it preferred the viewsource content type and handle it. Unfortunately, because the docshell says it will handle this type, we won't ever kick it out the true external application handling code. we could add a pref that would enable/disable mozilla view-source handling, that that would give external apps a crack at handling it instead.
After various discussions with Rick, Jud and Vidur here's the approach we're currently taking: 1. ViewSource handler emits a ContentType type of "<orig_type>; x-view-type=view-source" where <orig_type> is the original content type returned from the server to the ViewSourceChannel i.e. it will be something like "text/html", "text/xml" etc. 2. Change docshell so that it knows nothing about viewsource mode etc. 3. Change nsLayoutDLF so that it can recognnize content type of the form "<orig_type>; x-view-type=view-source" to create viewers for doing "view-source" with "<orig_type>". 4. Fix viewer, mfcembed and mozilla to use the new "view-source:" urls rather than setting the docshells mode to viewsource(note that docshell does not have the "viewMode" attribute anymore) Patches forthcoming....
Attached patch Patch to layout (deleted) — Splinter Review
Attached patch Patch to docshell (deleted) — Splinter Review
Attached patch Patch to netwerk (deleted) — Splinter Review
Attached patch Patch to mfcembed (deleted) — Splinter Review
Attached patch Patch to viewer (deleted) — Splinter Review
Blocks: 74486
Blocks: 74862
layout - * I'm not direct string buffer maniplulation is a good idea. some string impls might be sharing the buffers, and someone manipulating one, could cuase another use to fail. I'm not up to speed though. * the logic looks good though. docshell - * r=valeski netwerk - * we'll need to add the new dir to the mac. * nsViewSourceChannel::Init() - you can avoid all the stringification if you just pass the "uri" arg into nsIIOService::NewChannelFromURI(); * cancel/suspect/resume should pass down to the transport. * VSChannel::Get|SetOriginalURI() - shouldn't you be caching the channel's URI, and handing it back here? * ::GetContentType() - rather than constructing the content type evertime this is called (an arbitrary number of times), you should cobble the content type together in your :OnStartRequest() callback (the earliest possible point at which you can get at a valid content type). You should also set mContentType to UNKNOWN in your init method. * ::SetContentLenght() - pass this through to the underlying channel? viewsource.js - * r=valeski mfcembed - * r=valeski viewer - * is the aURL.get() needed? I think you can just pass in aURL. what about the other embedding apps and mozilla?
layout - > * I'm not direct string buffer maniplulation is a good idea. some string impls > might be sharing the buffers, and someone manipulating one, could cuase another > use to fail. I'm not up to speed though. > * the logic looks good though. Now we do not directly mess with the internal buffer - use the public methods which do the same: nsCAutoString strContentType; strContentType.Append(aContentType); PRInt32 idx = strContentType.Find("; x-view-type=view-source", PR_TRUE, 0, -1); if(idx != -1) { // Found "; x-view-type=view-source" param in content type. // Set aCommand to view-source aCommand = "view-source"; // Null terminate at the ";" in "text/html; x-view-type=view-source" // The idea is to end up with the original content type i.e. without // the x-view-type param was added to it. strContentType.SetCharAt('\0', idx); aContentType = strContentType.get(); //This will point to the "original" mime type } netwerk - > * we'll need to add the new dir to the mac. Will talk to Conrad > * nsViewSourceChannel::Init() - you can avoid all the stringification if you > just pass the "uri" arg into nsIIOService::NewChannelFromURI(); We can't since the URI at this stage has a "view-source:" prefix. We need to strip that before creating an http or whatever channel - this is what the keyword: channel does. > * cancel/suspect/resume should pass down to the transport. How do i do this i.e how do i get the underlying transport interface etc.? > * VSChannel::Get|SetOriginalURI() - shouldn't you be caching the channel's URI, > and handing it back here? Currently i'm not cacheing this and just redirecting these methods to the underlying channel since they already handle/cache this data > * ::GetContentType() - rather than constructing the content type evertime this > is called (an arbitrary number of times), you should cobble the content type > together in your :OnStartRequest() callback (the earliest possible point at > which you can get at a valid content type). You should also set mContentType to > UNKNOWN in your init method. Will make these changes > * ::SetContentLenght() - pass this through to the underlying channel? Done > viewer - is the aURL.get() needed? I think you can just pass in aURL. Not needed. Removed the get(). [On a side note i still have'nt got a feel for how to use these (numerous) string classes we have here. Everytime i need to do something (however simple) i 'm having to search the source for it's usage and everyone's using it their own way] > what about the other embedding apps and mozilla? Thought i covered all of them. What else does view-source?
cancel/resume/etc - * you can just call the underlying channel's version of these. you're right, you covered all the apps; sorry about that.
Updated Patches based on jud's comments forthcoming: patch.layout.1, patch.netwerk.1, patch.viewer.1 will follow...
BTW, can someone help me with adding the new "viewsource" dir for the mac since i do not have access to a mac?
- layout * r=valeski - nsViewSourceChannel.cpp * ::Init(). You can just remove + const char *protocolSpec = path.get(); + if (!protocolSpec) return NS_ERROR_OUT_OF_MEMORY; and pass "path" directly (no .get() or casting) into ->NewChannel(). That'll remove the local var. * nit. NS_WITH_SERVICE() has been depricated in favor of do_GetService(contractID, &rv); * GetContentType() changes haven't been made. - viewer * r=valeski
after talking to chak, we'll leave the contenttype stuff as is in the most recent netwerk patch (for consistencies sake (other channels do the same) and to avoid housekeeping in OnStartReq()). r=valeski on netwerk.
Leaving GetContentType() as is per Jud's comments above. However, i fixed the other two issues Jud brought up: 1. ::Init(). You can just remove const char *protocolSpec = path.get(); if (!protocolSpec) return NS_ERROR_OUT_OF_MEMORY; and pass "path" directly (no .get() or casting) into ->NewChannel(). That'll remove the local var. 2. Use do_GetService() instead of NS_WITH_SERVICE
sr=rpotts. Is the nsIViewSourceChannel interface still needed? If so, then the comment above the getOriginalContentType attribute in nsIViewSourceChannel.idl is wrong - it says that getContentType(...) returns 'text/x-view-source'
Though not currently being used by anyone i decided to leave the GetOriginalContentType() in there since we may find a use for it after the nsDLF re-org Jud's working on. I fixed the comment error in the idl file. Thanks for catching that.
These changes are now checked in. Thanks you all for all your suggestions and help in getting this resolved.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
This is nice! I should change the js console to use viewsource: instead of opening the view source window directly.
Moving API bug ownership to David Epstein.
QA Contact: mdunn → depstein
Correction: Changing QA contact for the Embed API bugs to David Epstein.
typed "view-source:http://www.xxx.com" into the URL bar. Page source views fine.
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: