Closed Bug 774914 Opened 12 years ago Closed 12 years ago

Reader Mode: Convert divs with only a <p> element child into plain <p> elements

Categories

(Firefox for Android Graveyard :: Reader View, defect, P2)

ARM
Android
defect

Tracking

(firefox16- verified, firefox17 verified, firefox18 verified)

VERIFIED FIXED
Firefox 17
Tracking Status
firefox16 - verified
firefox17 --- verified
firefox18 --- verified

People

(Reporter: aryx, Assigned: lucasr)

References

Details

Attachments

(2 files, 3 obsolete files)

Firefox for Android trunk nightly 2012-07-17, Android 4.0.4 (stock), Google Nexus S Opening http://www.computerbase.de/forum/showthread.php?t=1086596 and clicking the reader mode shows only one post of the thread displayed at the normal page, it's one in the middle: http://www.computerbase.de/forum/showthread.php?p=12377579
Blocks: reader
(In reply to Archaeopteryx [:aryx] from comment #0) > Firefox for Android trunk nightly 2012-07-17, Android 4.0.4 (stock), Google > Nexus S > > Opening http://www.computerbase.de/forum/showthread.php?t=1086596 and > clicking the reader mode shows only one post of the thread displayed at the > normal page, it's one in the middle: > http://www.computerbase.de/forum/showthread.php?p=12377579 This is probably because the readability algorithm is picking the post that has more text inside. To be honest, I wouldn't even offer reader mode in a page showing a list of posts. For the same reason that we shouldn't show reader mode option for pages like the Technology section at nytimes.com/technology
(In reply to Brian Nicholson (:bnicholson) from comment #1) > I encountered the same thing for this article: > > http://mobile.slate.com/posts/2012/06/22/ > new_law_threatens_mississippi_s_last_abortion_clinic.html This is a different case. Slate encloses each paragraph with <div class="text"> element which makes the readability algorithm think it has to pick only the div with more text. We should probably tweak Readability.js to make it convert divs with only a <p> element inside into plain <p> elements.
Summary: Reader mode shows only one post of a forum thread → Reader Mode: Convert divs with only a <p> element child into plain <p> elements
Assignee: nobody → lucasr.at.mozilla
Attachment #651904 - Flags: review?(bnicholson)
this doesn't look like a release blocker, and since there's already a patch forthcoming please just nominate for uplift when it's been on central and verified.
Comment on attachment 651904 [details] [diff] [review] Convert divs with only a <p> element child into plain <p> elements + if (e.tagName !== "DIV") + return false; We already check to see if we're in a div before calling this method, so do we need this? + if (childNodes.length !== 1) If a paragraph looks like this: <div> <p>[paragraph></p> </div> there will actually be 3 child nodes: the first whitespace Text node, the <p> element, and the second whitespace Text node. But I think we would still want to convert this to a single <p>. + return !this._getInnerText(e) && I don't understand - why check to see if the <p> has no text? I thought the problem is that slate uses paragraphs like this: <div><p>[paragraph]</p></div> and since the paragraphs are wrapped in <div>s, it's only using the <div> with the most text. If only empty <p>s are converted, these paragraphs would remain untouched, and the problem would still exist, right?
(In reply to Brian Nicholson (:bnicholson) from comment #6) > Comment on attachment 651904 [details] [diff] [review] > Convert divs with only a <p> element child into plain <p> elements > > + if (e.tagName !== "DIV") > + return false; > > We already check to see if we're in a div before calling this method, so do > we need this? This is just to make it explicit that we should only cover divs in this method. Just in case we start using this method in a different scope for some reason in the future. Not a big deal though, I can remove it (to avoid the overhead). > + if (childNodes.length !== 1) > > If a paragraph looks like this: > > <div> > <p>[paragraph></p> > </div> > > there will actually be 3 child nodes: the first whitespace Text node, the > <p> element, and the second whitespace Text node. But I think we would still > want to convert this to a single <p>. Nice catch, I forgot to cover the whitespace case here. > > + return !this._getInnerText(e) && > > I don't understand - why check to see if the <p> has no text? I thought the > problem is that slate uses paragraphs like this: > > <div><p>[paragraph]</p></div> > > and since the paragraphs are wrapped in <div>s, it's only using the <div> > with the most text. If only empty <p>s are converted, these paragraphs would > remain untouched, and the problem would still exist, right? "e" is the DIV, not the P. We don't want to convert the DIV into a P if we find things like this: <div>text text text <p>paragraph</p></div> I'll send a new patch to cover the review comments.
Comment on attachment 651904 [details] [diff] [review] Convert divs with only a <p> element child into plain <p> elements r- for new patch
Attachment #651904 - Flags: review?(bnicholson) → review-
Attachment #654155 - Flags: review?(bnicholson)
Attachment #651904 - Attachment is obsolete: true
Comment on attachment 654155 [details] [diff] [review] Convert divs with only a <p> element child into plain <p> elements let newNode = doc.createElement('p'); - newNode.innerHTML = node.innerHTML; + newNode.innerHTML = (pIndex >= 0 ? node.childNodes[pIndex].innerHTML : node.innerHTML); If we already have a <p> inside the <div>, can we just use that <p> as-is without creating a new one and avoiding the expensive innerHTML copying? Basically, just set newNode to the <p> if we're doing the replacement; otherwise, do doc.createElement() like we do now. Other than that, this looks good. r- for suggestion above.
Attachment #654155 - Flags: review?(bnicholson) → review-
Depends on: 785145
Attachment #654741 - Flags: review?(bnicholson)
Attachment #654155 - Attachment is obsolete: true
FYI: this patch needs the fix for bug 785145 in order to work properly.
Comment on attachment 654741 [details] [diff] [review] Convert divs with only a <p> element child into plain <p> elements This patch looks good, but can you explain the dependency on bug 785145? I tried this patch by itself and didn't notice any problems without bug 785145 applied.
Attachment #654741 - Flags: review?(bnicholson) → review+
Attachment #654841 - Flags: review?(bnicholson)
Attachment #654741 - Attachment is obsolete: true
Attachment #654841 - Flags: review?(bnicholson) → review?(mark.finkle)
Attachment #654842 - Flags: review?(bnicholson) → review?(mark.finkle)
Attachment #654842 - Flags: review?(mark.finkle) → review+
Attachment #654841 - Flags: review?(mark.finkle) → review+
Comment on attachment 654842 [details] [diff] [review] /785145 - Convert divs with only a <p> element child into plain <p> elements [Approval Request Comment] User impact if declined: Pages following the same pattern than mobile.slate.com (enclose all Ps inside DIVs) will show incomplete content in Reader Mode. Testing completed (on m-c, etc.): Local tests in a long list of websites, no regressions found. Risk to taking this patch (and alternatives if risky): Very low as it only applies to websites following this specific pattern. String or UUID changes made by this patch: None.
Attachment #654842 - Flags: approval-mozilla-aurora?
Comment on attachment 654841 [details] [diff] [review] Ensure newNode is orphan before replacing a node in JSDOMParser [Approval Request Comment] User impact if declined: Pages following the same pattern than mobile.slate.com (enclose all Ps inside DIVs) will show incomplete content in Reader Mode. Testing completed (on m-c, etc.): Local tests in a long list of websites, no regressions found. Risk to taking this patch (and alternatives if risky): Very low as it only applies to websites following this specific pattern. String or UUID changes made by this patch: None.
Attachment #654841 - Flags: approval-mozilla-aurora?
No longer blocks: 777557
No longer depends on: 785145
Priority: -- → P2
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 17
Comment on attachment 654841 [details] [diff] [review] Ensure newNode is orphan before replacing a node in JSDOMParser [Triage Comment] Low risk, mobile only, approving for Beta.
Attachment #654841 - Flags: approval-mozilla-aurora? → approval-mozilla-beta+
Attachment #654842 - Flags: approval-mozilla-aurora? → approval-mozilla-beta+
I cannot reproduce this issue for http://mobile.slate.com/posts/2012/06/22/new_law_threatens_mississippi_s_last_abortion_clinic.html on the latest Nightly, Aurora and Beta builds. Closing bug as verified fixed. -- Device: Galaxy Note OS: Android 4.0.4
Status: RESOLVED → VERIFIED
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: