Reader mode skips <h1> headers
Categories
(Toolkit :: Reader Mode, defect, P3)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox86 | --- | fixed |
People
(Reporter: hub, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [reader-mode-readability-algorithm])
Comment 1•8 years ago
Updated•8 years ago
Comment 4•8 years ago
Comment 8•7 years ago
Comment 9•7 years ago
Comment 10•6 years ago
I'm curious, why don't you just ditch the <title> contents instead?
After all, the top <h1> is also what a user would see in the content area at the top in regular mode. <title> often has unnecessary details/SEO things like "XYZ ARTICLE - blah site", while <h1> is often more succinct and/or has a more detailed subtitle. Maybe a leading <h1> should just take <title>'s place in general, even if there's just one <h1>, and even if it has similar text to <title>. (The exception being when there is no leading <h1> or anything else that is clearly the heading; then it seems very reasonable to fall back to <title>.)
Comment 11•6 years ago
(In reply to jonas from comment #10)
> I'm curious, why don't you just ditch the <title> contents instead?
> After all, the top <h1> is also what a user would see in the content area at the top in regular mode. <title> often has unnecessary details/SEO things like "XYZ ARTICLE - blah site", while <h1> is often more succinct and/or has a more detailed subtitle.
We actually take the title displayed in reader mode from metadata when present, which is more reliable than either <title> or the top <h1>. Where we do rely on <title>, there's code to strip out the SEO stuff.
> Maybe a leading <h1> should just take <title>'s place in general, even if there's just one <h1>, and even if it has similar text to <title>. (The exception being when there is no leading <h1> or anything else that is clearly the heading; then it seems very reasonable to fall back to <title>.)
Then we'd do the wrong thing in cases like this, right?
<title>Firefox is awesome</title>
<h1>NiceTown Local Paper</h1>
<h2>Firefox is awesome</h2>
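To make the failure concrete, here is a minimal sketch of the "first <h1> wins, else fall back to <title>" rule proposed in comment 10, run against the markup above. It is not Reader Mode's actual code: `HeadingCollector` and `leading_h1_title` are hypothetical names, and the sketch simplifies "leading <h1>" to "first <h1> in the document".

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of <title> and every <h1>..<h6> in document order."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.title = None
        self.headings = []   # (tag, text) tuples in document order
        self._open = None    # tag whose text is currently being captured
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in self.HEADINGS:
            self._open = tag
            self._buf = []

    def handle_data(self, data):
        if self._open is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._open:
            return
        text = "".join(self._buf).strip()
        if tag == "title":
            self.title = text
        else:
            self.headings.append((tag, text))
        self._open = None


def leading_h1_title(html):
    """Hypothetical rule from comment 10: first <h1> wins, else <title>."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    for tag, text in parser.headings:
        if tag == "h1":
            return text
    return parser.title


page = """
<title>Firefox is awesome</title>
<h1>NiceTown Local Paper</h1>
<h2>Firefox is awesome</h2>
"""
print(leading_h1_title(page))  # prints "NiceTown Local Paper"
```

Under that rule the site name is picked as the article title, while the real headline sits in the <h2>.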
Comment 12•6 years ago
Well yes, and the same goes for when <h1> has more information or a more detailed subtitle, which I sometimes do on my own pages (the SEO and more general stuff in <title> eats up space, so I put briefer titles there).
So yes, there are cases where this is a bad idea, but I still think the best idea is just to ALWAYS show <h1> if available. Again, after all, it is what is usually at the top as a heading in the content area, which I'd say is usually for a reason. The <title> or metadata only makes sense to me if there is no obvious alternative like a clear <h1> available.
Comment 13•6 years ago
I'm just thinking, couldn't you compute a similarity score from similar length and similar words, using some common word-level edit distance metric? And then only throw out the <h1> if it's relatively similar to the title, and not, say, twice the length with way more info?
Surely there must be some heuristic that, while it will certainly still fail in some corner cases, at least does the common cases more justice than just dropping the <h1> without ever even looking at its contents.
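A rough sketch of what such a heuristic could look like, purely as an illustration of the idea in this comment and not the fix that eventually landed. The function name `should_drop_h1` and both thresholds are assumptions picked for the example. It uses `difflib.SequenceMatcher` over word lists as a stand-in for a true word-level edit distance.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercased word tokens, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def should_drop_h1(h1, title, min_similarity=0.75, max_length_ratio=1.5):
    """Drop the <h1> only if it largely duplicates the title.

    Both thresholds are arbitrary illustration values (assumptions),
    not tuned against real pages.
    """
    h1_words, title_words = words(h1), words(title)
    if not h1_words or not title_words:
        return False
    # Word-level similarity in [0, 1]; 1.0 means identical word sequences.
    similarity = SequenceMatcher(None, h1_words, title_words).ratio()
    # Keep an <h1> that is much longer than the title: it likely adds info.
    length_ratio = len(h1_words) / len(title_words)
    return similarity >= min_similarity and length_ratio <= max_length_ratio


# Near-duplicate heading: safe to drop.
print(should_drop_h1("Firefox is awesome", "Firefox is awesome"))
# Unrelated heading (e.g. a site name): keep it.
print(should_drop_h1("NiceTown Local Paper", "Firefox is awesome"))
# Much longer heading with extra detail: keep it.
print(should_drop_h1(
    "Firefox is awesome: ten reasons to switch browsers today",
    "Firefox is awesome",
))
```

With these thresholds the first call returns True and the other two return False, so only a heading that repeats the title is dropped.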
Comment 14•4 years ago
Fixed by bug 1685571.