Open Bug 322167 Opened 19 years ago Updated 2 years ago

Title chopping from bug 319004 might split a character in the middle

Categories

(Core :: XPCOM, defect)

defect

Tracking

()

REOPENED

People

(Reporter: ajschult784, Unassigned)

Details

Attachments

(2 files)

------- Comment #61 From Christian Biesinger (:bi) (reviews slower than usual) 2005-12-17 12:58 PST [reply] ------- + nsAutoString titleString(StringHead(aTitle, HISTORY_TITLE_LENGTH_MAX)); this might split a character in the middle :-/ (a non-BMP one)
Even a string made purely of BMP chacters can be problematic because Unicode-character boundaries are not grapheme boundaries. In the absence of grapheme boundary checking API, perhaps we may chop it at the last 'space-like' (not necessarily space-like but any 'non-base' character) within the max length limit. Aha,,, if this is only about history corruption, then we just have to modify |StringHead(str, maxlen)| to rewind one PRUnichar and return one 'unit' fewer than maxlen if the last character happens to be a 'high surrogate character'.
Right, no need to be fancy, just have a valid string. I have a patch that should work, but I can't seem to create a page with a title that would trigger the problem. I made a page with the synthetic title: <title>&#xD955;&#xDDDD;&#xD955;&#xDDDD;&#xD955;&#xDDDD;</title> but IS_HIGH_SURROGATE(titleString.Last()) returns false. How can I make a title with BMP characters?
Surrogates in numeric entitities are treated as invalid. Try <title>&#x10451;&#x10472;&#x10451;&#x10469;&#x10464;</title>
Attached patch patch (deleted) — Splinter Review
thanks smontagu, that worked.
Attachment #207476 - Flags: superreview?(darin)
Attachment #207476 - Flags: review?(smontagu)
Attachment #207476 - Flags: superreview?(darin) → superreview+
Attachment #207476 - Flags: review?(smontagu) → review+
FIXED
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
ugh, this is unhappy now with an empty title (like for about:blank). nsTSubstring::Last reads memory from before the start of the empty buffer. A debug build asserts.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch patch (deleted) — Splinter Review
This makes it kick in only if it actually cut the string.
Attachment #207587 - Flags: superreview?(darin)
Attachment #207587 - Flags: review?(darin)
hmm, that makes me wonder if ...DependentSubstring shouldn't be doing this internally rather than callers having to worry about it. We never want to return a substring that isn't legal UTF-16, right?
Maybe you should use a more explicit conditional e.g. if (length < MAX) newstring = oldstring; else if (oldstring[MAX - 1] is surrogate) newstring = head(oldstring, MAX - 1); else newstring = head(oldstring, MAX);
Please check this in, even if you don't have reviews yet. The tree is orange, and really should be closed because of it.
Comment on attachment 207587 [details] [diff] [review] patch this is in the trunk
(In reply to comment #8) > hmm, that makes me wonder if ...DependentSubstring shouldn't be doing this > internally rather than callers having to worry about it. We never want to > return a substring that isn't legal UTF-16, right? I agree. That's what I meant in the 2nd paragraph of comment #1 with 'last character' replaced by 'last 16-bit unit'
Attachment #207587 - Flags: superreview?(darin)
Attachment #207587 - Flags: superreview+
Attachment #207587 - Flags: review?(darin)
Attachment #207587 - Flags: review+
Moving to XPCOM to figure out if there's anything more to do in DependentSubString, it's possible it has been fixed in the meanwhile. Otherwise this was RESO/FIXED years ago (patches landed but it was never closed)
Component: History: Global → XPCOM

The bug assignee didn't login in Bugzilla in the last 7 months, so the assignee is being reset.

Assignee: ajschult784 → nobody
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: