Closed
Bug 100214
Opened 23 years ago
Closed 23 years ago
xpcom depends on unicharutil for case-insensitive unicode
Categories
(Core :: XPCOM, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla0.9.6
People
(Reporter: dougt, Assigned: alecf)
References
Details
Attachments
(10 files, 9 obsolete files)
(deleted),
patch
|
jag+mozilla
:
review+
sfraser_bugs
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
netscape
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
sspitzer
:
review+
sfraser_bugs
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jag+mozilla
:
review+
scc
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jag+mozilla
:
review+
scc
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jag+mozilla
:
review+
scc
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
scc
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
netscape
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jag+mozilla
:
review+
scc
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jag+mozilla
:
review+
scc
:
superreview+
|
Details | Diff | Splinter Review |
Assignee | ||
Comment 1•23 years ago
|
||
I started hacking on this a bit.
What I've discovered is that the nsCRT routines are the only code that requires
unicharutil. My original plan was to simply remove the case-oriented PRUnichar*
routines from nsCRT, and switch all consumers to the ns*String routines.
But I hit a snag: the ns*String routines use some of the nsCRT routines, in
bufferRoutines.h, that I wanted to eliminate! I figured I'd change THOSE
consumers to use unicharutil, which the string library already depends on. So
string is still just as dependent, but xpcom is free of this dependency.
Here's my plan, already underway
- remove case-oriented routines that operate on PRUnichar's from nsCRT, and fix
consumers
- in the string library, switch the consumers of the previous routines over to
unicharutil using the CaseConvert routines that already exist
I hope this won't be too much work. I could use help as well, especially if
someone (scc?) wants to convert bufferRoutines.h not to depend on nsCRT.h - that
would make my life MUCH easier.
Assignee | ||
Comment 2•23 years ago
|
||
Assignee | ||
Comment 3•23 years ago
|
||
ok, heres a patch that has its share of hacks, gross inefficiencies, and blatant
lies.
Basically what this does is shove off the unicharutil dependency over to the
string library:
- in string/obsolete, I shuffled around the gCaseConv stuff so that it could be
shared
- I eliminated the use of the nsCRT routines ToUpper/ToLower for PRUnichar stuff
- I made the nsCRT::str*casecmp stuff use the new fangled string library's
global Compare function
Problems with this patch:
- gross use of nsCRT::strlen() - the problem is that we need to
nsDependentStrings with their size set to the min of these two
- switch from kLatin1ToUCS2 array to NS_ConvertUTF8toUCS2 - I have no idea what
the impacts of this are.. the problem is basically that because we're handing a
whole string off to Compare(), we need a buffer with the char* -> UCS2
conversion already complete - really I should make a buffer on the stack and do
the conversion like the old way.
Anyway, this lets me compile the rest of the tree to see if there are any other
impacts.
Assignee | ||
Comment 4•23 years ago
|
||
actually, I just realized this bug is for the uconv dependency (not unicharutil)
I'm updating the summary, taking this, and opening another bug for uconv (which
is totally unrelated, having to do with nsFileSpec/nsLocalFile and friends)
Assignee: dougt → alecf
Summary: xpcom depends on uconv for doublebyte support → xpcom depends on unicharutil for doublebyte support
Target Milestone: --- → mozilla0.9.5
Assignee | ||
Updated•23 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 5•23 years ago
|
||
ok, I have a new plan, somewhat like the old.
The new plan is to eliminate all case-related APIs from all of XPCOM and String,
and make a utility library in unicharutil that resembles the string APIs.
For example, I'm eliminating nsCRT::ToUpper() and the string library's
::ToUpperCase(nsAString&)
But then I am declaring an implementation of ::ToUpperCase(nsAString&) in
nsUnicharUtil.h, and then making an implementation which can be statically
linked in.
I'd like to hear an opinion from scc on this. What I'd essentially be doing is
overloading a few global functions that string declares, in unicharutil.
The routines in question are:
ToLowerCase()
ToUpperCase()
CaseInsensitiveFindInReadable()
CaseInsensitiveStringComparator (class)
Assignee | ||
Comment 6•23 years ago
|
||
Assignee | ||
Comment 7•23 years ago
|
||
ok, that patch cleans up all uses of nsCRT::str*cmp to compare a PRUnichar* to a
char*. nsCRT::str*cmp was doing an implicit conversion from UTF8 to UCS2, so now
I make it explicit.
Can I get an r=/sr=? jag? scc?
Comment 8•23 years ago
|
||
From what I understand, it was doing a conversion from ISO Latin1 to UCS2, which
basically means tacking on another 8 bits, all 0, rather than UTF8 to UCS2,
which will do all kinds of nasty things if it sees bit 7 set.
Assignee | ||
Comment 9•23 years ago
|
||
doh! you're right.
actually, that seems really broken then.
ok, I may end up switching those to some temporary NS_ConvertLatin1toUCS2 or
something, or I might look at this stuff on a case-by-case basis. I _will_ kill
these routines.
Argh. this stuff sucks :)
Assignee | ||
Comment 10•23 years ago
|
||
actually, I forgot that most of the cases in that patch are NS_LITERAL_STRING's
in fact, there are only 5 uses in that patch, and 4 of them are questionable
conversions. I'm going to look at them and figure out how the strings are
supposed to be encoded.
I found a few more in mail, but they're all either candidates for
NS_LITERAL_STRING, or can be changed a bit to just use straight unicode.
Assignee | ||
Comment 11•23 years ago
|
||
Assignee | ||
Comment 12•23 years ago
|
||
ok, I went through the cases and it looks like:
nsInternetSearchService: we're just using a global constant char* - I'm pretty
sure you can't use NS_NAMED_LITERAL_STRING in a global context since that would
declare a global variable, so I'm doing an explicit conversion from ASCII.
mozTXTtoHTMLConv.cpp: It turns out the only time we needed to do the conversion
was with static strings - so I whacked the APIs to take const PRUnichar*, and
then made use of NS_LITERAL_STRING
nsComposerCommands: Same deal with editor - but since editor is in the midst of
a directory reorg, I didn't want to go changing APIs on them.
Other than that, all uses are NS_LITERAL_STRING. Comments?
Assignee | ||
Comment 13•23 years ago
|
||
by the way:
1) that's a diff -bw, I have fixed the funny indentation
2) don't mind the conflicts, that patch was made after I had just updated my
tree - I've resolved them appropriately.
Assignee | ||
Updated•23 years ago
|
Attachment #50632 -
Attachment is obsolete: true
Assignee | ||
Comment 14•23 years ago
|
||
I'd like to check in that 3rd patch there (attachment 50699 [details] [diff] [review]) - and would love
some reviewers..
Comment 15•23 years ago
|
||
Comment on attachment 50699 [details] [diff] [review]
updated patch - change APIs where necessary
>Index: xpfe/components/search/src/nsInternetSearchService.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/xpfe/components/search/src/nsInternetSearchService.cpp,v
>retrieving revision 1.146
>diff -u -b -w -r1.146 nsInternetSearchService.cpp
>--- nsInternetSearchService.cpp 2001/09/25 00:22:45 1.146
>+++ nsInternetSearchService.cpp 2001/09/25 16:03:11
>@@ -3342,7 +3343,7 @@
> };
>
> // make "ISO-8859-1" as the default (not "UTF-8")
>- stringEncoding= NS_ConvertASCIItoUCS2("ISO-8859-1");
>+ stringEncoding= NS_LITERAL_STRING("ISO-8859-1");
Might as well add a space before that = while you're there.
> PRUint32 loop = 0;
> while (encodingList[loop].numericEncoding != nsnull)
>Index: mailnews/base/search/src/nsMsgSearchAdapter.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/mailnews/base/search/src/nsMsgSearchAdapter.cpp,v
>retrieving revision 1.34
>diff -u -b -w -r1.34 nsMsgSearchAdapter.cpp
>--- nsMsgSearchAdapter.cpp 2001/07/25 07:52:31 1.34
>+++ nsMsgSearchAdapter.cpp 2001/09/25 16:03:12
>@@ -148,7 +148,7 @@
> nsCAutoString charset;
> charset.AssignWithConversion(destCharset);
> // Specify a character set unless we happen to be US-ASCII.
>- if (nsCRT::strcmp(destCharset, "us-ascii"))
>+ if (nsCRT::strcmp(destCharset, NS_LITERAL_STRING("us-ascii").get()))
You could compare destCharset and "us-ascii" directly.
> result = PR_smprintf("%s%s", nsMsgSearchAdapter::m_kImapCharset, (const char *) charset);
>
> return result;
>Index: mailnews/base/src/nsMsgAccountManager.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/mailnews/base/src/nsMsgAccountManager.cpp,v
>retrieving revision 1.235
>diff -u -b -w -r1.235 nsMsgAccountManager.cpp
>--- nsMsgAccountManager.cpp 2001/09/24 23:20:02 1.235
>+++ nsMsgAccountManager.cpp 2001/09/25 16:03:13
>@@ -996,7 +996,7 @@
> nsCOMPtr<nsIMsgFolder>inboxFolder = do_QueryInterface(aSupport);
> PRUnichar *folderName = nsnull;
> inboxFolder->GetName(&folderName);
>- if (folderName && nsCRT::strcasecmp(folderName, "INBOX") ==0)
>+ if (folderName && Compare(nsDependentString(folderName), NS_LITERAL_STRING("INBOX"), nsCaseInsensitiveStringComparator()) ==0)
> {
> rv1 = inboxFolder->Compact(urlListener, nsnull /* msgwindow */);
> if (NS_SUCCEEDED(rv1))
folderName is being leaked here. Use an nsXPIDLString and you can get rid
of that nsDependentString.
>Index: intl/chardet/src/nsMetaCharsetObserver.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/intl/chardet/src/nsMetaCharsetObserver.cpp,v
>retrieving revision 1.48
>diff -u -b -w -r1.48 nsMetaCharsetObserver.cpp
>--- nsMetaCharsetObserver.cpp 2001/09/19 23:59:58 1.48
>+++ nsMetaCharsetObserver.cpp 2001/09/25 16:03:13
>@@ -89,7 +89,7 @@
> const PRUnichar* nameArray[],
> const PRUnichar* valueArray[])
> {
>- if(0 != nsCRT::strcasecmp(aTag, "META"))
>+ if(0 != nsCRT::strcasecmp(aTag, NS_LITERAL_STRING("META").get()))
You converted one of these to a Compare(..., ..., nsCaseInsensitiveStringComparator) earlier.
Which style do you want to use?
> return NS_ERROR_ILLEGAL_VALUE;
> else
> return Notify(aDocumentID, numOfAttributes, nameArray, valueArray);
>Index: netwerk/streamconv/converters/mozTXTToHTMLConv.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp,v
>retrieving revision 1.43
>diff -u -b -w -r1.43 mozTXTToHTMLConv.cpp
>--- mozTXTToHTMLConv.cpp 2001/09/05 10:46:39 1.43
>+++ mozTXTToHTMLConv.cpp 2001/09/25 16:03:15
>@@ -547,7 +551,7 @@
>
> PRUint32
> mozTXTToHTMLConv::NumberOfMatches(const PRUnichar * aInString, PRInt32 aInStringLength,
>- const char* rep, PRInt32 aRepLen, LIMTYPE before, LIMTYPE after)
>+ const PRUnichar* rep, PRInt32 aRepLen, LIMTYPE before, LIMTYPE after)
Indentation nit, if you're willing to fix it (seemed like you were).
> {
> PRUint32 result = 0;
>
>@@ -564,7 +568,7 @@
> // NOTE: the converted html for the phrase is appended to aOutString
> PRBool
> mozTXTToHTMLConv::StructPhraseHit(const PRUnichar * aInString, PRInt32 aInStringLength, PRBool col0,
>- const char* tagTXT, PRInt32 aTagTXTLen,
>+ const PRUnichar* tagTXT, PRInt32 aTagTXTLen,
> const char* tagHTML, const char* attributeHTML,
> nsString& aOutString, PRUint32& openTags)
> {
>@@ -621,7 +625,7 @@
>
> PRBool
> mozTXTToHTMLConv::SmilyHit(const PRUnichar * aInString, PRInt32 aLength, PRBool col0,
>- const char* tagTXT, PRInt32 aTagTxtLength, const char* tagHTML,
>+ const PRUnichar* tagTXT, PRInt32 aTagTxtLength, const char* tagHTML,
> nsString& outputHTML, PRInt32& glyphTextLen)
> {
> PRInt32 tagLen = nsCRT::strlen(tagTXT);
Instead of const PRUnichar* you could change this to const nsAString&.
Assignee | ||
Comment 16•23 years ago
|
||
> Might as well add a space before that = while you're there.
/me rolls his eyes :)
> >- if (nsCRT::strcmp(destCharset, "us-ascii"))
> >+ if (nsCRT::strcmp(destCharset, NS_LITERAL_STRING("us-ascii").get()))
> You could compare destCharset and "us-ascii" directly.
No, I can't - that's the point of this whole patch. destCharset is a PRUnichar*
> result = PR_smprintf("%s%s", nsMsgSearchAdapter::m_kImapCharset, (const char
*) charset);
> > return result;
>Index: mailnews/base/src/nsMsgAccountManager.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/mailnews/base/src/nsMsgAccountManager.cpp,v
>retrieving revision 1.235
>diff -u -b -w -r1.235 nsMsgAccountManager.cpp
>--- nsMsgAccountManager.cpp 2001/09/24 23:20:02 1.235
>+++ nsMsgAccountManager.cpp 2001/09/25 16:03:13
>>- if (folderName && nsCRT::strcasecmp(folderName, "INBOX") ==0)
>>+ if (folderName && Compare(nsDependentString(folderName),
NS_LITERAL_STRING("INBOX"), nsCaseInsensitiveStringComparator()) ==0)
>> {
>> rv1 = inboxFolder->Compact(urlListener, nsnull /* msgwindow */);
>> if (NS_SUCCEEDED(rv1))
>folderName is being leaked here. Use an nsXPIDLString and you can get rid
>of that nsDependentString.
Ok, fixed.. and switched back to nsCRT::strcasecmp just for consistency
>You converted one of these to a Compare(..., ...,
>nsCaseInsensitiveStringComparator) earlier.
>Which style do you want to use?
I'll leave it as strcasecmp()
> mozTXTToHTMLConv::NumberOfMatches(const PRUnichar * aInString, PRInt32
aInStringLength,
>- const char* rep, PRInt32 aRepLen, LIMTYPE before, LIMTYPE after)
>+ const PRUnichar* rep, PRInt32 aRepLen, LIMTYPE before, LIMTYPE after)
Indentation nit, if you're willing to fix it (seemed like you were).
As they say "when in rome" - the whole file is indented in the style you see
here. It actually looks just fine within the context of surrounding code.
>>@@ -621,7 +625,7 @@
>> > PRBool
>> mozTXTToHTMLConv::SmilyHit(const PRUnichar * aInString, PRInt32 aLength,
PRBool col0,
>>- const char* tagTXT, PRInt32 aTagTxtLength, const char* tagHTML,
>>+ const PRUnichar* tagTXT, PRInt32 aTagTxtLength, const char* tagHTML,
>> nsString& outputHTML, PRInt32& glyphTextLen)
>> {
>> PRInt32 tagLen = nsCRT::strlen(tagTXT);
> Instead of const PRUnichar* you could change this to const nsAString&.
I'm going for minimal change here. We've had lots of problems with performance
in this particular file due to excess ns*String usage in tight loops (lots of
time spent in ns*String constructors, just ask waterson and mscott about it..)
so for now I'd rather pass around raw pointers and integers...
Attaching an updated patch to address these nits...
Comment 17•23 years ago
|
||
> > >- if (nsCRT::strcmp(destCharset, "us-ascii"))
> > >+ if (nsCRT::strcmp(destCharset, NS_LITERAL_STRING("us-ascii").get()))
>
> > You could compare destCharset and "us-ascii" directly.
>
> No, I can't - that's the point of this whole patch. destCharset is
> a PRUnichar*
I typo'd that, I meant |charset| (the nsCAutoString just constructed), not
|destCharset|. But anyways...
And the reason I mentioned the indentation thing in mozTXTToHTMLConv is that you
are in fact reindenting near line 490, hence the "seemed like you were" part of
that comment.
Comment 18•23 years ago
|
||
Comment on attachment 50699 [details] [diff] [review]
updated patch - change APIs where necessary
r=jag
Attachment #50699 -
Flags: review+
Comment 19•23 years ago
|
||
Comment on attachment 50699 [details] [diff] [review]
updated patch - change APIs where necessary
sr=sfraser
Attachment #50699 -
Flags: superreview+
Assignee | ||
Comment 20•23 years ago
|
||
ok, that last patch has landed. thanks folks.
next stop: eliminate bytewidth-insensitive comparisons from nsStr*
Assignee | ||
Updated•23 years ago
|
Priority: -- → P2
Assignee | ||
Comment 21•23 years ago
|
||
try as I might, I can't get enough of this done to land for 0.9.5..
I've run into a few issues:
- CaseInsensitiveStringComparator() needed to be moved into nsUnicharUtils
- CaseInsensitiveFindInReadable() is a little tricky.. we need to expose a
generic FindInReadable() that takes a comparator. I'm going to rename
FindInReadable_Impl() and expose that.
- nsStringArray::IndexOfIgnoreCase is evil! We're only using it once in
nsPresShell, but I haven't figured out a great way to work around it.
Fortunately, the only consumer is the old viewer app...
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Assignee | ||
Comment 22•23 years ago
|
||
This is going to be a very complex landing - I need to land things in stages...
I'm attaching two patches:
1) Add a new CaseCompare() to nsICaseConversion so that we can do fast
case-insensitive comparisons inside of nsICaseConversion
2) try to migrate everyone away from the (Assign|Equals)WithConversion
routines.. first step is to remove all the obvious conversions to
NS_LITERAL_STRING() - this covers about 70% of the usages.
Summary: xpcom depends on unicharutil for doublebyte support → xpcom depends on unicharutil for case-insensitive unicode
Assignee | ||
Comment 23•23 years ago
|
||
Assignee | ||
Comment 24•23 years ago
|
||
ok, after a few runs at this and hitting a few dead-ends, here's my plan of
action.. I need to write it down so I don't keep forgetting it:
- eliminate all callers to nsCRT::str*casecmp that reference PRUnichar*
parameters, switching them to the newest string APIs
- add nsUnicharUtils.h/.cpp to the build, moving case-insensitive PRUnichar*
routines into it. Then, #if 0 out the lines in the header, so that I can procede
with the following without changing projects, etc...
- hand-roll temporary string comparison routines for the old string library.
(This eliminates xpcom's dependency on unicharutil, but not string or
string_obsolete)
- eliminate all callers to all *WithConversion (old API) routines.. 80-90% of
these seem to be operating directly on literal strings, so:
- first pass, convert literal string usage to NS_LITERAL_STRING
- second pass, convert all other uses
- add unicharutil_s.lib (empty at this point) to all consumer libraries that
require it. (I'll need mac help here)
- turn on #if 0'ed code!
- actually remove nsCRT::str*casecmp routines from nsCRT.h/cpp
I've done some of this work already. I'm starting to attach patches.
Assignee | ||
Updated•23 years ago
|
Priority: P2 → P1
Assignee | ||
Comment 25•23 years ago
|
||
Assignee | ||
Comment 26•23 years ago
|
||
the above patch gives us a faster case comparison by avoiding virtual method
calls, and consolidates the ToLower() stuff into one function, instead of
spreading it across the implementations.
can I get an r/sr=?
Assignee | ||
Updated•23 years ago
|
Attachment #52088 -
Attachment is obsolete: true
Assignee | ||
Comment 27•23 years ago
|
||
Assignee | ||
Comment 28•23 years ago
|
||
ok, that last patch:
- removes all consumers of nsCRT::str[n]casecmp, and switches over to the new
string libraries for the comparison...
- converts any such files over to using NS_LITERAL_STRING where we were using
AssignWithConversion (there are many more, but I'll put those in a seperate patch)
- adds the inclusion of nsUnicharUtils.h to the necessary files, and adds the
exporting (win, unix) and linking (windows only). Right now nsUnicharUtils and
friends are #if 0'd out.
Can I get a preliminary review to make sure I'm not smoking crack here, jag?
Also, I'll attach a patch to the exporting of nsUnicharUtils.h to the mac build.
After a reviewed version of this patch is checked in, I'll add linux and mac
linking.
Then, finally I'll throw the switch to turn ON nsUnicharUtils.h, and turn OFF
the old routines in nsAString.h/nsReadableUtils.h.
Assignee | ||
Comment 29•23 years ago
|
||
Comment 30•23 years ago
|
||
gee... I don't know why we really need to do all these change. Maybe the easier
way to do it just move the Unicode base case conversion code into xpcom.
Reporter | ||
Comment 31•23 years ago
|
||
no! in this case, smaller is better. Unicode base case conversion can and
SHOULD BE optional.
Assignee | ||
Comment 32•23 years ago
|
||
frank, the whole point of this bug is to move non-xpcom libraries OUT of XPCOM.
There are actually only two somewhat minor dependencies:
- case-insensitive comparison and case conversion for PRUnichar* strings
- filesystem-charset support
This bug is for the first one.
Comment 33•23 years ago
|
||
>no! in this case, smaller is better. Unicode base case conversion can and
>SHOULD BE optional.
Do you know how BIG is the Unicode case conversion before you said that ?
Comment 34•23 years ago
|
||
>Unicode base case conversion can and SHOULD BE optional.
Why ?
Assignee | ||
Comment 35•23 years ago
|
||
frank, doug (owner of XPCOM) has already decided that unicode case conversion
should not be a part of XPCOM, and a number of the XPCOM peers (myself included)
agree with him. Its not just a matter of size but also of appropriateness.
this simply doesn't belong in XPCOM, all we're doing is moving the dependency
out of xpcom and onto the consumers of these functions...
we are removing the nsCRT::strcasecmp and nsCRT::strncasecmp routines for
unicode, and need to support them instead through the new string APIs. I order
to do this and maintain (actually, improve) string performance, we need to add
this one comparison method to nsICaseConversion.
Assignee | ||
Comment 36•23 years ago
|
||
ok, I'm breaking this out into a couple different bugs so we can focus our
discussions
see:
bug 104122: adding a case-insensitive compare to nsICaseConversion
bug 104126: removing case-insensitive compares from nsStringArray
Assignee | ||
Updated•23 years ago
|
Attachment #52613 -
Attachment is obsolete: true
Assignee | ||
Updated•23 years ago
|
Attachment #50009 -
Attachment is obsolete: true
Assignee | ||
Comment 37•23 years ago
|
||
Comment 38•23 years ago
|
||
Comment on attachment 52998 [details] [diff] [review]
simply turn on the "util" directory, and make it come first
r=cls
Attachment #52998 -
Flags: review+
Assignee | ||
Updated•23 years ago
|
Attachment #52617 -
Attachment is obsolete: true
Assignee | ||
Updated•23 years ago
|
Assignee | ||
Comment 39•23 years ago
|
||
Assignee | ||
Updated•23 years ago
|
Attachment #52614 -
Attachment is obsolete: true
Assignee | ||
Comment 40•23 years ago
|
||
I found some flaws in the way I converted nsMetaCharsetObserver.... that is an
updated patch. Anyone care to review?
The good news is that I've checked in the changes to add nsUnicharUtils.h to the
build, and build the libraries on windows and unix, so this patch can land as-is
once I get some eyes on it.
can I get some reviewers? please? :)
Assignee | ||
Comment 41•23 years ago
|
||
Assignee | ||
Comment 42•23 years ago
|
||
once bug 104122 lands, I can check in attachment 53032 [details] [diff] [review]. Can I get some reviewers
there? The only files affected are bufferRoutines.h and nsString2.cpp
after that and attachment 53020 [details] [diff] [review] land, there should be no more consumers of
nsCRT::strn?casecmp.
Comment on attachment 53032 [details] [diff] [review]
remove string/obsolete's dependency on nsCRT::strn?casecmp
sr=shaver, pending the results of alecf's regime of thorough testing (the results of which will be posted to this bug shortly-ish).
Attachment #53032 -
Flags: superreview+
Comment 44•23 years ago
|
||
Comment on attachment 53020 [details] [diff] [review]
updated nsCRT::strn?cmp changes
r=sspitzer on the mailnews stuff.
please make sure to test.
Attachment #53020 -
Flags: review+
Comment 45•23 years ago
|
||
Comment on attachment 53020 [details] [diff] [review]
updated nsCRT::strn?cmp changes
>Index: intl/chardet/src/nsMetaCharsetObserver.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/intl/chardet/src/nsMetaCharsetObserver.cpp,v
>retrieving revision 1.52
>diff -u -r1.52 nsMetaCharsetObserver.cpp
>--- nsMetaCharsetObserver.cpp 2001/10/01 20:53:56 1.52
>+++ nsMetaCharsetObserver.cpp 2001/10/11 00:11:48
>@@ -225,25 +240,35 @@
> if(nsnull == httpEquivValue || nsnull == contentValue)
> return NS_OK;
>
>- while(IS_SPACE_CHARS(*httpEquivValue))
>+ while(IS_SPACE_CHARS(*httpEquivValue)) {
> httpEquivValue++;
>- while(IS_SPACE_CHARS(*contentValue))
>+ httpEquivLen--;
>+ }
>+ while(IS_SPACE_CHARS(*contentValue)) {
> contentValue++;
>+ contentLen--;
>+ }
Do you need contentLen and httpEquivLen? Doesn't look like it.
> if(
>- ((0==nsCRT::strcasecmp(httpEquivValue,contenttype.get())) ||
>+ // first try unquoted strings
>+ ((0==Compare(nsDependentString(httpEquivValue),contenttype,
>+ nsCaseInsensitiveStringComparator())) ||
>+ // now try "quoted" or 'quoted' strings
> (( (httpEquivValue[0]=='\'') ||
> (httpEquivValue[0]=='\"') ) &&
>- (0==nsCRT::strncasecmp(httpEquivValue+1,
>- contenttype.get(),
>- contenttype.Length()))
>+ (0==Compare(nsDependentString(httpEquivValue+1, contenttype.Length())),
>+ contenttype,
>+ nsCaseInsensitiveStringComparator()))
Indentation.
Assignee | ||
Comment 46•23 years ago
|
||
ack, you're right... I needed them in a previous iteration of this patch.. they
have now been removed.
as for indentation, dammit it looks right in emacs! :)
stupid tabs. I'm fixing them up as best I can, but this file is full of tabs.
does that count as r=jag then? :)
Comment 47•23 years ago
|
||
r=/me
Comment 48•23 years ago
|
||
Comment on attachment 53020 [details] [diff] [review]
updated nsCRT::strn?cmp changes
sr=sfraser
Attachment #53020 -
Flags: superreview+
Assignee | ||
Comment 49•23 years ago
|
||
thanks so much! That patch has now been checked in... now I can focus on the
last patch.. for some reason I'm asserting on startup with it.
Comment 50•23 years ago
|
||
The patch you just checked in caused bizarre bustage on Sun WS 5.0 (nebiros
tinderbox):
Error: Different types for "?:" (nsAString and void).
It didn't seem to be a parsing problem as parenthisization didn't help. So I
fixed it by moving the conditional deeper into the expression:
- !(before == LT_IGNORE ? !Compare(nsDependentString(aInString),
nsDependentString(rep, aRepLen), nsCaseInsensitiveStringComparator()) :
- !Compare(nsDependentString(aInString+1), nsDependentString(rep,
aRepLen), nsCaseInsensitiveStringComparator()))
+ !(before == !Compare(nsDependentString(aInString + (LT_IGNORE ? 0 : 1)
), nsDependentString(rep, aRepLen), nsCaseInsensitiveStringComparator()))
Assignee | ||
Comment 51•23 years ago
|
||
maybe we should just change this to an if() statement instead of a ?:... your
patch, while I'm sure it works, seems confusing to me :)
Assignee | ||
Comment 52•23 years ago
|
||
Assignee | ||
Updated•23 years ago
|
Attachment #53032 -
Attachment is obsolete: true
Assignee | ||
Comment 53•23 years ago
|
||
So after lots of testing, the only thing I found was that in Compare2To1, I had
mixed up the 1st and 2nd parameters when iterating through the string.
The only changes were:
- swapping the kIsoLatin1toUCS2 conversion in Compare2To1
- renaming kIsoLatin1toUCS2 with a "_2" at the end so as not to conflict with
the one in xpcom/ds (which is going away, and shouldn't be conflicting because
it's static, but the windows linker is complaining)
in any case, the patch is almost identical, and I've done lots of testing...
jag, can you r= this? shaver, do you mind another sr=?
once this patch lands, I can remove about 6 methods from nsCRT, which will
remove the unicharutil dependency in xpcom.
Then I can tackle the removal from string :)
Comment 54•23 years ago
|
||
Comment on attachment 53828 [details] [diff] [review]
next iteration of string/obsolete
sr=scc
Attachment #53828 -
Flags: superreview+
Assignee | ||
Comment 55•23 years ago
|
||
Assignee | ||
Comment 56•23 years ago
|
||
Comment 57•23 years ago
|
||
Comment on attachment 53828 [details] [diff] [review]
next iteration of string/obsolete
r=jag
Attachment #53828 -
Flags: review+
Comment 58•23 years ago
|
||
Comment on attachment 53882 [details] [diff] [review]
minor tree spam to fix consumers of ToLowerCase/ToUpperCase
r=jag
Attachment #53882 -
Flags: review+
Comment 59•23 years ago
|
||
Comment on attachment 53883 [details] [diff] [review]
Minor tree spam to wean people off nsCRT::(To|Is)(Upper|Lower)
I wonder if we should provide an IsUpper / IsLower, which could be cheaper than ToUpper(ch) == ch.
Assignee | ||
Comment 60•23 years ago
|
||
I considered that, but it involves adding extra methods to the already large
interface nsICaseConversion. I think the == test is reasonably fast, and adding
support to nsCaseConversionImp2 is going to be somewhat ugly.
Comment 61•23 years ago
|
||
Comment on attachment 53883 [details] [diff] [review]
Minor tree spam to wean people off nsCRT::(To|Is)(Upper|Lower)
r=jag
Attachment #53883 -
Flags: review+
Comment 62•23 years ago
|
||
Alec: Could patch 53020 be backed out? It seems it created a very visble
regression (no *bold* and smileys in mail/news). See bug 105459.
Assignee | ||
Comment 63•23 years ago
|
||
no, we'll fix that bug the right way.
Comment 64•23 years ago
|
||
FWIW, there are two bugs that occur in a number of places in that patch:
* use of nsDependentString as non-flat (see bug 104651)
* forgetting to use nsDependentString as non-flat to simulate the N in
strn[|case]cmp :-)
Assignee | ||
Comment 65•23 years ago
|
||
I'm lost. can you explain what I should have been doing there?
Comment 66•23 years ago
|
||
Probably the technically correct thing to do would be
Substring(nsDependentString(foo),0,7) to get an nsAString& for the first 7
characters of foo.
Assignee | ||
Comment 67•23 years ago
|
||
ok, that's fine too. are you planning on catching all of these in a pass through
the tree? (i.e. removing the Length parameter from the nsDependentString
constructor)
Comment 68•23 years ago
|
||
I'm not sure. See bug 104651.
Comment 69•23 years ago
|
||
And, FWIW, I think using nsDependentString here will work fine, so perhaps
you're better off just using it the way you did in some cases even though it's a
bad patter that, if copied to uses other than compares, could lead to incorrect
results.
Assignee | ||
Comment 70•23 years ago
|
||
well, I knew it didn't null-terminate (either by copying or by munging the
string) but figured it would be ok with Compare() - I didn't want to make an
extra copy of the string, since I'm just comparing.
Assignee | ||
Comment 71•23 years ago
|
||
Assignee | ||
Updated•23 years ago
|
Attachment #54135 -
Attachment is obsolete: true
Assignee | ||
Comment 72•23 years ago
|
||
Assignee | ||
Comment 73•23 years ago
|
||
Comment 74•23 years ago
|
||
Comment on attachment 53882 [details] [diff] [review]
minor tree spam to fix consumers of ToLowerCase/ToUpperCase
sr=scc
Attachment #53882 -
Flags: superreview+
Comment 75•23 years ago
|
||
Comment on attachment 53883 [details] [diff] [review]
Minor tree spam to wean people off nsCRT::(To|Is)(Upper|Lower)
looks good, except that I'm hating the old-style casts (at the very end of your patch). In your case, you could be using construction-style casts anyway. Other than that, sr=scc.
Attachment #53883 -
Flags: superreview+
Comment 76•23 years ago
|
||
Comment on attachment 54141 [details] [diff] [review]
take 2 - add ToUpperCase(PRUnichar) and friends, plus add shutdown observer
sr=scc
Attachment #54141 -
Flags: superreview+
Comment 77•23 years ago
|
||
Comment on attachment 54148 [details] [diff] [review]
add libunicharutil_s.so to all necessary unix projects
r=cls
Attachment #54148 -
Flags: review+
Assignee | ||
Comment 78•23 years ago
|
||
ok the following patches are in:
the string/obsolete
the ToLowerCase/ToUpperCase for nsAStrings
the adding of ToUpperCase(PRUnichar) and ToLowerCase(PRUnichar)
adding libunicharutil_s to necessary unix projects
This is what's left, in this order:
1) fixing up mac projects (hopefully today)
2) landing to (To|Is)(Upper|Lower)Case for PRUnichar (today, requires #1)
3) Yanking all nsICaseConversion out of xpcom (tomorrow, after builds on all
platforms)
So very close...!
Assignee | ||
Comment 79•23 years ago
|
||
ok, mac project files have finally landed..very soon we'll be able to flip the
switch. I've done testing on windows and mac, unix is the only one left
Assignee | ||
Comment 80•23 years ago
|
||
Assignee | ||
Comment 81•23 years ago
|
||
ok, that patch finally removes all calls that would require calls into the
unicode library, and turns on their equivalent calls in nsUnicharUtils.cpp.
One thing you'll notice is that I've added a 2nd operator() for the base type,
and merged the two comparators used in FindInReadable() and Compare().. you'll
find a i18n implementation of this comparator in nsUnicharUtils.cpp
All callers to old APIs (nsCRT) have been converted over to the new string
library, and have nsUnicharUtils.h properly included. This means that once I
land this patch, everyone will be using the new set of routines..
Can I get a r=jag, and sr=scc?
Assignee | ||
Comment 82•23 years ago
|
||
Assignee | ||
Comment 83•23 years ago
|
||
and that patch simply removes all routines from nsCRT:: that did any dorky
conversion or calls into nsICaseConversion. The dependency is dead! Long live
the dependency.
can I get r=jag and sr=scc on that one too?
Comment 84•23 years ago
|
||
Comment on attachment 55324 [details] [diff] [review]
the final patch to the string library
>Index: string/public/nsAString.h
>===================================================================
>RCS file: /cvsroot/mozilla/string/public/nsAString.h,v
>retrieving revision 1.9
>diff -u -r1.9 nsAString.h
>--- nsAString.h 2001/10/13 15:01:16 1.9
>+++ nsAString.h 2001/10/26 22:17:06
>@@ -658,15 +659,9 @@
> {
> public:
> virtual int operator()( const char_type*, const char_type*, PRUint32 aLength ) const;
>+ virtual int operator() ( char_type, char_type ) const;
Indentation.
>Index: string/src/nsAString.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/string/src/nsAString.cpp,v
>retrieving revision 1.10
>diff -u -r1.10 nsAString.cpp
>--- nsAString.cpp 2001/10/13 15:01:20 1.10
>+++ nsAString.cpp 2001/10/26 22:17:06
>@@ -34,10 +34,12 @@
> }
>
> int
>-nsCaseInsensitiveStringComparator::operator()( const char_type* lhs, const char_type* rhs, PRUint32 aLength ) const
>- {
>- return nsCRT::strncasecmp(lhs, rhs, aLength);
>- }
>+nsDefaultStringComparator::operator()( char_type lhs, char_type rhs) const
>+ {
>+ if (lhs == rhs) return 0;
>+ if (lhs < rhs) return -1;
>+ return 1;
>+ }
return lhs - rhs;
>@@ -537,10 +539,31 @@
> return nsCharTraits<char_type>::compare(lhs, rhs, aLength);
> }
>
>+PRBool
>+nsDefaultCStringComparator::operator()( char_type lhs, char_type rhs ) const
>+ {
>+ if (lhs == rhs) return 0;
>+ if (lhs < rhs) return -1;
>+ return 1;
>+ }
>+
return lhs - rhs;
> int
> nsCaseInsensitiveCStringComparator::operator()( const char_type* lhs, const char_type* rhs, PRUint32 aLength ) const
> {
> return nsCRT::strncasecmp(lhs, rhs, aLength);
>+ }
>+
>+PRBool
>+nsCaseInsensitiveCStringComparator::operator()( char lhs, char rhs ) const
>+ {
>+ if (lhs == rhs) return 0;
>+
>+ lhs = nsCRT::ToUpper(lhs);
>+ rhs = nsCRT::ToUpper(rhs);
>+
>+ if (rhs == lhs) return 0;
>+ if (rhs < lhs) return -1;
>+ return 1;
Wrong way around.
And again you could use:
return lhs - rhs;
>Index: string/src/nsReadableUtils.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/string/src/nsReadableUtils.cpp,v
>retrieving revision 1.25
>diff -u -r1.25 nsReadableUtils.cpp
>--- nsReadableUtils.cpp 2001/10/02 11:15:45 1.25
>+++ nsReadableUtils.cpp 2001/10/26 22:17:07
>@@ -429,7 +411,8 @@
> while ( !found_it )
> {
> // fast inner loop (that's what it's called, not what it is) looks for a potential match
>- while ( aSearchStart != aSearchEnd && !compare(*aPatternStart, *aSearchStart) )
>+ while ( aSearchStart != aSearchEnd &&
>+ compare(*aPatternStart, *aSearchStart) )
|compare| returns 0 if the first argument and the second argument are equal, so |!compare|
or rather |compare == 0| is what you want.
>@@ -466,7 +449,7 @@
>
> // else if we mismatched ... it's time to advance to the next search position
> // and get back into the `fast' loop
>- if ( !compare(*testPattern, *testSearch) )
>+ if ( compare(*testPattern, *testSearch) )
Same here.
Assignee | ||
Comment 85•23 years ago
|
||
I adjusted the lhs vs. rhs stuff, but you're actually wrong about the compare()
- we're actually looking for non-matches in those cases - the old comparator
used to be a PRBool, so I've logically reversed the use of compare(), that's why
they're in the patch
Comment 86•23 years ago
|
||
Doh!
You're right. I was being silly and thinking that we were using |compare|
consistently with the |compare|s everyone's familiar with. Your patch makes
things better.
Comment 87•23 years ago
|
||
Comment on attachment 55325 [details] [diff] [review]
finally rip unicharutil out of xpcom!
r=jag
Attachment #55325 -
Flags: review+
Comment 88•23 years ago
|
||
I just realized that for the case insensitive compare you probably want to use
ToLower, not ToUpper, since ToLower will work for both uppercase and titlecase.
Assignee | ||
Comment 89•23 years ago
|
||
ah, good call.. I noticed that we were being inconsistent, but I didn't know
why.. so I just mimiced what the code does today. I like your argument about
lower vs. upper vs. titlecase, so I'll go with lower case.
Comment 90•23 years ago
|
||
Comment on attachment 55324 [details] [diff] [review]
the final patch to the string library
some trouble with spacing; also, why not |return lhs-rhs;| and no tests? Clients are only guaranteed a result <, ==, or > |0|, not { -1, 0, 1 }.
Fix the spacing (near the top of the patch) and either convince me I'm wrong about the results of compares, or else fix them, then you have my sr.
Comment 91•23 years ago
|
||
Comment on attachment 55325 [details] [diff] [review]
finally rip unicharutil out of xpcom!
removal is beautiful, sr=scc
Attachment #55325 -
Flags: superreview+
Assignee | ||
Comment 92•23 years ago
|
||
Comment 93•23 years ago
|
||
Comment on attachment 55624 [details] [diff] [review]
final string patch, take 2
sr=scc
Attachment #55624 -
Flags: superreview+
Comment 94•23 years ago
|
||
Comment on attachment 55624 [details] [diff] [review]
final string patch, take 2
r=jag
Attachment #55624 -
Flags: review+
Assignee | ||
Updated•23 years ago
|
Attachment #55324 -
Attachment is obsolete: true
Assignee | ||
Comment 95•23 years ago
|
||
after much burning, "final string patch" has been checked in.
one more patch to go ("finally rip unicharutil out of xpcom") and I'll be done
with this one.
Assignee | ||
Comment 96•23 years ago
|
||
yay! finally fixed.
praise the string gods.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•