Closed Bug 57720 Opened 24 years ago Closed 24 years ago

Armenian character encoding picked if language dropdown doesn't completely populate [and thus can't login to bugzilla] [form submission encoding wrong]

Categories

(Core :: Internationalization, defect, P3)

x86
All
defect

Tracking

()

VERIFIED FIXED
mozilla0.9.1

People

(Reporter: sgifford+mozilla-old, Assigned: jbetak)

References

Details

(Keywords: intl)

Attachments

(5 files)

I'm using a recent (10/23/2000) CVS build of Mozilla; I've seen this since at least Friday (10/20/2000). When I fill out a form that includes parentheses or a period, the data is percent-encoded incorrectly. All other characters are encoded correctly. I'm seeing this:

Char   Correct Encoding   Mozilla Encoding
(      %28                %A5
)      %29                %A4
.      %2E                %A9

Exactly what I'm seeing is submitting this text in a <textarea>:

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}

(with no CRs or LFs) results in this encoded:

QUERY_STRING="textarea-input=+%21%22%23%24%25%26%27%A5%A4*%2B%2C-%A9%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D"

This encoding problem makes it impossible for me to use Bugzilla from Mozilla, since the period in my email address is misencoded.
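A comparison like the one reported above can be automated. The following is a minimal sketch, not code from Mozilla or Bugzilla: the helper names `decodeToBytes` and `findMismatches` are made up for illustration, and the sample string is a shortened one built from the three misencoded characters in the report.

```javascript
// Decode a percent-encoded form value into raw byte values.
// "+" stands for a space in application/x-www-form-urlencoded data.
function decodeToBytes(encoded) {
  const bytes = [];
  for (let i = 0; i < encoded.length; i++) {
    const ch = encoded[i];
    if (ch === "%") {
      bytes.push(parseInt(encoded.slice(i + 1, i + 3), 16));
      i += 2; // skip the two hex digits just consumed
    } else if (ch === "+") {
      bytes.push(0x20);
    } else {
      bytes.push(ch.charCodeAt(0));
    }
  }
  return bytes;
}

// List every position where the received byte differs from the ASCII input.
function findMismatches(input, encoded) {
  const bytes = decodeToBytes(encoded);
  const mismatches = [];
  for (let i = 0; i < input.length; i++) {
    const expected = input.charCodeAt(i);
    if (bytes[i] !== expected) {
      mismatches.push({ char: input[i], expected: expected, got: bytes[i] });
    }
  }
  return mismatches;
}

// Shortened sample: space, "!", then the three characters from the report.
const sample = findMismatches(" !().", "+%21%A5%A4%A9");
for (const m of sample) {
  console.log(m.char, "expected 0x" + m.expected.toString(16), "got 0x" + m.got.toString(16));
}
```

Running the same check against the full test string would show whether any characters besides the parentheses and the period are affected.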
From walking through the debugger, I think this happens in the network code.
Assignee: rods → gagan
Attached file simple testcase (deleted)
I just don't see a problem with Friday's trunk build or today's branch build. Output from running testcase:

Remote Host = '205.217.229.106'
User Agent = 'Mozilla/5.0 (Windows; U; WinNT4.0; en-US; m18) Gecko/20000518'
Request Method = 'GET'
Content Length = ''
textarea .()
ENV = 'SERVER_SOFTWARE->Apache/1.3.9 (Unix) PHP/4.0.2 PHP/3.0.16 mod_perl/1.21 FrontPage/4.0.4.3
GATEWAY_INTERFACE->CGI/1.1
DOCUMENT_ROOT->/home/serve/pollmann
UNIQUE_ID->OfSwPs8ImBkAAA2FhtU
REMOTE_ADDR->205.217.229.106
SERVER_PROTOCOL->HTTP/1.1
REQUEST_METHOD->GET
REMOTE_HOST->205.217.229.106
QUERY_STRING->textarea=.%28%29
HTTP_USER_AGENT->Mozilla/5.0 (Windows; U; WinNT4.0; en-US; m18) Gecko/20000518
PATH->/usr/local/bin:/usr/bin:/bin
HTTP_ACCEPT->*/*
HTTP_CONNECTION->keep-alive
REMOTE_PORT->4827
SERVER_ADDR->207.8.157.240
HTTP_ACCEPT_LANGUAGE->en
HTTP_KEEP_ALIVE->300
SCRIPT_NAME->/echo.cgi
HTTP_ACCEPT_ENCODING->gzip,deflate,compress,identity
SCRIPT_FILENAME->/home/serve/pollmann/echo.cgi
SERVER_NAME->pollmann.net
REQUEST_URI->/echo.cgi?textarea=.%28%29
SERVER_PORT->80
HTTP_HOST->pollmann.net
SERVER_ADMIN->webmaster@POLLMANN.NET
'
The problem appears to be at nsFormFrame.cpp:975. When I submit .() , on the line that says:

inBuf = UnicodeToNewBytes(aString.GetUnicode(), aString.Length(), encoder);

after calling, inBuf contains

(gdb) p inBuf
$29 = 0x8661510 "йед"
(gdb) x/3x inBuf
0x8661510: 0xa9 0xa5 0xa4

The incorrect encoding I'm seeing is %A9%A5%A4, which corresponds to this exactly. I haven't been able to drill it down any deeper than that; I'll try when I get home. I'm compiling with gcc 2.96, a somewhat experimental version; if you're not seeing this, perhaps it's a compiler issue. What are you building with?
not in networking. form data. ->pollmann(?) pls. reassign if it's someone else.
Assignee: gagan → pollmann
Traced it further to umap.c:112. If I have a dot on the way in:

(gdb) p in
$178 = 46

I get this cell from the mapping table:

(gdb) p *uCell
$179 = {fmt = {format0 = {srcBegin = 40, srcEnd = 46, destBegin = 0}, format1 = {srcBegin = 40, srcEnd = 46, mappingOffset = 0}, format2 = {srcBegin = 40, srcEnd = 46, destBegin = 0}}}

which maps to the wrong character:

(gdb) p *out
$180 = 169

This seems to happen no matter what my encoding, but I'll play with it a little more and see if I can figure out which map this is coming from.
This mapping seems to be coming from the Armenian character set, the first one alphabetically:

$ cat armscii.uf
[ ... ]
Begin of Item 0002
Format 1
srcBegin = 0028
srcEnd = 002E
mappingOffset = 0000
Mapping = 00A5 00A4 002A 002B 002C 002D 00A9
End of Item 0002
[ ... ]

Which leads to 3 questions:

1. Is this the correct behavior for the Armenian character set?
2. Why did my copy of Mozilla decide this was the best character set for it?
3. Why is it still using it when I have changed my default to ISO-8859-1?

Can somebody change their character set to Armenian, and see if they see this bug also? Thanks!
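To make the table entry concrete, here is a minimal sketch of a lookup over that one cell. It assumes that a Format 1 cell is indexed by the character's offset from srcBegin, which is an interpretation of the dump above rather than a reading of umap.c. The names `mapChar` and `encodeAll` are hypothetical.

```javascript
// One cell from armscii.uf as quoted above (Item 0002, Format 1).
const cell = {
  srcBegin: 0x28,
  srcEnd: 0x2e,
  mappingOffset: 0,
  mapping: [0xa5, 0xa4, 0x2a, 0x2b, 0x2c, 0x2d, 0xa9],
};

// Assumed Format 1 semantics: index the mapping array by offset from srcBegin.
function mapChar(code, c) {
  if (code < c.srcBegin || code > c.srcEnd) {
    return code; // outside this cell: left unchanged in this sketch
  }
  return c.mapping[c.mappingOffset + (code - c.srcBegin)];
}

// Percent-encode every mapped byte so the result is comparable with the report.
function encodeAll(text, c) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const byte = mapChar(text.charCodeAt(i), c);
    out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

console.log(encodeAll(".()", cell)); // prints "%A9%A5%A4"
```

Under that assumption the sketch reproduces the %A9%A5%A4 bytes seen in the debugger, while the four characters in the middle of the range (* + , -) map to themselves.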
Removing all traces of "armscii" from my prefs.js, the one bookmark that contained "armscii", and clearing my cache (which also contained the word "armscii" for some reason) cleared it up. Probably some previous version of Mozilla set my default charset to Armenian, and then something in this build made that be a problem? I don't know; it still seems strange to me that the default behavior for Armenian would be for forms to not work, but then maybe Armenia isn't a real interactive sort of place . . . :-) I'm moving this to Internationalization. Sorry for filling up your mailboxes with this gibberish.
Component: Form Submission → Internationalization
This looks like an Internationalization bug; changing assigned to Intl. owner. Intl. owner -- brief summary. Problem 1. My browser somehow became set to Armenian (ARMSCII). That is the first on my list alphabetically, which seems likely to have something to do with it. Problem 2. When that happened, form submission broke completely (see details above). I don't know if this is a problem, or if this would actually work if I was really using an Armenian character set. Thanks, -----Scott.
Assignee: pollmann → nhotta
QA Contact: vladimire → teruko
Problem 1 If this is still reproducible with current build by creating a new profile, it's a bug. Otherwise, it can be treated as worksforme. Problem 2 It looks like that's a valid behavior. I searched for "ARMSCII-8" using yahoo and found a link. http://moon.yerphi.am/~hovik/ArmSCII/armcs-006.html
I couldn't get it to go back to Armenian without manually setting it. I'll mark this WORKSFORME. Dammit...It took me almost 12 hours to track it down to that Armenian thing, and all I get is a lousy WORKSFORME... :-)
Status: UNCONFIRMED → RESOLVED
Closed: 24 years ago
Resolution: --- → WORKSFORME
Scott, sounds like you went to a lot of trouble tracking this one down - we really appreciate it! Perhaps if someone finds a similar sort of problem they will search through Bugzilla, find this report and be able to quickly understand what is going on. Thanks for taking the time to investigate!
Already came in handy - see bug 57946!
Summary: Incorrect form/CGI encoding of parentheses and period (%28, %29, %2E) → non Western Character Coding -> Incorrect form/CGI encoding of parentheses and period (%28, %29, %2E)
I'm going to reopen this. I saw this on the branch today with a brand new profile in 2000110609mn6. I would have never caught this had someone not told me about it.
Status: RESOLVED → UNCONFIRMED
Resolution: WORKSFORME → ---
From some brief testing, it appears that this happens if you change the default font. I changed it to Arial, then noted that the character coding changed to Armenian again.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I can reproduce this. I changed the Western Sans Serif font from "Arial" to "Arial Black". At this point the charset did not change. But after I went to www.yahoo.com (or opened a new browser window), the charset menu changed to Armenian. Erik, could you take a look at this?
Assignee: nhotta → erik
I'll take it.
Assignee: erik → cata
In my last comment, I was able to reproduce this with an existing profile. But I have not been able to reproduce it with a new profile so far. Also, after I tested with a new profile and then came back to the existing profile, I cannot reproduce it anymore. Very strange.
Damn! I can't reproduce... Not with an old profile, nor with a new one. I went to Edit>Preferences>Fonts & did Naoki's steps and the default charset stays Western (ISO-8859-1). I am using Win2k, with a build from a pull from the branch ~1wk old. What are you guys using?
I am using today's branch build on Windows 2000. I was able to reproduce once but cannot reproduce it anymore.
OK, I've found a way to reproduce the reported behavior of the default charset being set to Armenian ARMSCII-8 (instead of Western ISO-8859-1). I've reproduced this on US Win95 and US W2K. I did this with a new profile on W2K. Will Scott Gifford, Steve Elmer and Jaime Rodriguez please try to confirm if this is possibly what they did? The first time you open the Edit|Preferences...|Languages panel, you will notice that the item listed in the Default Character Coding dropdown menu flashes from an initial value of Armenian (ARMSCII-8) to Western (ISO-8859-1); that is because it is dynamically building (via RDF) the contents of the dropdown. If you have never opened the Languages pref panel and you click on the Languages category in the left-hand side, and then very quickly click on a different category (e.g., Fonts, History), you seem to interrupt the building of the menu contents and it will be left as Armenian (ARMSCII-8). If you then click OK in another pref panel, then it will change your character coding default (intl.charset.default in prefs.js) to armscii-8. For the 3 folks that this has happened to, could you have done the above? I.e., clicked on Languages and then quickly clicked on Fonts?
It is possible that I accidentally did this, although I don't specifically recall it. It sounds like bug reports of this are trickling in occasionally. It might be smart to fix the bug that you just discovered, and see if that makes them stop trickling in. It also might be smart to add a little bit of code to detect a browser set to Armenian on the Mozilla login pages (which is where I and other reporters first encountered this bug), warn the user, and log the incident, so that it would be possible to see how often this is really occurring. I wasn't able to find a way to get the current character set from JavaScript or a CGI script; if somebody can point me in the right direction, I can try to come up with a patch and send it to the WebTools folks. The other theory being floated around over in bug #57946 is that a bug existed in some previous Mozilla build that set the encoding to Armenian, and left it there. I had been downloading and trying new versions on a pretty regular basis when I encountered this bug, so that is certainly also a possibility. When my CVS build finishes building, I'll play around and see if I can reproduce this in any of the ways described above. Thanks to all for their attention to this!
I came up with the "other" theory, but think my new theory is better since I can actually reproduce it. To do your check, you'd need to look at the pref "intl.charset.default", but you'd need a signed script to access the prefs.
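Since the pref itself is not readable without a signed script, one alternative for the login-page detection idea is to look at the submitted bytes on the server. The following is only a heuristic sketch under stated assumptions: the function name `looksLikeArmsciiSubmission` and the sample addresses are made up, it relies on the ARMSCII-8 mapping quoted earlier in this report, and it would also flag legitimate input in other 8-bit encodings that happens to contain these bytes.

```javascript
// Escapes that the ARMSCII-8 mapping produces for ".", "(" and ")".
const SUSPECT_ESCAPES = ["%A9", "%A5", "%A4"];

// Heuristic only: flag a raw (still percent-encoded) e-mail field that contains
// a suspect escape but no literal or correctly escaped period. With the
// misencoding active, every period is rewritten, so none should survive.
function looksLikeArmsciiSubmission(rawValue) {
  const upper = rawValue.toUpperCase();
  const hasSuspect = SUSPECT_ESCAPES.some((esc) => upper.includes(esc));
  const hasPeriod = upper.includes(".") || upper.includes("%2E");
  return hasSuspect && !hasPeriod;
}

// Hypothetical field values, as they would appear in a raw query string.
console.log(looksLikeArmsciiSubmission("user%40example%A9com")); // true
console.log(looksLikeArmsciiSubmission("user%40example.com"));   // false
```

A page using a check like this could warn the user to reset the default character coding and log the incident, which is the counting mechanism suggested above.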
This is the XUL that appears to be interrupted (line 87) and leaving the popup menu selection set to Armenian (ARMSCII-8). Why isn't the creation of the popup atomic? Seems like if it's interrupted, then nothing should be set.

http://lxr.mozilla.org/seamonkey/source/xpfe/components/prefwindow/resources/content/pref-languages.xul#82

82 <menulist id="DefaultCharsetList" ref="NC:DecodersRoot" datasources="rdf:charset-menu"
83     pref="true" preftype="string" prefstring="intl.charset.default"
84     prefattribute="data" wsm_attributes="data">
85   <template>
86     <menupopup>
87       <menuitem value="rdf:http://home.netscape.com/NC-rdf#Name" data="..." uri="..."/>
88     </menupopup>
89   </template>
90 </menulist>
Adding Ben, Hyatt and Waterson to Cc list. See the previous comment about atomicity of a XUL popup operation.
I don't think that the menu is "partially built", unless you're notifying us of languages asynchronously via the OnAssert() callback? I'd guess maybe the pref panel's onload and the menu's oncreate handler are fighting...
Keywords: intl
Since the comments indicate that the summary of this bug is actually not a bug, I am updating the summary to reflect what the problem really is. Marking mostfreq since this is often filed, but usually marked WORKSFORME or DUPLICATE of other WORKSFORME bugs.
Summary: non Western Character Coding -> Incorrect form/CGI encoding of parentheses and period (%28, %29, %2E) → Armenian character encoding picked if language dropdown doesn't completely populate [and thus can't login to bugzilla] [form submission encoding wrong]
*** Bug 63180 has been marked as a duplicate of this bug. ***
Keywords: mostfreq
*** Bug 61231 has been marked as a duplicate of this bug. ***
cc:in self
/me ponders how to describe this in one line on the mostfreq page :-) Gerv
Gerv: no kidding. I would suggest "Can't log in to Bugzilla (it says invalid e-mail address)" and, totally separately (but with the same bug #) "Mozilla switches to Armenian for no reason".
Changing OS to 'All' since this bug's been confirmed on several platforms: Windows 98, Windows 95, Windows 2000 and Linux.
OS: Linux → All
Somehow this might have been fixed; at least it hasn't shown up for me anymore. It disappeared a few weeks ago. Is anyone else still seeing this? Otherwise I think we should mark it WFM.
Sorry to spam everyone, but there are people who are still seeing this problem. I have had at least one come on to IRC complaining he can't submit stuff. I would leave this one open right now.
*** Bug 65894 has been marked as a duplicate of this bug. ***
Ksosez: tell your IRC friend that a fresh, new profile will do it. It worked fine for me. I really think we should close this now.
move all cata's bug to ftang
Assignee: cata → ftang
*** Bug 66625 has been marked as a duplicate of this bug. ***
Mark it as workforme.
Status: NEW → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → WORKSFORME
Sorry, my mistake. We probably should still look at this one. reassign to nhotta.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
nhotta- can you fix this one?
Assignee: ftang → nhotta
Status: REOPENED → NEW
Target Milestone: --- → mozilla0.9.1
Changed QA contact to andreasb@netscape.com.
QA Contact: teruko → andreasb
Reassign to jbetak.
Assignee: nhotta → jbetak
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9.1 → mozilla0.9
I've been unsuccessfully trying to reproduce this with old (from Sept-Oct/2000) and new builds. Although this is really frustrating, it just occurred to me that we might be able to diminish future risk by speeding up the drop-down menu build-up. I'm examining some pre-initialization / caching possibilities. Alternatively, I'll try to see whether some changes have been made to the oncreate and onload XUL event handlers. The onload pref panel handler shouldn't be interrupted by the "create" event as waterson indicated.
ccing Joe Hewitt. Joe, could you please have a glance at this? The underlying cause for this problem seems to be in the widget state manager. Our situation is likely to be similar to the scenario in bug 62101. Our pref pane contains a long drop-down box, which gets filled from an RDF source and acts almost like a timer similar to what you described in 62101. What we have been seeing is that when switching between pref panes, sometimes our list doesn't get fully loaded and the wsm then caches the first entry from the list. When OK is subsequently clicked to store a change on some other pref pane, this unwanted pref state propagates to the user profile and continues to live there unnoticed until it wreaks havoc. From September 8 to February 14, this first drop-down list entry was a rather arcane character set, causing Mozilla to fail prominently in Bugzilla account logins. I was able to track down bugs caused by this between October 6 and January 18. I'm tempted to believe that your change to savePageData on January 16 might have helped to alleviate our situation and would like to solicit your opinion.
I finally succeeded in making both PR3 and 0922BASE debug builds. It seems that with the PR3 build, I can obtain an empty intl.charset.default preference by following Bob Jung's steps, which might or might not bring us closer to final resolution. In the debug builds I'm hitting an assertion "Failed to locate XBL binding" in nsXBLService.cpp on line 641, when abruptly leaving and returning to the language preference panel. It only goes away when I restart the browser. Only then will the charset drop-down box in the language pref panel build properly again.
marking dependency on bug 62101. I was able to consistently reproduce this problem with both the Netscape_20000922_BASE and Netscape_PR3_RELEASE builds. It seems that initially when pref panes are rapidly flipped, intl.charset.default is filled with a blank as described in 62101. When flipping pref panes again, upon reentry to the languages pane, the build-up of the charset drop-down listbox is disrupted and intl.charset.default is filled with the first list entry, which happens to be "Armenian (ARMSCII-8)". Debug builds throw two assertions, depending on the release:

1) nsXULDocument.cpp, line 5577 - "no script global object" mScriptGlobalObject != nsnull
2) "failed to locate XBL binding" in nsXBLService.cpp on line 641

When patching nsWidgetStateManager.js with Joe Hewitt's change from Jan 18, the whole process stops, although the assertions are still being thrown.

nsWidgetStateManager.prototype = {
  get contentArea() { return window.frames[ this.contentID ]; },
  savePageData: function ( aPageTag ) {
+   if (!(aPageTag in this.dataManager.pageData))
+     return;
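The effect of that guard can be illustrated outside the browser. This is a minimal sketch with a hypothetical stand-in for the widget state manager; `StateManager`, `registerPage` and the `saved` map are assumptions made for illustration, and only the `in` check mirrors the quoted patch.

```javascript
// Hypothetical stand-in for the widget state manager; only the guard matters.
function StateManager() {
  this.pageData = {}; // page tag -> widget values, filled once a pane has loaded
  this.saved = {};    // what would be written to the profile when OK is clicked
}

StateManager.prototype.registerPage = function (pageTag, values) {
  this.pageData[pageTag] = values;
};

StateManager.prototype.savePageData = function (pageTag) {
  // Same shape as the guard in the quoted patch: skip pages that never registered.
  if (!(pageTag in this.pageData)) {
    return false;
  }
  Object.assign(this.saved, this.pageData[pageTag]);
  return true;
};

const wsm = new StateManager();
// The pane was left before it registered any data: nothing is saved.
console.log(wsm.savePageData("pref-languages"), wsm.saved);

wsm.registerPage("pref-languages", { "intl.charset.default": "ISO-8859-1" });
console.log(wsm.savePageData("pref-languages"), wsm.saved);
```

The guard only covers pages with no recorded data at all; a page that registered a wrong value (the first list entry) would still be saved, which fits the observation that the problem could later be reproduced on a tip build.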
Depends on: 62101
Well, as much as I wanted to close this bug, I can't. Since I was not able to reproduce this with 100% consistency, I decided to do some more testing over the weekend. I'm getting better at it and just reproduced this bug on a tip build. Instead of "Armenian (ARMSCII)" we are now picking up "Arabic (IBM-864)", which is now the first item in the list of default character codings. I'm attaching a console output from my Netscape_20000922_BASE build. It's spiced up with some debug comments. It appears that the culprit is asynchronous load of external JS files from individual pref panes. When "dancing around" the preference window tree, the asynchronous load can fail due to the timing (racing?) conditions and the initialization JS code for the pref pane doesn't get executed. After a failed JS load, we start hitting the asserts. Given such circumstances the character coding list will never be set to the current value of the preference. It will be initialized with the first entry in the RDF source instead. This value is subsequently cached by the wsm, which appeared to be fully functional at all times.
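The failure mode described in this comment can be modelled in a few lines. This is a simplified sketch, not the preference window code: `makePane`, `initPane` and `cacheSelection` are hypothetical names, the charset list is shortened, and the failed asynchronous script load is represented simply by never calling the init function.

```javascript
// Hypothetical model of a pref pane whose dropdown is filled from a data source.
// Until the init code runs, the widget shows the first entry of the list.
function makePane(items) {
  return { items: items, selected: items[0], initRan: false };
}

// Stands in for the pane's init script, which selects the stored preference.
function initPane(pane, storedPref) {
  pane.selected = storedPref;
  pane.initRan = true;
}

// Stands in for the state manager caching whatever the widget currently shows.
function cacheSelection(pane) {
  return pane.selected;
}

const charsets = ["armscii-8", "ISO-8859-1", "UTF-8"];

const okPane = makePane(charsets);
initPane(okPane, "ISO-8859-1");
console.log(cacheSelection(okPane)); // init ran: the stored preference is kept

const racedPane = makePane(charsets);
// The init script never ran (failed load), so the first entry gets cached.
console.log(cacheSelection(racedPane));
```

In this model the state manager behaves correctly in both cases; the wrong value comes entirely from the missing initialization, which matches the observation that the wsm "appeared to be fully functional at all times".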
I think we are getting closer to a resolution. Splitting up the pref-languages.js file and placing all of the initialization code into XUL makes this dialog much more robust. Frank suggested using a flag on the XUL file to indicate that the JS file has been fully loaded. We could pursue a route similar to pref-fonts.js, where the pref saving is handled by a callback function. There we could verify that the dialog was initialized properly before agreeing to save any new preference values. A side note: when I increase the size of pref-search.js and place a delay in its initialization function, this bug can also be reproduced with the "Internet Search" preference panel. It seems to be an underlying issue with the preference code, which cannot handle multiple rapid requests without compromising data integrity. I'll file a bug for that against the "Preferences" component.
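The flag-plus-callback idea could look roughly like this. It is a sketch of the approach described in the comment, not the patch that was checked in: `savePane`, the `initialized` flag and the `selectedCharset` field are hypothetical names.

```javascript
// Hypothetical save callback that refuses to persist values from a pane
// whose initialization code never finished.
function savePane(pane, prefs) {
  if (!pane.initialized) {
    return false; // leave the stored preference untouched
  }
  prefs["intl.charset.default"] = pane.selectedCharset;
  return true;
}

const prefs = { "intl.charset.default": "ISO-8859-1" };

// Pane left before its init code ran: it still shows the first list entry.
const halfLoaded = { initialized: false, selectedCharset: "armscii-8" };
console.log(savePane(halfLoaded, prefs), prefs["intl.charset.default"]);

// Fully initialized pane where the user deliberately picked a new value.
const loaded = { initialized: true, selectedCharset: "UTF-8" };
console.log(savePane(loaded, prefs), prefs["intl.charset.default"]);
```

The design choice here is to fail closed: when the pane cannot prove it was initialized, the existing preference wins over whatever the widget happens to display.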
Why would the bug be limited to Preferences? Seems like any XUL that is loading external JS files might be affected.
Bob, you are most likely right. I'd have to investigate some more, but my first reaction would be that preferences are just more susceptible due to transient state information caching. I might be wrong, but other parts of the UI might not cache and store transient state information in quite the same way.
moving to 0.9.1, hope to have this resolved within a week
Target Milestone: mozilla0.9 → mozilla0.9.1
QA Contact: andreasb → ylong
attaching a preliminary patch
The new patch offers a generic solution for improperly generalized pref panes. Ben insists on overseeing all nsPrefWindow.js changes, so I'm not sure whether we can get this in before he gets back from his vacation. I'm still refining the previous patch to pref-languages.js. I could move it to bug 41245, since it's a big rework of the current code and I'd favor a quicker resolution for this bug. Would anyone care to review?
No longer blocks: 41891
Whiteboard: need suprereview 2001-05-08 11:53
we really need to get this into 0.9.1 and 6.5 Beta. Please don't make me beg for r and sr. Ben?
sr=alecf on this band-aid fix. We really need to get to the bottom of this race condition though; is there a bug around on that somewhere?
thanks for your help alecf - I'll try to come up with a reasonable test scenario and file a bug against Preferences (XP Toolkit?). Please note that the race conditions were extremely difficult to reproduce and the inflow of complaints stopped around January 16. Since I was able to reproduce this bug in a recent build, I wouldn't feel comfortable without some damage-control for Beta1.
Whiteboard: need suprereview 2001-05-08 11:53
marking dependency on follow-up bug 80868, marking fixed - thanks everyone! ylong: this one might be tough to verify, please talk to bobj or myself should you run into trouble...
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 24 years ago
Depends on: 80868
Resolution: --- → FIXED
Actually it's really hard to reproduce even on an old build - I have to try very hard before I finally see this problem, but at least I haven't seen it once on the 05-15 trunk yet. I'll mark it as verified; please re-open it if someone can still reproduce it.
Status: RESOLVED → VERIFIED
*** Bug 83411 has been marked as a duplicate of this bug. ***
*** Bug 83413 has been marked as a duplicate of this bug. ***
*** Bug 89107 has been marked as a duplicate of this bug. ***
*** Bug 91201 has been marked as a duplicate of this bug. ***
*** Bug 102179 has been marked as a duplicate of this bug. ***
*** Bug 103882 has been marked as a duplicate of this bug. ***
*** Bug 100915 has been marked as a duplicate of this bug. ***