Closed
Bug 275124
Opened 20 years ago
Closed 13 years ago
Chatzilla/BeOS can't communicate after a while (UI remains responsive)
Categories
(Core :: XPCOM, defect)
RESOLVED
WONTFIX
People
(Reporter: bugzillamozilla, Unassigned)
Attachments
(1 obsolete file)
After an uncertain amount of time, cz dies. Typed text is only echoed locally;
nothing is actually sent. Similarly, nothing new from the channel is displayed.
Log file of such a session:
http://beos.prognathous.mail-central.com/irc_terminal_log3.txt
Screenshot:
http://beos.prognathous.mail-central.com/cz_bug.png
In the above screenshot, "test 2:56" was typed in cz but wasn't received in the
channel (I verified this with another IRC client).
A few configuration notes:
Chatzilla v0.9.66e (Mozilla 20041216)
/pref debugMode t
browser.dom.window.dump.enabled=true
javascript.options.showInConsole=true
After adding javascript.options.strict=true, the following two warnings were
registered in the JS console during the next session (see irc_terminal_log4.txt
for the terminal output of this session):
Warning: function my_reclaimname does not always return a value
Source File: chrome://chatzilla/content/handlers.js
Line: 1730, Column: 14
Source Code:
return;
Warning: function my_reclaimname does not always return a value
Source File: chrome://chatzilla/content/handlers.js
Line: 1737, Column: 15
Source Code:
return true;
CZ window of this broken session:
#bezilla
[INFO] Channel view for “#bezilla” opened.
[MODE] User mode for Prog_0_9_66e20041216 is now +i
[JOIN] YOU have joined #bezilla
PrognathousMonitor test 3:30
Prog_0_9_66e20041216 test 3:31
PrognathousMonitor test 3:35
PrognathousMonitor test 3:44
PrognathousMonitor test 3:53
PrognathousMonitor test 4:01
PrognathousMonitor test 4:04
Prog_0_9_66e20041216 test 4:11 <-- NOT RECEIVED IN CHANNEL
Additional tests:
/eval client.eventPump.queue.length - client.eventPump.queuePointer
[EVAL-IN] client.eventPump.queue.length - client.eventPump.queuePointer
[EVAL-OUT] 4
/eval dumpObject(client.eventPump.queue[0])
[EVAL-IN] dumpObject(client.eventPump.queue[0])
[EVAL-OUT] set = server
type = rawdata
destObject = [object Object]
destMethod = onRawData
hooks =
data = PING :irc2.mozilla.org
queuedAt = Sat Dec 18 2004 04:07:58 GMT+0000 (GMT)
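For readers unfamiliar with the two /evals above: the first reports how many queued events have not been processed yet, and the second dumps the first entry of the queue array. Below is a minimal sketch of that bookkeeping. The EventPump class is hypothetical and is not ChatZilla's actual implementation; only the names queue, queuePointer and stepEvents come from this report, and the meaning of queuePointer (index of the next event to process) is an assumption.

```javascript
// Hypothetical, simplified event pump. Only the names queue, queuePointer
// and stepEvents are taken from the report; the behaviour is an assumption.
class EventPump {
  constructor() {
    this.queue = [];        // events in arrival order
    this.queuePointer = 0;  // index of the next event to process (assumed)
    this.handled = [];      // data of the events processed so far
  }

  addEvent(ev) {
    this.queue.push(ev);
  }

  // Process everything that is pending. If whatever drives this method
  // stops calling it, events simply pile up behind queuePointer.
  stepEvents() {
    while (this.queuePointer < this.queue.length) {
      this.handled.push(this.queue[this.queuePointer].data);
      this.queuePointer++;
    }
  }

  // Same expression as the first /eval in the report.
  pending() {
    return this.queue.length - this.queuePointer;
  }
}

const pump = new EventPump();
pump.addEvent({ type: "rawdata", data: "PING :irc2.mozilla.org" });
pump.addEvent({ type: "rawdata", data: "placeholder event 2" });
pump.addEvent({ type: "rawdata", data: "placeholder event 3" });
pump.addEvent({ type: "rawdata", data: "placeholder event 4" });

const pendingBefore = pump.pending(); // nothing has been stepped yet
pump.stepEvents();
const pendingAfter = pump.pending();

console.log("pending before step:", pendingBefore);
console.log("pending after step:", pendingAfter);
```

In this model, a pump that is never stepped leaves the server PING unanswered, which would be consistent with the "Ping timeout" reported in comment 12.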
Thanks for looking into this bug,
Prog.
Comment 1•20 years ago
Is this the ChatZilla extension for Firefox?
Reporter
Comment 2•20 years ago
This is a long-standing bug that affects both the Chatzilla extension for
Firefox and the one built into Seamonkey. It happens with the latest version,
0.9.66e, as well as much older ones.
Silver suspects that the problem may be related to Necko.
Prog.
Comment 3•20 years ago
"After an uncertain amount of time, cz dies." - this is unclear statement form
the problem POV.
Does it mean - "After an uncertain amount of IDLE time" ?
If so, it is a common problem for some servers, like freenode.org, for example.
Such servers require either activity from the user or an implementation of the
"ping/pong" protocol in clients.
For example, on BeOS, Baxter also dies in the same way on some servers, while
Vision stays alive, as the latter forces a ping-pong exchange with the server.
Reporter
Comment 4•20 years ago
(In reply to comment #3)
> Does it mean - "After an uncertain amount of IDLE time" ?
Not necessarily. Sometimes it happens after less than a minute, sometimes it
doesn't happen after an hour.
> If so, it is a common problem for some servers, like freenode.org, for example.
Then why doesn't Chatzilla/Windows suffer from this problem?
Prog.
Comment 5•20 years ago
My initial stab at it being Necko was based on what I'd heard about the problem
prior to any logs. It now looks more like a Mozilla timer issue. Here are some
more debug things to try...
When you start ChatZilla on BeOS, do the following two /evals:
/eval client.STEP_TIMEOUT = 10000
/eval mainStep = function() { client.displayHere("mainStep: BEGIN " + new
Date()); client.eventPump.stepEvents(); setTimeout("mainStep()",
client.STEP_TIMEOUT); client.displayHere("mainStep: END " + new Date()); }
The first will slow down CZ so it only processes events once every 10 seconds,
and the second will display a message to *client* both before and after each
processing, including the time for reference.
What would be interesting to know is a) do the pairs of messages still keep
appearing after it stops communicating, and b) if not, was the last one a BEGIN
or an END message?
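Why question (b) matters: mainStep re-arms itself with setTimeout on every run, so a single lost timeout ends the chain for good, and the last message seen would still be an END. Below is a minimal sketch of that failure mode. It uses a hypothetical hand-rolled scheduler (FakeTimers with a dropNext flag) instead of the real timer service, purely so that a lost timeout can be simulated; nothing here is taken from ChatZilla or Mozilla code.

```javascript
// Hypothetical stand-in for the host's timer service, so that a "lost"
// timeout can be simulated deterministically.
class FakeTimers {
  constructor() {
    this.pending = [];     // callbacks waiting to fire
    this.dropNext = false; // when true, the next setTimeout is silently lost
  }

  setTimeout(fn) {
    if (this.dropNext) {
      this.dropNext = false;
      return;              // simulate a timer that never fires
    }
    this.pending.push(fn);
  }

  // Fire the oldest pending callback; returns false when none is left.
  tick() {
    const fn = this.pending.shift();
    if (!fn) return false;
    fn();
    return true;
  }
}

const timers = new FakeTimers();
const log = [];
let runs = 0;

// Same shape as the instrumented mainStep above: log, re-arm, log.
function mainStep() {
  runs++;
  log.push("BEGIN " + runs);
  if (runs === 3) timers.dropNext = true; // lose the timeout armed by run 3
  timers.setTimeout(mainStep);
  log.push("END " + runs);
}

mainStep();
while (timers.tick()) {
  // keep firing until no timer is pending any more
}

console.log(log.join(", "));
console.log("runs:", runs, "pending timers:", timers.pending.length);
```

In the sketch every BEGIN has its END, yet no further step ever runs once one timeout is lost. A log ending in BEGIN would instead point at an exception or hang inside the step itself.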
Reporter
Comment 6•20 years ago
Comment 7•20 years ago
From those logs, it looks distinctly like a timer issue in Mozilla's code.
client_log.txt is the key - notice that every |mainStep| BEGIN has a matching
END, so no unexpected JS exceptions occurred, and more importantly, it /did/ run
the setTimeout line that calls itself each time.
It appears that the setTimeout set up just prior to "mainStep: END Sun Dec 19
2004 02:45:36 GMT+0000 (GMT)" simply didn't fire.
I guess the next level of debugging would be NSPR logging of the timer code...
unfortunately, it looks like you need to build with PR_LOGGING defined before it
actually logs anything, though a normal debug build appears to have it defined.
Anyway, use the env var NSPR_LOG_MODULES=nsTimerImpl:5 to log timer stuff to the
console, but beware - there is a /lot/ of spew. :)
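The pairing argument above can be checked mechanically. Below is a minimal sketch, assuming the log consists of lines containing "mainStep: BEGIN" or "mainStep: END" as produced by the /eval in comment 5. The checkPairs helper and the sample lines are hypothetical; the sample timestamps are placeholders, not values copied from client_log.txt.

```javascript
// Count matched BEGIN/END pairs and remember which kind of message came last.
function checkPairs(lines) {
  let open = 0;    // BEGINs not yet matched by an END
  let pairs = 0;   // completed BEGIN/END pairs
  let last = null; // kind of the last mainStep message seen
  for (const line of lines) {
    if (line.includes("mainStep: BEGIN")) {
      open++;
      last = "BEGIN";
    } else if (line.includes("mainStep: END")) {
      // An END without a preceding BEGIN is not counted as a pair.
      if (open > 0) {
        open--;
        pairs++;
      }
      last = "END";
    }
  }
  return { pairs, unmatchedBegins: open, last };
}

// Placeholder log lines in the format the /eval would produce.
const sample = [
  "mainStep: BEGIN <time 1>",
  "mainStep: END <time 1>",
  "some unrelated line",
  "mainStep: BEGIN <time 2>",
  "mainStep: END <time 2>",
];

const result = checkPairs(sample);
console.log(result);
```

A result with no unmatched BEGINs and a final END corresponds to the situation described above: the step completed, including its setTimeout call, yet the next step never started.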
Comment 8•19 years ago
I am punting this over to Core: XPCOM because I am pretty sure this is not a CZ
bug (see previous comment). However, I don't actually know whose bug it is... it
could be a SpiderMonkey setTimeout bug (unlikely IMHO, since this is
platform-specific), but could be a core timer bug in Mozilla (probably specific
to BeOS).
Can anyone other than Prognathous reproduce this on BeOS?
Assignee: rginda → dougt
Component: ChatZilla → XPCOM
Product: Other Applications → Core
QA Contact: samuel → xpcom
Version: unspecified → Trunk
Comment 10•19 years ago
Start suite with the -chat commandline parameter? That way you don't get the
main window (just the chatzilla one), which should help.
Comment 11•19 years ago
Saw from the comments that there was a Firefox extension. I will see what I can
do, this looks like a good bug in the summer heat :)
No longer depends on: 299058
Comment 12•19 years ago
Ok, confirmed (BeOS R5.03 BONE). Using the evals, it stuck after a few minutes.
Also got a disconnect on my other IRC client: *** TQH_test
(chatzilla@moz-149A0DF3...) has quit IRC (Ping timeout)
I will turn on timer-logging and see what I can find.
Reporter
Updated•19 years ago
QA Contact: xpcom → prognathous
Comment 13•19 years ago
I remember in one place in Mozilla long ago, microseconds and milliseconds were
confused in setting and getting time intervals. Maybe this needs an additional
consistency check where we use the Be API for time settings.
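To illustrate how such a unit mix-up could look like a timer that never fires: BeOS kernel calls such as snooze() and system_time() work in microseconds, while setTimeout delays are given in milliseconds. Below is a minimal sketch of the arithmetic only. The helper names are hypothetical, and nothing here is taken from the actual NSPR or Mozilla timer code.

```javascript
const US_PER_MS = 1000; // microseconds per millisecond

// Correct conversion of a millisecond delay for a microsecond-based API.
function msToUs(ms) {
  return ms * US_PER_MS;
}

// Mix-up 1: a millisecond value is handed over unconverted and read as
// microseconds, so the effective wait is 1000 times too short.
function effectiveMsIfUnconverted(ms) {
  return ms / US_PER_MS;
}

// Mix-up 2: a value is converted twice, so the effective wait is
// 1000 times too long.
function effectiveMsIfConvertedTwice(ms) {
  return msToUs(msToUs(ms)) / US_PER_MS;
}

const stepTimeoutMs = 10000; // the client.STEP_TIMEOUT value from comment 5
const MS_PER_HOUR = 3600 * 1000;

console.log("unconverted, effective ms:", effectiveMsIfUnconverted(stepTimeoutMs));
console.log(
  "converted twice, effective hours:",
  effectiveMsIfConvertedTwice(stepTimeoutMs) / MS_PER_HOUR
);
```

With the 10-second step from comment 5, a double conversion would stretch the delay to almost three hours, which in a short test session would be hard to tell apart from a dead timer. This only illustrates the suspicion; the thread never confirms that a unit mix-up is the cause.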
Comment 14•19 years ago
Well, BeOS NSPR has needed an overhaul for a long time, I got sidetracked from
debugging and started on that instead (I even rewrote atomic ops in asm for
x86). The most confusing file is beos.c which now only consists of one function
instead of lots of unused functions.
Comment 15•19 years ago
Got some assertions while I left Chatzilla running which might be interesting:
-2147265008[80035610]: ###!!! ASSERTION: forget-word-frame: '(void*)aFrame ==
mWordFrames->PeekFront()', file /mozdev/mozilla/layout/generic/nsLineLayout.cpp,
line 3027
###!!! ASSERTION: forget-word-frame: '(void*)aFrame ==
mWordFrames->PeekFront()', file /mozdev/mozilla/layout/generic/nsLineLayout.cpp,
line 3027
-2147265008[80035610]: ###!!! Break: at file
/mozdev/mozilla/layout/generic/nsLineLayout.cpp, line 3027
Break: at file /mozdev/mozilla/layout/generic/nsLineLayout.cpp, line 3027
-2147265008[80035610]: ###!!! ASSERTION: forget-word-frame: '(void*)aFrame ==
mWordFrames->PeekFront()', file /mozdev/mozilla/layout/generic/nsLineLayout.cpp,
line 3027
###!!! ASSERTION: forget-word-frame: '(void*)aFrame ==
mWordFrames->PeekFront()', file /mozdev/mozilla/layout/generic/nsLineLayout.cpp,
line 3027
-2147265008[80035610]: ###!!! Break: at file
/mozdev/mozilla/layout/generic/nsLineLayout.cpp, line 3027
Break: at file /mozdev/mozilla/layout/generic/nsLineLayout.cpp, line 3027
JavaScript error: chrome://chatzilla/content/chatzilla.xul, line 1:
contentAreaDNDObserver is not defined
Comment 16•19 years ago
The 'forget-word-frame' assertions occur on Windows as well, and are all layout
issues, which I don't believe have any relation to the bug here.
Updated•18 years ago
QA Contact: prognathous → xpcom
Comment 17•18 years ago
ChatZilla isn't the only thing that dies here over time.
After a while, GIF animation stops until you reload the page.
Second suspicious thing - content stops updating until you resize the window. When BeZilla has just started, all is OK, but at some unpredictable moment, newly loaded pages don't show new content, or mails in the mailnews main window don't show content if you switch between mails in the list.
Looks like some timer is stopped or gone backward:)
Comment 18•17 years ago
Sergei in comment #17
> Looks like some timer is stopped or gone backward:)
fixed on trunk?
Comment 19•17 years ago
Is there any reason to believe it's fixed? Btw trunk has been completely broken for BeOS ever since they forced Cairo on us. I don't think it will ever get fixed...
Comment 21•17 years ago
This is definitely not fixed on trunk or branch. I'm beginning to suspect some kind of deep-rooted timer bug in BeOS-specific code. In addition to Chatzilla simply dying, Thunderbird ceases to automatically check for new mail when that setting is configured in the Account/server preferences. We also have ongoing issues with some sites (usually SSL) hanging in Firefox but not SeaMonkey.
If anyone has ideas where to look, I'd appreciate a push in the right direction.
Comment 22•17 years ago
tqh suggests this problem may be solved by changing BeOS to use pipes instead of TCP socket pairs. This is how Unix does it. Working now to test.
Comment 23•17 years ago
Doug and I suspect it's because of polling using socketpairs, which isn't really implemented in BeOS and I guess the default implementation might time out. Doug is testing with the pipes-code for UNIX.
See:
http://lxr.mozilla.org/mozilla1.8.0/source/nsprpub/pr/src/io/prpolevt.c#409
Updated•17 years ago
Status: NEW → ASSIGNED
Comment 24•17 years ago
initial testing is positive, at least for Chatzilla. Does not seem to make a difference in Thunderbird mail retrieval.
Comment 25•17 years ago
This implementation seems to take care of hangs in Chatzilla. Ran for nearly 12 hours without hanging here.
Attachment #304281 - Flags: review?(thesuckiestemail)
Comment 26•17 years ago
Comment on attachment 304281
Change BeOS to use Pipes instead of TCP Sockets, as Unix does
r=thesuckiestemail@yahoo.se
Attachment #304281 - Flags: review?(thesuckiestemail) → review+
Updated•17 years ago
Attachment #304281 - Flags: review+ → review?(wtc)
Comment 27•17 years ago
Comment on attachment 304281
Change BeOS to use Pipes instead of TCP Sockets, as Unix does
>+#if !defined(XP_UNIX) || !defined(XP_BEOS)
This should be && instead of ||. Did you test this patch? With this patch,
USE_TCP_SOCKETPAIR would still be defined on BeOS.
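The difference is De Morgan's law: "neither Unix nor BeOS" is (!U && !B), whereas (!U || !B) is false only when both macros are defined at once. Below is a small truth table, written in JavaScript rather than as preprocessor code, with booleans standing in for defined(XP_UNIX) and defined(XP_BEOS). It assumes, as the review comment implies, that the guarded block is what defines USE_TCP_SOCKETPAIR; the function names are hypothetical.

```javascript
// Booleans stand in for defined(XP_UNIX) and defined(XP_BEOS).
// Each function answers: would the guarded block be compiled in?
const asPatched = (unix, beos) => !unix || !beos;  // condition in the patch
const asIntended = (unix, beos) => !unix && !beos; // condition the review asks for

const rows = [];
for (const unix of [false, true]) {
  for (const beos of [false, true]) {
    rows.push({
      XP_UNIX: unix,
      XP_BEOS: beos,
      patched: asPatched(unix, beos),
      intended: asIntended(unix, beos),
    });
  }
}
console.table(rows);

// A build that defines XP_BEOS but not XP_UNIX:
const beosOnly = {
  patched: asPatched(false, true),
  intended: asIntended(false, true),
};
console.log(beosOnly);
```

For a build that defines only XP_BEOS, the patched condition still evaluates to true, so the block is kept, while the intended condition evaluates to false and would let the pipe code be used instead.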
Attachment #304281 - Flags: review?(wtc) → review-
Comment 28•17 years ago
Actually, in BeOS R5 at least, pipes are quite flaky, if not buggy.
That's the reason, for example, why we use the pipefs replacement by mmu_man instead of the default pipe implementation when working on big projects.
At least I had numerous problems building Mozilla 4 years ago, until I started to use that replacement.
Comment 29•17 years ago
OK. Feeling stupid here - tested a patch which does nothing, apparently. But why, then, is the bug not occurring? More work needed (obviously).
Comment 30•17 years ago
Sergei, afaik it's the problem when the pipe gets full, which we dealt with in nsAppShell. Are there any other problems?
Comment 31•17 years ago
(In reply to comment #30)
> Sergei, afaik it's the problem when the pipe gets full, which we dealt with in
> nsAppShell. Are there any other problems?
>
In AppShell we use ports, not pipes. Ports are reliable; they are the basis of all BeOS functionality (they are used by BMessages internally).
IIRC pipes have several problems; one of them was the origin of big troubles in the Apache port, and there was another too. I tried to search the internet, but as the Be bug database is gone it was not very successful. Maybe a better idea is to ask on the mailing lists. I think in Haiku the problem is fixed, though.
Comment 32•17 years ago
Ah, then we probably should write a better one for BeOS.
Comment 33•17 years ago
Comment on attachment 304281
Change BeOS to use Pipes instead of TCP Sockets, as Unix does
After applying the change correctly, BeOS Firefox and chatzilla hang MORE frequently. The trouble may still be somewhere in BeOS NSPR, but not here.
Attachment #304281 - Attachment is obsolete: true
Comment 34•14 years ago
This is a mass change. Every comment has "assigned-to-new" in it.
I didn't look through the bugs, so I'm sorry if I change a bug which shouldn't be changed. But I guess these bugs are just bugs that were once assigned and people forgot to change the Status back when unassigning.
Status: ASSIGNED → NEW
Comment 35•14 years ago
(In reply to comment #33)
> The trouble may still be somewhere in BeOS NSPR, but not here.
Doug, have you nailed what that might be?
Comment 36•14 years ago
I am no longer able to support the BeOS/Haiku Mozilla port and so cannot answer this question. Sorry I cannot be of more help.
BeOS is gone.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX