Closed
Bug 23709
Opened 25 years ago
Closed 25 years ago
[talkback]Crash in nsSocketTransport::OnFound on home.netscape.com cnn.com
Categories
(Core :: Networking, defect, P1)
VERIFIED
FIXED
M14
People
(Reporter: alan-lists, Assigned: gordon)
References
Details
(Keywords: crash, top100, Whiteboard: [PDT+] 2/16/2000)
Attachments
(1 file: text/html, deleted)
I'm not sure when this started, but for the last 2 weeks Mozilla has crashed in
NECKO.DLL whenever I start it. After testing and testing I found that I
do not crash if I just run
mozilla -mail
I can play around in mail and have no problems.
Then I set Mozilla to load a blank page and tried again. Mozilla loaded just
fine. Then I started loading HTML pages that I had on my local D drive. I was
able to load all sorts of local pages, jumping around with no problem.
But as soon as I tell Mozilla to go to ANY site off my machine, I get the crash.
I can then rerun Mozilla and look at local pages with no problem, while still
crashing on outside sites.
I am sure the console window won't help, but I'm adding it just in case.
nNCL: registering deferred (0)
WEBSHELL+ = 1
WEBSHELL+ = 2
nsXULKeyListenerImpl::Init()
WEBSHELL+ = 3
WEBSHELL+ = 4
Setting content window
browser.startup.page = 0
startpage = about:blank
Document about:blank loaded successfully
Document: Done (2.31 secs)
got a request
WEBSHELL+ = 5
FindShortcut: in='www.mozilla.org' out='null'
=============================================
MOZILLA caused an invalid page fault in
module NECKO.DLL at 014f:60507e5b.
Registers:
EAX=8b2307d0 CS=014f EIP=60507e5b EFLGS=00010202
EBX=00000000 SS=0157 ESP=020afd8c EBP=020afdcc
ECX=0139caf0 DS=0157 ESI=01357df8 FS=0f87
EDX=8165a9cc ES=0157 EDI=0139caf0 GS=0000
Bytes at CS:EIP:
ff 30 56 e8 c9 31 00 00 83 c4 0c eb 05 bb 05 40
Stack dump:
00000004 00000001 006e2420 00000000 6050a57a 01357db8 00000000 014e8340 006e2428
00000001 006e2420 00000000 6050a98f 020afdd4 00008e42 020afe1e
Comment 1•25 years ago
By the way, what build were you running? M12, a nightly, roll your own?
Just to remove this as a possible cause: what happens if you completely blow
away the old .\Seamonkey (for installer) or .\bin (for nightly) directory and
then reinstall to a clean directory? (This may be what you do anyway, but
I did see another bug report which, although not for necko.dll, did involve
a component loading problem that was caused by old cruft in the
.\components directory.)
Reporter
Comment 2•25 years ago
Oops, I should have mentioned the builds. I have tried the Win32
builds of 01/09/00, 01/10/00, and 01/11/00 (this one was the Win32 installer
build). There may have been some problems before those builds, but I don't
remember right now.
When I try a new build I always delete the mozilla .dat file in the windows
directory, and also the users50 directory, and the entire \bin\ directory with
the actual program.
Reporter
Comment 3•25 years ago
Not sure if this is related or not, but on the 1/13/00 nightly build I realized I
am also getting this just before I crash:
JavaScript Error: uncaught exception: [Exception... "Illegal value" code: "-214
7024809" nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)" location: "chrome://si
debar/content/sidebarOverlay.js Line: 201"]
I am going crazy not being able to test Mozilla. Is there anything anyone would
suggest I check? Version numbers, etc.?
Updated•25 years ago
Severity: normal → critical
Updated•25 years ago
Target Milestone: M14
Reporter
Comment 4•25 years ago
It sounds like this bug and bug 24008 are dupes or at least very close. I have
a Windows 95 P120 with 64MB RAM.
So far this bug is a WORKSFORME. Tever, can you try and reproduce this reliably?
Thx.
Reporter
Comment 7•25 years ago
On bug 24008, which I think is a dup,
lchiang@netscape.com commented "If you need to see this, contact
suresh@netscape.com"
Is there anything else I can do at my end to help find it? I don't have VC6.0,
but if pointing me to a debug build would help, I can try it also.
Comment 8•25 years ago
I believe this is the same as or related to the problem I have been having with
the nightly builds for the last or two (not sure when it first started
happening).
My setup:
Build: 2000011708 (and earlier builds, but not M12)
OS: Windows 95 - 4.00.950 B
Platform: Pentium w/ 64 MB RAM
I deleted the /bin directory from the previous build as well as the
MOZREGISTRY.DAT file, and the Moz profile before installing. I invoke
MOZILLA.EXE without any command options.
It does not crash on www.mozilla.org but does on other URLs like www.slashdot.org.
It crashes at different stages in the rendering of the page, and sometimes does
not crash at all. It always crashes with an invalid page fault, but in any of
three possible modules, with the following debug info:
MOZILLA caused an invalid page fault in
module NECKO.DLL at 014f:604f7f31.
Registers:
EAX=0000000c CS=014f EIP=604f7f31 EFLGS=00010202
EBX=00000000 SS=0157 ESP=015dfd8c EBP=015dfdcc
ECX=01733330 DS=0157 ESI=01733408 FS=32a7
EDX=8163c674 ES=0157 EDI=01733330 GS=0000
Bytes at CS:EIP:
ff 30 56 e8 83 31 00 00 83 c4 0c eb 05 bb 05 40
Stack dump:
00000004 00000001 00e66d14 00000000 604fa622 017333c8 00000000 01730d50 00e66d1c
00000001 00e66d14 00000000 604faa16 015dfdd4 00008e42 015dfe1e
MOZILLA caused an invalid page fault in
module WS2_32.DLL at 014f:00661c27.
Registers:
EAX=00000000 CS=014f EIP=00661c27 EFLGS=00010246
EBX=00736314 SS=0157 ESP=016fff1c EBP=016fff38
ECX=014a6768 DS=0157 ESI=00736304 FS=2aaf
EDX=00000000 ES=0157 EDI=014a6750 GS=0000
Bytes at CS:EIP:
89 06 89 46 04 89 46 0c 89 45 f0 39 01 74 08 ff
Stack dump:
0066e3d8 0125c25c 014a6750 016fff6c 7800cc32 00000009 00000038 016fff5c 00661a4a
00736304 00000400 014a6750 0125c230 004115bc 0066e3d8 00000000
MOZILLA caused an invalid page fault in
module MSVCRT.DLL at 014f:780016b2.
Registers:
EAX=743d6574 CS=014f EIP=780016b2 EFLGS=00010297
EBX=00000000 SS=0157 ESP=015dfd74 EBP=015dfd7c
ECX=00000001 DS=0157 ESI=743d6570 FS=433f
EDX=00000000 ES=0157 EDI=0125da98 GS=0000
Bytes at CS:EIP:
8b 44 8e fc 89 44 8f fc 8d 04 8d 00 00 00 00 03
Stack dump:
0125da98 011f5640 015dfdcc 604f7f39 0125da98 743d6570 00000004 00000001 00de4cc4
00000000 604fa622 0125da58 00000000 014c4480 00de4ccc 00000001
The DOS shell usually says something like the following:
-->snipped<---
WEBSHELL+ = 5
FindShortcut: in='www.slashdot.org' out='null'
nsLayoutHistoryState::GetState, ERROR getting History state for the key
nsLayoutHistoryState::GetState, ERROR getting History state for the key
WEBSHELL+ = 6
More info, including stack trace, found in bug
http://bugzilla.mozilla.org/show_bug.cgi?id=24008. I am not marking these as
dupes and will leave up to QA contact or Eng to do so.
Comment 10•25 years ago
I am getting the same crash as specified by Darrel on 1/17. I am consistently
getting the crash. I have deleted everything that has to do with moz and
reinstalled and it still crashes. I have tried builds from today (20000121) and
from yesterday.
Reporter
Comment 11•25 years ago
We had made the assumption that this bug was a dupe of bug 24008, but we never
marked it as such (being cautious). Well, Warren fixed bug 24008 on 1/21/2000.
It should have landed in the late M14 build on 1/21/2000. I pulled the
1/22/2000 build and still have the crash in Necko.dll I mentioned above. Does
anyone have any ideas about this?
Comment 12•25 years ago
adding myself to cc list, excuse the spam. Expecting additional comments from
another user who has been seeing the same thing since earlier this month.
Comment 13•25 years ago
I am using MS DUN 1.3 (128 version) on a Compaq (Dec '97) with a Cyrix G180 CPU
(180 MHz) and 48 MB RAM (4 shared), running Win95 OSR 2B (FAT32). The build from
9:40 1/22/2000 crashes if I attempt to access any site not set at startup (i.e.
start with mozilla.org and anything else, such as mozillazine.org or
slashdot.org, crashes; start blank and mozillazine.org crashes). This started
happening around the 8th of Jan. I have been deleting moz*.dat and the bin and
users50 directories. Converting the profile or using mozprofile makes no
difference. I usually use RamBoost v1.6 and have even tried turning it off, but
there was no change.
Comment 14•25 years ago
I still can not reproduce this on any of my win 95 machines. Will get ahold of
Suresh for assistance.
Reporter
Comment 15•25 years ago
Tever, I hope to get my hands on a debug build for an M13 build with Full Circle
to see if I can get more info. So far I have not been able to get a debug build,
so I may have to wait for M13 Full Circle.
Comment 16•25 years ago
I still see the crash on loading certain web pages like home.netscape.com and
www.cnn.com. Some pages load fine.
Build used: 2000-01-25-14-M13 on Win 95, 133 Mhz, 64 MB ram.
Stack Trace: (Incident Id: 4440221)
Call Stack: (Signature = nsSocketTransport::OnFound 8ba136db)
nsSocketTransport::OnFound
[d:\builds\seamonkey\mozilla\netwerk\base\src\nsSocketTransport.cpp, line 1470]
nsDNSLookup::CallOnFound
[d:\builds\seamonkey\mozilla\netwerk\dns\src\nsDnsService.cpp, line 297]
nsDNSEventProc
[d:\builds\seamonkey\mozilla\netwerk\dns\src\nsDnsService.cpp, line 394]
KERNEL32.DLL + 0x3663 (0xbff73663)
KERNEL32.DLL + 0x228e0 (0xbff928e0)
0x01208e3c
Assignee
Comment 17•25 years ago
Weird, from that stack crawl it looks like it's dying inside of a PR_Log call in
nsSocketTransport::OnFound(), but I don't see how that's possible. Do you have
logging turned on?
Where are you located? Can I come see your machine?
Comment 18•25 years ago
*** Bug 24008 has been marked as a duplicate of this bug. ***
Reporter
Comment 19•25 years ago
I just got the final M13 build with Full Circle in it. I ran it, crashed, and sent
the info off. I don't know who checks the Full Circle info or how it is passed
on to people. In the description area I put the bug number along with info on who
it was assigned to, the QA contact on this, etc.
I hope you all can get this and that it will help.
Comment 20•25 years ago
asj's stack trace looks like suresh@netscape.com's
Incident ID 4498637
nsSocketTransport::OnFound
[d:\builds\seamonkey\mozilla\netwerk\base\src\nsSocketTransport.cpp, line 1470]
nsDNSLookup::CallOnFound
[d:\builds\seamonkey\mozilla\netwerk\dns\src\nsDnsService.cpp, line 297]
nsDNSEventProc [d:\builds\seamonkey\mozilla\netwerk\dns\src\nsDnsService.cpp,
line 394]
KERNEL32.DLL + 0x3663 (0xbff73663)
KERNEL32.DLL + 0x228e0 (0xbff928e0)
0x01e48e3c
looks like he had maybe gotten low on virtual memory.
that might help to explain the randomness of this failure
that folks are seeing.
Operating System: Windows 95 4.0 build 67109814
Service Pack: -
Physical Memory: 64.0 MB
Memory Status:
                   Available    Total
Physical Memory:   1.8 MB       64.0 MB
Page File:         236.1 MB     278.2 MB
Virtual Memory:    1996.1 MB    2044.0 MB
Screen Information: 1600 x 1200, 16 bits per pixel
Keywords: beta1
Summary: Crash in NECKO.DLL → top100 Crash in NECKO.DLL home.netscape.com cnn.com
Comment 21•25 years ago
Updated•25 years ago
Summary: top100 Crash in NECKO.DLL home.netscape.com cnn.com → [top100][talkback]Crash in nsSocketTransport::OnFound on home.netscape.com cnn.com
Comment 22•25 years ago
can someone try visiting an IP address using a build on win95? Any IP addr will
do, here's sun's 192.18.97.195. I think we're having buffer alloc problems in
the dns service.
Comment 23•25 years ago
Re-assigning to gordon. Here's the PR_LOG statement that is failing (I'm assuming
we're failing here; maybe a bad assumption, but I have nothing else to go on).
PR_LOG(gSocketLog, PR_LOG_DEBUG,
       ("nsSocketTransport::OnFound(...) [%s:%d %x]."
        " DNS lookup succeeded => %s (%d.%d.%d.%d)\n",
        mHostName, mPort, this,
        aHostEnt->hostEnt.h_name,
        mNetAddress.inet.ip & 0xff,
        (mNetAddress.inet.ip >> 8) & 0xff,
        (mNetAddress.inet.ip >> 16) & 0xff,
        (mNetAddress.inet.ip >> 24) & 0xff));
The only real variable here that could choke a printf would be if
aHostEnt->hostEnt.h_name wasn't null terminated. I checked the IP address
specific code and it seems to be doing the right thing (always null
terminating).
However, aHostEnt->bufLen is *always* some bogus number. I've fixed the IP addr
case (haven't checked in; gordon, can you?):
Index: src/nsDnsService.cpp
===================================================================
RCS file: /cvsroot/mozilla/netwerk/dns/src/nsDnsService.cpp,v
retrieving revision 1.27
diff -r1.27 nsDnsService.cpp
679c679
< PRIntn bufLen = PR_NETDB_BUF_SIZE;
---
> PRIntn bufLen = hostentry->bufLen = PR_NETDB_BUF_SIZE;
But the non-IP addr case is still bogus.
Assignee: gagan → gordon
Comment 24•25 years ago
Jud: I don't think we're crashing in the log statement -- the line number must be
wrong. PR_LOG expands into an if (<enabled>) { <then print> } kind of thing, and
they don't have logging turned on, so this code isn't getting executed. The only
other thing in this method is the memcpy -- that's got to be the problem.
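A minimal sketch of that "if (<enabled>) { <then print> }" shape, for anyone following along. SKETCH_LOG, gLogEnabled, and DescribeHost are hypothetical names and this is not NSPR's actual definition of PR_LOG; it only illustrates that with logging off, neither the print nor its arguments are evaluated.

```cpp
#include <cstdio>

// Hypothetical stand-ins; NSPR's real PR_LOG is defined differently.
static bool gLogEnabled = false;
static int gArgEvaluations = 0;

// Counts how often the log arguments are actually evaluated.
static const char* DescribeHost(const char* name) {
    ++gArgEvaluations;
    return name;
}

// Expands to "if (enabled) { print }", so the parenthesized argument list
// is only evaluated when logging is switched on.
#define SKETCH_LOG(args)        \
    do {                        \
        if (gLogEnabled) {      \
            std::printf args;   \
        }                       \
    } while (0)

// Runs one log statement and reports how many times its argument ran.
static int EvaluationsForOneLogCall(bool enabled) {
    gLogEnabled = enabled;
    gArgEvaluations = 0;
    SKETCH_LOG(("lookup succeeded => %s\n", DescribeHost("example.test")));
    return gArgEvaluations;
}
```

With the flag off, a bad h_name passed to such a macro would never be touched, which is consistent with the crash being elsewhere in the method.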
Comment 25•25 years ago
Agreed. We must be trying to copy more than we should be. I'm not able to see
anything obvious in the dns code. I'd like to know if it's repro w/ IP addresses
(different code path in dns) before anyone kills themselves trying to verify the
dns host ent copy code and the joyous bufalloc stuff.
Reporter
Comment 26•25 years ago
As requested, I tested M13 Full Circle with IP addresses. It did not crash
loading Sun's (192.18.97.195) or Netscape's (205.188.247.66) site by IP
address.
It did crash loading CNN's page (207.25.71.246).
I sent off the M13 Full Circle test data referencing this bug.
On a side note, I found that just before I crash my console screen fills with the
following error over and over:
nsLayoutHistoryState::GetState, ERROR getting History state for the key
nsLayoutHistoryState::GetState, ERROR getting History state for the key
Hope it helps
Assignee
Comment 27•25 years ago
Okay, Jud and I have a handle on this now. We just need to reorder the fields in
nsHostEnt so that the bufLen and bufPtr fields aren't overwritten when
WSAAsyncGetHostByName fills in the data. Jud and I are exchanging diffs. I can
check in the fix when the tree opens.
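A rough sketch of why field order matters here. Everything in it is an assumption for illustration: FakeHostEnt, BadEntry, GoodEntry, and FakeAsyncFill are hypothetical, the real nsHostEnt fields and sizes may differ, and the fill routine only stands in for an API that writes a hostent plus its data into one contiguous region.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical layouts; the real nsHostEnt may look different.
const std::size_t kBufSize = 64;

struct FakeHostEnt {
    int addrType;
    int length;
};

// Bookkeeping sits inside the region that gets filled.
struct BadEntry {
    FakeHostEnt hostEnt;
    int bufLen;
    char* bufPtr;
    char buffer[kBufSize];
};

// Bookkeeping sits in front of the region that gets filled.
struct GoodEntry {
    int bufLen;
    char* bufPtr;
    FakeHostEnt hostEnt;
    char buffer[kBufSize];
};

// Stand-in for an API that writes up to len bytes starting at dest.
static void FakeAsyncFill(unsigned char* dest, std::size_t len) {
    std::memset(dest, 0xAB, len);
}

// Fills everything from hostEnt to the end of the entry, then checks
// whether bufLen still holds the value stored before the fill.
template <typename Entry>
bool BookkeepingSurvivesFill() {
    Entry e;
    e.bufLen = static_cast<int>(kBufSize);
    e.bufPtr = e.buffer;
    unsigned char* base = reinterpret_cast<unsigned char*>(&e);
    const std::size_t start = offsetof(Entry, hostEnt);
    FakeAsyncFill(base + start, sizeof(Entry) - start);
    return e.bufLen == static_cast<int>(kBufSize);
}
```

In this sketch only the layout with the bookkeeping fields first keeps bufLen intact after the fill.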
Status: NEW → ASSIGNED
Assignee
Comment 30•25 years ago
Fix was checked in last night, so we should be able to verify it in today's build
when it is ready. Marking fixed.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Reporter
Comment 31•25 years ago
I hate to say this.... I really hate to say this, but I'm not sure this is fixed.
I had several different errors on the first set of tests, so I restarted my computer
and ran today's latest 1/28/00 win32 build with nothing else running.
This time it crashed before the page loaded and I got TWO Dr. Watson errors.
First
MOZILLA caused an invalid page fault in
module NECKO.DLL at 014f:605587e7.
Second
MOZILLA caused an invalid page fault in
module GKPARSER.DLL at 014f:602b69a9.
Then I restarted again and ran another set of tests.
This time the sites all almost completely load when the crash occurs.
Some Top 100 site (I forget the name)
MOZILLA caused an invalid page fault in
module WS2_32.DLL at 014f:008a1c27.
CNN
MOZILLA caused an invalid page fault in
module NECKO.DLL at 014f:605587e7.
www.icq.com
MOZILLA caused an invalid page fault in
module NECKO.DLL at 014f:605587e7.
To the other people who saw this bug before: are you still getting the crash?
Anything I can do to help?
Assignee
Comment 32•25 years ago
This is really ugly. The bug Jud and I fixed yesterday was a serious problem
related to this section of code, but it appears to have had no impact on this
bug.
It seems an x86 register is getting trashed within a series of nine instructions.
We aren't calling any functions, and memory is not trashed (I can manually
retrace the steps the computer "should" have taken, and end up with the correct
result). The problem is intermittent; most DNS lookups complete just fine. All
DNS lookups go through this code. I have forced several engineers to look at
this crash, and no one has an explanation yet. I believe evil spirits may be the
cause.
It happens on Suresh's Compaq 5133 Deskpro. Alan, what kind of machine are you
on? Tom, have you been able to reproduce this yet? I'm reopening this bug and
looking for an x86 expert.
Status: RESOLVED → REOPENED
Comment 33•25 years ago
It's not an MP machine, is it?
Can you see who's setting the register value, and map that back to the
high-level code that's being executed? Maybe it's a compiler bug.
Assignee
Comment 34•25 years ago
It's not an MP machine. It doesn't appear to be a compiler bug because the code
executes correctly the vast majority of the time. It may be possible that
VStudio is displaying memory in a bogus way, but it seems strange that the bogus
memory would look so correct. I'll put some additional sanity checks in for
debug purposes to try and verify what VStudio is telling me. I'll also take a
look at the Talkback reports to see if they give me more accurate information.
Comment 35•25 years ago
Sorry for spam, interested in this bug.
Reporter
Comment 36•25 years ago
I am running a P120, Win95, 64 megs of RAM with MS DUN 1.3, etc.
As far as brands, it is a generic type thing. It started out as a Midwest
Micro machine. I later replaced a bad motherboard with an Achme motherboard #5156
(by Micro-Star International, MSI), PCI TX4 w/ 512K cache and Award BIOS.
I am running both 1 onboard serial port and an extra board for a serial
port on a higher IRQ (for my PalmPilot). Not that it matters, but just in case:
it is a SIIG Fast EIDE Controller. I have most functions on the board disabled.
Are you now thinking it is a hardware issue vs. software? I'm curious how many
others were using MS DUN (Dial-Up Networking) 1.3 for Win 95. I saw another
person with this crash mention DUN 1.3.
Comment 38•25 years ago
re CCing myself, as asj was kind enough to remove me accidentally...
Comment 39•25 years ago
I am getting a similar crash on my machine: Win95 AMD 333Mhz 64Mb
Mail works well most of the time but I get crashes regularly loading web-pages.
I can load some simple pages but all other pages generate a crash in Necko.dll
at some stage during the page load. I can load local pages without problems.
Also I tried loading Sun's page; it loads perfectly when using the IP address
but crashes during load when using www.sun.com.
The figures that go with the crash are below:
MOZILLA caused an invalid page fault in
module NECKO.DLL at 0137:60547f17.
Registers:
EAX=65726464 CS=0137 EIP=60547f17 EFLGS=00010202
EBX=00000000 SS=013f ESP=00cbfd8c EBP=00cbfdcc
ECX=010bf100 DS=013f ESI=010be068 FS=4a87
EDX=816588f4 ES=013f EDI=010bf100 GS=0000
Bytes at CS:EIP:
ff 30 56 e8 07 32 00 00 83 c4 0c eb 05 bb 05 40
Stack dump:
00000004 00000001 007433e8 00000000 6054a61a 010be028 00000000 01085f10 007433f0
00000001 007433e8 00000000 6054a9b4 00cbfdd4 00008e42 00cbfe1e
Comment 40•25 years ago
scenario 1/29/2000 power on machine, dial ISP, start mozilla 13:07 01/28
mozilla.org loads ok, www.weather.com ok click on current temperatures link
crash 014f:605587de in NECKO.DLL
I know that Mozilla and Communicator are completely different, but they still
use the same OS and dialer, so... In 4.7 or earlier I would get crashes in
RNR20.DLL, which AFAIK has to do with DNS addresses. Can the installer provide
an RNR20.DLL file? Can someone provide a link to get this file (currently mine
is 4.10.15110)? Different versions on different machines could be part of this
problem. Any suggestions? Thanks, Tom
Comment 41•25 years ago
Sunday AM scenario: power on, dial ISP, start Mozilla, mozilla.org OK.
Use IP address for weather (206.151.166.121): page loads OK, > 170 seconds. Click
on current temperatures: no crash, but temps not shown after a long time.
Go to ISP mail maint screen, delete spam OK, go to Bugzilla, post this.
Build Id: 20000012812. Makes me think timing (activity on the Internet) is part of
the problem???
Thanks Tom
Comment 42•25 years ago
*** Bug 25791 has been marked as a duplicate of this bug. ***
Comment 43•25 years ago
Just an update on my circumstances under which Mozilla crashes (see previous
post), I now find it crashes in module NECKO.DLL at 014f:605587e7 (different
location) and occasionally in modules WS2_32.DLL at 014f:00661c27 and MSVCRT.DLL
at 014f:780016b2 as it did previously.
Thinking that this might be related to Windows 95 only, (Is anyone with Windows
98 experiencing this?) I tried several different things:
I have MS Winsock 2 installed. I can't find an elegant way of going back to the
original Winsock, short of reinstalling Windows. So I can't figure out if this
is the cause of the problem. Is anyone who is running MS Winsock 1 (i.e. the one
that is installed initially under Windows 95) experiencing this problem?
I've installed three patches to Winsock 2: vipup11, vipup20, and vtcpup20.
Rebooted and it still crashes.
I tried toggling DNS caching on and off, see MS knowledge base article Q174614
(Don't flame me if this is an IE and not a Winsock issue). It still crashes, but it
seems to take longer to crash when DNS caching is off (perhaps coincidentally).
Initially I was crashing using an ethernet connection that uses a client manager
with the PPPoE protocol. I also tried using a PPP dial-up connection but it
still crashes although again it seemed to take longer (ie. I might have to try
two or three different sites before it crashes).
This info might be totally useless but I thought it might help.
Comment 44•25 years ago
Sorry, that was as of build 2000012808. All user and platform information is
the same as previously posted.
Comment 45•25 years ago
Gordon, No I still can not reproduce this on my machines. I tried again using 2
fairly minimal win 95 machines in the lab. Also tried stressing the virtual
memory - no crash like described. Checked today's 01/31 build and an older one.
Comment 47•25 years ago
Build ID: 2000013111 crashes Necko.dll at 014f:6055880c which is a slight change
from 605587de...
Comment 48•25 years ago
calling all folks who see this bug to add info about connection speed
and other info about their network configuration.
Comment 49•25 years ago
Notice that these crashes are all from Win95 with the same build number
(OSR2??). Perhaps there is something peculiar to that build (or DUN 1.3???). I
do not see this on Win98.
Comment 50•25 years ago
Ok make room for me too. I'm on Win95 OSR2, DUN 1.3 update, Winsock 2 update,
various other updates, no IE whatsoever. Pentium 200MHz MMX, 64MB RAM, 56K
modem that connects all the time at 44K. I get crashes in those exact same
files. I can browse around mozilla.org to my heart's content, but leaving it
and going somewhere else kills Mozilla 99.9% of the time. And not always at the
same point in loading the page. On only a couple of occasions has the page loaded
completely, but that is quite rare, and it would die on the next link clicked. I had
some success stopping the page before it completely loaded to avoid the crash. Also,
because I'm still on Win95 I've been religious about updates that seemed
important so there might be another update involved that is causing problems
besides the DUN 1.3 (and WS2?) updates that it seems most of us having this
problem have applied. I will continue to send Talkback reports as it happens
but it's pretty much the same ol' thing over and over and makes Mozilla
virtually useless on my Win95 system at home, so I'd love to try anything the
developers would like me to in order to test this. Have done it enough to feel
confident there are no other files that it's crashing in besides msvcrt.dll,
rdf.dll, ws2_32.dll, and necko.dll. Let me know how I can help.
Comment 51•25 years ago
Well, I seem to be a counterpoint: I am running OSR2 (4.00.950B) and I do
*not* crash. In fact, the current builds are *very* stable (for me).
I recently blew up my hard drive, and this spare disk is a pretty stock
install of win95 on a IBM Thinkpad from Apr 97; (It actually hadn't been
booted since early '98 (was sitting on a shelf)).
I have not upgraded DUN (rnaapp.exe 4.00.1111) or WINSOCK (wsock32.dll
4.00.1111). IE4.0+ has never been installed on this computer (with all
its unknown collection of DLLs).
I am using 28K dialup, PCMCIA modem, with DHCP, without WINS. [I'll mention
for completeness that I run Netware as well, but that can't be relevant].
I'd be happy to provide any other info you need. John.
Assignee
Comment 52•25 years ago
Thanks everyone for all the help on this. The configuration information is
proving to be very interesting. I still haven't worked out the connection
between the susceptible configurations and register EAX getting trashed. We have
a machine in house that can reproduce the problem, and I'm poking it to see
precisely what conditions are necessary for the crash. I'll post more later this
afternoon. Thanks again.
Reporter
Comment 53•25 years ago
I realized I had not put my connection speed in my previous information about my
system. I am on a 56K modem with strange lines, so the connection speed goes
anywhere from 28.8 to 33.6. (I hope they will fix the lines soon.)
Comment 54•25 years ago
It really seems to be a Win95 problem. My Mozilla crashes every 3rd webpage or
so.
I'm using Win95 (4.00.950b) with updated msdun13, vtcpup20, vipup20.
So it may be somehow connected to these Updates.
Good luck
Comment 55•25 years ago
Win95 4.00.95B (FAT32), 128byte version of DUN 1.3 with associated updates...
usually connected at 31200 (33600 modem), 44MB RAM available to OS
Comment 56•25 years ago
I have M13 (zipfile and Full Circle) and Win95B (with all the required updates:
vtcupd/etc., winsock 2.2), a DSL connection, and I can't get the bloody thing to
load one page :( It crashes a lot in *necko.dll*. 48MB of RAM. I've tried this
also with some M14 nightlies.
Assignee
Comment 57•25 years ago
*** Bug 25102 has been marked as a duplicate of this bug. ***
Assignee
Comment 58•25 years ago
Rick Potts and I have verified that WSAAsyncGetHostByName occasionally posts a
notification indicating the results are complete, BEFORE it has filled out the
hostent. nsSocketTransport::OnFound() tries to dereference garbage, but by the
time we look at the data structures in the debugger, they have all been nicely
fixed up. Also, inserting printf()'s into OnFound() alters the timing, making
the bug "go away", or much more rare.
We are continuing to investigate the boundaries of the problem, and develop
possible solutions. We hope to have more information tomorrow.
Assignee
Comment 59•25 years ago
Last night Rick investigated a bit further and refined the current hypothesis.
I was able to confirm it this afternoon. The root of the problem is the version
of winsock on the troubled platforms always returns 1 for calls to
WSAAsyncGetHostByName, when it should be returning a unique ID that can be used
to identify which lookup has completed. Thus when we have multiple lookups
outstanding, we have no way of knowing which lookup completes. The nsDNSService
therefore picks the first lookup in the list, which may or may not be the
correct one, and may or may not be complete yet. Of course by the time the
debugger kicks in, all outstanding lookups have completed, so the data
structures all look fine.
We need to identify which versions of winsock have this "feature".
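To make the failure mode concrete, here is a small simulation of a handle-keyed table. PendingLookup, FindByHandle, and HostBlamedForCompletion are hypothetical names, not the actual nsDNSService structures; the sketch only assumes that completions are matched to lookups by searching for the first entry with the reported handle.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical bookkeeping for one outstanding lookup.
struct PendingLookup {
    int handle;            // value returned when the lookup was started
    std::string hostName;  // name that was requested
};

// Returns the index of the first pending lookup with this handle, or -1.
static int FindByHandle(const std::vector<PendingLookup>& pending, int handle) {
    for (std::size_t i = 0; i < pending.size(); ++i) {
        if (pending[i].handle == handle) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Starts three lookups, then reports which host the table search picks
// when the lookup for finishedHost completes.
static std::string HostBlamedForCompletion(bool uniqueHandles,
                                           const std::string& finishedHost) {
    const char* hosts[] = {"a.example", "b.example", "c.example"};
    std::vector<PendingLookup> pending;
    int nextHandle = 1;
    int finishedHandle = -1;
    for (const char* host : hosts) {
        // A broken stack hands back 1 for every request.
        const int handle = uniqueHandles ? nextHandle++ : 1;
        pending.push_back({handle, host});
        if (finishedHost == host) {
            finishedHandle = handle;
        }
    }
    const int index = FindByHandle(pending, finishedHandle);
    return index < 0 ? std::string() : pending[index].hostName;
}
```

With unique handles the completion for c.example maps back to c.example; with every handle equal to 1 it maps to a.example, the first entry in the list, whether or not that lookup has finished.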
Priority: P3 → P1
Comment 60•25 years ago
qfecheck.exe reports: UPD970624R1 Windows Socket API Update:
Winsock.dll 4.10.0.1511
Wsock32.dll 4.10.0.1511
UPD971126B1 TCP Driver Update
VTCP.386 4.10.0.1657
Build 2000020214
last Necko.dll crash was 014f:60558807 EAX=7373654d
Reporter
Comment 61•25 years ago
I pulled version numbers (right click) for every winsock or TCP/IP driver I
could think of.
c:\windows\winsock.dll 4.10.1656
c:\windows\system\wsock.vxd 4.10.1656
c:\windows\system\wsock2.vxd 4.10.1656
c:\windows\system\wsock32.dll 4.10.1656
c:\windows\system\wsock32n.dll 5.2.0.2
c:\windows\system\vtcp.386 4.10.1657
c:\windows\system\vnbt.386 4.10.1658
c:\windows\system\vip.386 4.10.1657
Comment 63•25 years ago
Winsock version 4.10.1656 (windows95B 4.00.1111)
Comment 64•25 years ago
I don't know if this is any use to anybody, but I've had problems before with
winsock 2 (not related to mozilla or netscape). The way to remove winsock 2 is
to: restart in DOS mode, cd /windows/ws2bakup, then run ws2bakup.bat and reboot.
Updated•25 years ago
Summary: [top100][talkback]Crash in nsSocketTransport::OnFound on home.netscape.com cnn.com → [talkback]Crash in nsSocketTransport::OnFound on home.netscape.com cnn.com
Comment 65•25 years ago
Yes... It appears that both versions of Winsock2 for Win95 (1511 and 1656) have
a broken WSAAsyncGetHostByName(...).
For both of these versions, it appears that the HANDLE that is returned by
WSAAsyncGetHostByName(...) is *always* 1. Of course this makes managing
multiple outstanding requests impossible :-(
The easy fix is to *remove* winsock2 :-)
I have not been able to reproduce this problem on Win98, WinNT 4.0 or Win95
running old winsocks (ie. not Winsock 2.0)
I also looked at the code for Communicator 4... The code is quite different
because it maintains a local DNS cache. However, there is a secondary validity
check for (hostent_h_name != NULL) that appears to minimize the problem :-)
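A secondary validity check of that kind might look roughly like this. SketchHostEnt is a hypothetical cut-down record, not the Winsock hostent or the Communicator 4 code, and the checks beyond h_name are my own additions for illustration.

```cpp
// Hypothetical cut-down host entry; the real hostent has more fields.
struct SketchHostEnt {
    const char* h_name;              // official host name
    int h_length;                    // length of each address in bytes
    const char* const* h_addr_list;  // null-terminated list of addresses
};

// Returns true only if the record looks like it has been filled in.
// Passing this check does not prove the record belongs to the lookup
// we think it does; it only filters out obviously empty results.
static bool LooksFilledIn(const SketchHostEnt* ent) {
    if (ent == nullptr) return false;
    if (ent->h_name == nullptr || ent->h_name[0] == '\0') return false;
    if (ent->h_length <= 0) return false;
    if (ent->h_addr_list == nullptr || ent->h_addr_list[0] == nullptr) return false;
    return true;
}

// Builds a sample record with the given name and one 4-byte address,
// then runs the check. Only here to exercise the sketch.
static bool CheckSample(const char* name) {
    static const char addr[4] = {127, 0, 0, 1};
    const char* list[] = {addr, nullptr};
    SketchHostEnt ent = {name, 4, list};
    return LooksFilledIn(&ent);
}
```

A check like this can narrow the window in which garbage is dereferenced, which fits the observation that it "appears to minimize the problem" rather than remove it.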
Comment 66•25 years ago
Okay, my version numbers are below. Note however that I am one of those getting
an error at 0137:... not 014f:.... (others are in Bug 25102 which has been
marked as a dupe of this bug). I have marked the files whose version numbers are
different to the similar listing filed previously.
c:\windows\winsock.dll 4.10.1998 (different!)
c:\windows\system\wsock.vxd 4.10.1656
c:\windows\system\wsock2.vxd 4.10.1656
c:\windows\system\wsock32.dll 4.10.1656
c:\windows\system\wsock32n.dll 5.1.0.2 (different)
c:\windows\system\vtcp.386 4.10.1657
c:\windows\system\vnbt.386 4.10.1658
c:\windows\system\vip.386 4.10.1658 (different)
Comment 67•25 years ago
*** Bug 25431 has been marked as a duplicate of this bug. ***
Comment 68•25 years ago
Potts and I talked about not relying on the HANDLE as the index into the lookup
entry table. Instead, we could do a strcmp on the actual host returned against
the hosts in the lookup table we cache. If we find a match, we're covered. This
"solves" the winsock2 problem, maybe not the crash???
Comment 69•25 years ago
Could all of this nonsense possibly be due to the fact that the windows dns code
isn't thread safe? See bug 27496.
Assignee
Comment 70•25 years ago
No. Winsock2 is broken on Win95. The thread safety issue is a separate problem.
Assignee
Comment 71•25 years ago
Jud, string compares would only work if we are coalescing multiple requests for
dns lookups for the same hostname. Otherwise we have the same problem as
winsock2 on win95 where identical HANDLEs are returned for different lookups.
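The string-compare idea and its limit can be sketched as follows. FindByHostName and the plain vector of pending names are hypothetical, not the actual lookup table; the sketch also assumes the returned name is byte-for-byte what was requested, which need not hold if the resolver reports a canonical name or different letter case.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Returns the index of the first pending lookup whose requested host name
// equals the name in the returned record, or -1 if none matches.
static int FindByHostName(const std::vector<std::string>& pendingHosts,
                          const char* returnedName) {
    if (returnedName == nullptr) {
        return -1;
    }
    for (std::size_t i = 0; i < pendingHosts.size(); ++i) {
        if (std::strcmp(pendingHosts[i].c_str(), returnedName) == 0) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

When two uncoalesced lookups for the same name are pending, this always returns the first one, which is the same ambiguity as identical HANDLEs.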
The two solutions that I've discussed with Rick are either using a range of event
message IDs to identify which lookup has completed, or simply testing
WSAAsyncGetHostByName to see if it returns unique HANDLEs and reverting to
synchronous PR_GetHostByName if it doesn't. I think gracefully degrading to
synchronous on win95 systems with winsock2 installed is probably the best
approach. We should probably have the dns thread make the call to
PR_GetHostByName so that the socket transport can continue working.
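Both options can be sketched in a few lines. All names here (LookupMode, ChooseLookupMode, MessageForSlot, SlotForMessage) and the message base value are assumptions for illustration; this is not the fix that was checked in, and the sketch makes no real Winsock or NSPR calls.

```cpp
#include <set>
#include <vector>

// Hypothetical names throughout.
enum class LookupMode { Async, SyncFallback };

// Given the handles returned for several lookups that were outstanding at
// the same time, decide whether async completions can be told apart.
static LookupMode ChooseLookupMode(const std::vector<int>& outstandingHandles) {
    std::set<int> seen;
    for (int handle : outstandingHandles) {
        // insert().second is false when the handle was already present.
        if (!seen.insert(handle).second) {
            return LookupMode::SyncFallback;
        }
    }
    return LookupMode::Async;
}

// Alternative: give each lookup slot its own message ID so the message
// itself identifies the lookup. The base value is an assumption.
const unsigned kFirstLookupMsg = 0x0400 + 0x100;  // above WM_USER (0x0400)
const unsigned kLookupSlots = 16;

static unsigned MessageForSlot(unsigned slot) {
    return kFirstLookupMsg + slot;
}

// Returns the slot for a message ID, or -1 if it is outside the range.
static int SlotForMessage(unsigned msg) {
    if (msg < kFirstLookupMsg || msg >= kFirstLookupMsg + kLookupSlots) {
        return -1;
    }
    return static_cast<int>(msg - kFirstLookupMsg);
}
```

The fallback test is the simpler of the two: a stack that returns 1 for every request is detected by the first repeated handle, and everything else stays on the async path.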
Assignee
Comment 72•25 years ago
Fix checked in last night.
Status: REOPENED → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: --- → FIXED
Comment 73•25 years ago
Suresh verifies this to be working on his Win95 system using build 2000022108.
Marking verified.
Status: RESOLVED → VERIFIED