Open Bug 554055 Opened 15 years ago Updated 5 years ago

network connection timeout ETIMEDOUT misreported as ECONNREFUSED

Categories

(MailNews Core :: Networking, defect)

x86
All
defect
Not set
major

Tracking

(Not tracked)

People

(Reporter: mozilla, Unassigned)

Details

(Whiteboard: dupeme)

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.1.6) Gecko/20091216 Fedora/3.5.6-1.fc11 Firefox/3.5.6
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.7) Gecko/20100120 Fedora/3.0.1-1.fc11 Lightning/1.0b1 Thunderbird/3.0.1

If a mail server is unreachable (router on the path offline), Thunderbird reports "connection refused" instead of "connection timed out". Confirmed by running tcpdump alongside: no RST packet whatsoever.

Reproducible: Always

Steps to Reproduce:
1. Configure iptables on the mail server to DROP packets from the test client to the IMAP port
2. Run tcpdump on the client
3. Click on a mail folder that lives on the test IMAP server

Actual Results:
An error box with "connection refused" shows up after a while. Tcpdump shows no RST packet.

Expected Results:
It should say "connection timed out" or something similar.
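For reference, the operating system itself keeps the two conditions apart: a failed connect() sets errno to ECONNREFUSED when the peer answers with RST, and to ETIMEDOUT when nothing comes back at all, as with a DROP rule. The following is a minimal sketch of a classification that preserves that distinction. The names `ConnectFailure`, `classifyConnectErrno` and `describe` are hypothetical and not taken from Thunderbird or NSPR.

```cpp
#include <cerrno>
#include <string>

// Hypothetical classification for illustration; not Thunderbird or NSPR code.
enum class ConnectFailure { Refused, TimedOut, Unreachable, Other };

// Map the errno value left by a failed connect() to a coarse failure class.
ConnectFailure classifyConnectErrno(int err) {
  switch (err) {
    case ECONNREFUSED:  // the peer actively rejected the connection (RST seen)
      return ConnectFailure::Refused;
    case ETIMEDOUT:  // no answer at all, e.g. packets silently dropped
      return ConnectFailure::TimedOut;
    case ENETUNREACH:
    case EHOSTUNREACH:
      return ConnectFailure::Unreachable;
    default:
      return ConnectFailure::Other;
  }
}

// Produce the text shown to the user for each failure class.
std::string describe(ConnectFailure f) {
  switch (f) {
    case ConnectFailure::Refused:
      return "connection refused";
    case ConnectFailure::TimedOut:
      return "connection timed out";
    case ConnectFailure::Unreachable:
      return "network or host unreachable";
    case ConnectFailure::Other:
      break;
  }
  return "connection failed";
}
```

With this mapping, the scenario in the steps above (no RST, eventual ETIMEDOUT) would be described as "connection timed out".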
Whiteboard: dupme
It just occurred to me: maybe Thunderbird is changing the messages to "similar" messages in order to be better able to translate them? But at least in this particular case, this is not needed, as a properly localized glibc is able to supply this kind of (locally generated) message in whatever language is needed. Moreover, even if translated system messages are not available, there is no point in making such a substitution if the user has actually chosen English.
yeah, i do believe this is a duplicate
Summary: ETIMEDOUT misreported as ECONNREFUSED → connection timeout ETIMEDOUT misreported as ECONNREFUSED
Whiteboard: dupme → dupeme
Still an issue in 7.0.1. If this is a duplicate, which bug number would it be a duplicate of?
Component: General → Untriaged
Component: Untriaged → Networking
Keywords: qawanted
Product: Thunderbird → MailNews Core
Summary: connection timeout ETIMEDOUT misreported as ECONNREFUSED → network connection timeout ETIMEDOUT misreported as ECONNREFUSED
*could* be this code: different errors are bundled into NS_ERROR_CONNECTION_REFUSED. (But there may be other places where the problem arises, too.)

line 153 of https://dxr.mozilla.org/comm-central/source/mozilla/netwerk/base/nsSocketTransport2.cpp

    case PR_CONNECT_REFUSED_ERROR:
    // We lump the following NSPR codes in with PR_CONNECT_REFUSED_ERROR. We
    // could get better diagnostics by adding distinct XPCOM error codes for
    // each of these, but there are a lot of places in Gecko that check
    // specifically for NS_ERROR_CONNECTION_REFUSED, all of which would need to
    // be checked.
    case PR_NETWORK_UNREACHABLE_ERROR:
    case PR_HOST_UNREACHABLE_ERROR:
    case PR_ADDRESS_NOT_AVAILABLE_ERROR:
    // Treat EACCES as a soft error since (at least on Linux) connect() returns
    // EACCES when an IPv6 connection is blocked by a firewall. See bug 270784.
    case PR_NO_ACCESS_RIGHTS_ERROR:
        rv = NS_ERROR_CONNECTION_REFUSED;
        break;

That is bad. We can rewrite the code where Gecko checks specifically for NS_ERROR_CONNECTION_REFUSED. We can define a function like the following:

    bool Is_NS_ERROR_CONNECTION_REFUSED_or_friends(nserror e)
    {
      switch (e) {
        default:
          return false;
          break;
        case PR_CONNECT_REFUSED_ERROR:
        // We lump the following NSPR codes in with PR_CONNECT_REFUSED_ERROR. We
        // could get better diagnostics by adding distinct XPCOM error codes for
        // each of these, but there are a lot of places in Gecko that check
        // specifically for NS_ERROR_CONNECTION_REFUSED, all of which would need to
        // be checked.
        case PR_NETWORK_UNREACHABLE_ERROR:
        case PR_HOST_UNREACHABLE_ERROR:
        case PR_ADDRESS_NOT_AVAILABLE_ERROR:
        // Treat EACCES as a soft error since (at least on Linux) connect() returns
        // EACCES when an IPv6 connection is blocked by a firewall. See bug 270784.
        case PR_NO_ACCESS_RIGHTS_ERROR:
          return true;
      }
    }

then modify

    error_code == NS_ERROR_CONNECTION_REFUSED
    => Is_NS_ERROR_CONNECTION_REFUSED_or_friends(error_code)

and

    error_code != NS_ERROR_CONNECTION_REFUSED
    => (! Is_NS_ERROR_CONNECTION_REFUSED_or_friends(error_code))

The above should fix the issue. There are ONLY 29 places where NS_ERROR_CONNECTION_REFUSED appears in the C-C source tree (actually they are all under the M-C portion), and only about half a dozen of them check NS_ERROR_CONNECTION_REFUSED with "==". So it should be doable.
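One point the proposal leaves implicit is that such a helper would be called on XPCOM-level result codes, so it would need distinct result codes to test rather than the NSPR `PR_*` codes. The following is a rough, self-contained sketch of that shape. It uses a stand-in enum instead of the real `nsresult`, and `NetResult`, its enumerators and `IsConnectionRefusedOrFriends` are all hypothetical names.

```cpp
// Stand-in result codes, hypothetical. Real Gecko code uses nsresult values;
// only the idea of a "connection refused" family is taken from the thread.
enum class NetResult {
  Ok,
  ConnectionRefused,
  NetworkUnreachable,   // assumed new distinct code
  HostUnreachable,      // assumed new distinct code
  AddressNotAvailable,  // assumed new distinct code
  NoAccessRights,       // assumed new distinct code
  Timeout
};

// True for every code that the current mapping folds into "connection
// refused". Call sites that compare against the single refused code could
// call this instead, so their behaviour stays the same while the precise
// code remains available for the user-facing message.
inline bool IsConnectionRefusedOrFriends(NetResult r) {
  switch (r) {
    case NetResult::ConnectionRefused:
    case NetResult::NetworkUnreachable:
    case NetResult::HostUnreachable:
    case NetResult::AddressNotAvailable:
    case NetResult::NoAccessRights:
      return true;
    default:
      return false;
  }
}
```

A call site would then change from `rv == NetResult::ConnectionRefused` to `IsConnectionRefusedOrFriends(rv)`, while message selection switches on the exact code.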
Severity: normal → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: qawanted
OS: Linux → All

(In reply to Wayne Mery (:wsmwk) from comment #2)

yeah, i do believe this is a duplicate

I don't remember why I thought this. Have you seen evidence of other reports? Does it have implications for other bug reports?

Flags: needinfo?(gds)

No, I haven't seen evidence of other similar reports. Given my comment 3, I presumably searched for some but didn't find any, so I asked. But it is hard to tell, as this was 8 years ago.
No idea whether it has implications for other bug reports, but I do notice that Firefox tends to misreport error conditions quite often, in other areas as well.

About comment 4: If more than one error code requires the same treatment within Firefox, but different reports to the user, maybe one solution would be a major/minor code structure, where decisions are based solely on the major code, but messages use the minor code as well.
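The major/minor idea could look roughly like the sketch below. Everything in it (`Major`, `Minor`, `NetError`, `shouldOfferRetry`, `userMessage`) is a hypothetical name chosen for illustration; none of it exists in Gecko.

```cpp
#include <string>

// Hypothetical two-level error code: program logic looks only at the major
// code, while user-facing messages may also consult the minor code.
enum class Major { Ok, ConnectFailed };
enum class Minor { None, Refused, TimedOut, HostUnreachable };

struct NetError {
  Major major;
  Minor minor;
};

// A decision based solely on the major code: every kind of connect failure
// is handled the same way.
bool shouldOfferRetry(const NetError& e) {
  return e.major == Major::ConnectFailed;
}

// The message uses the minor code, so a timeout is not reported as a refusal.
std::string userMessage(const NetError& e) {
  if (e.major == Major::Ok) return "ok";
  switch (e.minor) {
    case Minor::Refused:
      return "connection refused";
    case Minor::TimedOut:
      return "connection timed out";
    case Minor::HostUnreachable:
      return "host unreachable";
    case Minor::None:
      break;
  }
  return "connection failed";
}
```

Existing checks would keep comparing the major code only, which avoids auditing each call site for new error values.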

Google "site:bugzilla.mozilla.org ETIMEDOUT ECONNREFUSED" and there are a couple bugs shown that are similar but classed as WONTFIX.

Flags: needinfo?(gds)

(In reply to Alain Knaff from comment #6)

No, I haven't seen evidence of other similar reports. Given my comment 3, I presumably searched for some but didn't find any, so I asked. But it is hard to tell, as this was 8 years ago.
No idea whether it has implications for other bug reports, but I do notice that Firefox tends to misreport error conditions quite often, in other areas as well.

About comment 4: If more than one error code requires the same treatment within Firefox, but different reports to the user, maybe one solution would be a major/minor code structure, where decisions are based solely on the major code, but messages use the minor code as well.

I tend to notice that Firefox/Thunderbird fails to report error conditions at all, especially I/O errors, since the result of Read()/Write() is not checked at all in many places. It is as if the programmers who wrote the code lived in a virtual world where disks, SSDs, etc. never fail, the filesystem never gets corrupted, the user never mistypes a filename, and so on. Even in places where error conditions are checked, in many instances they are not reported explicitly to the user via a UI mechanism.

So the user is left with errors not being checked or reported, and only when the underlying errors have badly messed up the mail store (in TB's case, for example) is one forced to realize "Something went bad." But the user is left with no indication of when and what. This makes developers' lives miserable, too, since debugging and homing in on the cause of a bug is very difficult after the fact.

I was looking at https://bugzilla.mozilla.org/show_bug.cgi?id=777292 to check which previous patches touched the code that lumps the several error codes together into NS_ERROR_CONNECTION_REFUSED when I noticed
https://bugzilla.mozilla.org/attachment.cgi?id=646574&action=diff
(Attachment #646574: netwerk/: Don't treat number of bytes as an nsresult, for bug #777292).
I was appalled.
The few instances of ...->Read() I saw can all fail due to file system errors, etc. But no error checking is done in any of these instances. Sigh.
(The first one seems to be a dummy read for a placeholder, but the others seem to contain something valid, and a Read() error ought to be signaled back then and there as an error return, IMHO.)
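To illustrate the kind of checking meant here, below is a minimal sketch in which the status of each Read() is kept separate from the byte count and a failure or short read is returned to the caller immediately. `Status`, `StubStream` and `ReadExactly` are hypothetical stand-ins written for this example; they only imitate the general shape of a Read(buffer, count, &bytesRead) call and are not the Gecko stream interfaces.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

enum class Status { Ok, Failed };  // stand-in for nsresult, hypothetical

// Hypothetical in-memory stream that can be told to fail, for illustration.
class StubStream {
 public:
  StubStream(std::string data, bool failReads)
      : data_(std::move(data)), fail_(failReads) {}

  // Reports success or failure as the return value and the number of bytes
  // through the out parameter, so the two are never confused.
  Status Read(char* buf, uint32_t count, uint32_t* bytesRead) {
    *bytesRead = 0;
    if (fail_) return Status::Failed;
    uint32_t avail = static_cast<uint32_t>(data_.size() - pos_);
    uint32_t n = std::min(count, avail);
    std::memcpy(buf, data_.data() + pos_, n);
    pos_ += n;
    *bytesRead = n;
    return Status::Ok;
  }

 private:
  std::string data_;
  std::size_t pos_ = 0;
  bool fail_;
};

// Read exactly `count` bytes. Any Read() failure or premature end of data
// is returned to the caller instead of being silently ignored.
Status ReadExactly(StubStream& s, char* buf, uint32_t count) {
  uint32_t total = 0;
  while (total < count) {
    uint32_t n = 0;
    Status st = s.Read(buf + total, count - total, &n);
    if (st != Status::Ok) return st;    // propagate the I/O error
    if (n == 0) return Status::Failed;  // end of data before `count` bytes
    total += n;
  }
  return Status::Ok;
}
```

A caller would check the returned status and surface the failure then and there, rather than continuing with a partially filled buffer.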
