Closed
Bug 54081
Opened 24 years ago
Closed 24 years ago
Data corruption uploading binary data (such as GIF's)
Categories
(Core :: Networking, defect, P2)
Tracking
()
VERIFIED
FIXED
People
(Reporter: jesup, Assigned: darin.moz)
References
(Blocks 1 open bug)
Details
(Keywords: dataloss, Whiteboard: [rtm++][dogfood+][nsbeta3-])
Attachments
(18 files)
(deleted),
image/gif
|
Details | |
(deleted),
image/gif
|
Details | |
(deleted),
image/gif
|
Details | |
(deleted),
image/gif
|
Details | |
(deleted),
image/gif
|
Details | |
(deleted),
image/gif
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/html
|
Details | |
(deleted),
text/html
|
Details | |
(deleted),
image/png
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
application/octet-stream
|
Details | |
(deleted),
application/octet-stream
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
image/gif
|
Details | |
(deleted),
image/gif
|
Details |
FreeBSD 4.1 20000925xx
When uploading a 40K GIF to bugzilla, I found when viewing it the image stopped
after showing part of it. Closer inspection shows that the data uploaded
differs at regular intervals from the original.
My file differed at ~0x489f to ~0x5e7a, ~0x708b to ~0x7e7a, and ~0x8d0f to
~0x9e7a. The end of each section of corruption is very interesting (0x...e7a).
Probably closely related to bug 47936, which is nsbeta3+ dogfood+ and supposedly
fixed. I wonder if this is a regression due to that fix.
Reporter | ||
Comment 1•24 years ago
|
||
Reporter | ||
Comment 2•24 years ago
|
||
Reporter | ||
Comment 3•24 years ago
|
||
Dataloss
Nominating for nsbeta3, dogfood
qawanted for other platforms (I can't run win95 dailys for the last week).
Reporter | ||
Comment 4•24 years ago
|
||
Reporter | ||
Comment 5•24 years ago
|
||
upload with Win32 2000092708 was ok. Will retest with fresh pull/build inder
FreeBSD today
This may be a dup of bug 53341. See additional comments from me there
Reporter | ||
Comment 7•24 years ago
|
||
Reporter | ||
Comment 8•24 years ago
|
||
Reporter | ||
Comment 9•24 years ago
|
||
No joy. The errors are still there, but the exact position doesn't seem
identical from attempt to attempt. I'm going to re-check with win32. 53341 is
marked Linux, so maybe this is a general Unix (but not win32) problem? Sorry,
my Win32 box has become unstable and can't even download Mozilla - I'll try
again later. ANy other Win32 testers would be welcomed.
OS->Linux since the problem is seen there, and there's no catch-all "Unix" entry
(and Linux reports get more attention).
OS: FreeBSD → Linux
Comment 10•24 years ago
|
||
Uploading images is a very common use of a browser. Even if we stay with the PG
rated crowd, folks still push images of their family members etc. The
work-around is to use another browser... which makes this dogfood-plus.
I think it is now too late to make the beta3 train, so I'm with sadness marking
this beta3-, but putting in an rtm nomination, and bumping to P2 priority (this
is MAJOR functionality!!)
Comment 11•24 years ago
|
||
This does seem to be a dupe of Bug 52030, but this bug has gotten more and
better attention. Not marking dupe.
Comment 12•24 years ago
|
||
Spam: fixing broken whiteboard
Whiteboard: [dogfood+][nsbet3-] → [dogfood+][nsbeta3-]
Reporter | ||
Comment 13•24 years ago
|
||
Reporter | ||
Comment 14•24 years ago
|
||
The retest with win32 worked again, so I'm assuming it's a Linux/FreeBSD/Unix bug.
Reporter | ||
Comment 15•24 years ago
|
||
Adding eric pollman since he's assigned 53341 which is a dup of this (or vice
versa). 52030 is also a dup. I'm going to close both of those out as dups and
let this one be the flag-carrier (since it's now dogfood+).
Eric or Gagan - who should have this? Is it networking or forms?
Reporter | ||
Comment 16•24 years ago
|
||
*** Bug 53341 has been marked as a duplicate of this bug. ***
Reporter | ||
Comment 17•24 years ago
|
||
*** Bug 52030 has been marked as a duplicate of this bug. ***
Comment 18•24 years ago
|
||
Randell: thanks for these test cases and yes its going to be one of us (pollmann
for forms and me for networking)
Eric: You want to take the first crack at this?
Assignee: gagan → pollmann
Comment 19•24 years ago
|
||
Marking rtm+.
Whiteboard: [dogfood+][nsbeta3-] → [rtm+][dogfood+][nsbeta3-]
Comment 20•24 years ago
|
||
Marking [rtm need info] Waiting for patch to be attached, review + super-review.
Whiteboard: [rtm+][dogfood+][nsbeta3-] → [rtm need info][dogfood+][nsbeta3-]
Comment 21•24 years ago
|
||
*** Bug 54463 has been marked as a duplicate of this bug. ***
Updated•24 years ago
|
Blocks: input-helper-apps
Comment 22•24 years ago
|
||
When posting multipart form data a temporary file is being written and then that
file is read by netlib and sent.
I wrote a small C application that reads in the formpost multipart file and
writes out the data portion. This way I can view the actual gifs that were
written into the formpost file. I have uploaded several files one that was
almost 3 megs and the data in the formpost looked fine.
I can only conclude at this point that it is a netlib issue. I will attach my
little helper app.
Comment 23•24 years ago
|
||
Comment 24•24 years ago
|
||
Comment 26•24 years ago
|
||
I did all my initial testing on WinNT and just noticed the bug is for Linux. So
I just tested it on Linux with the little helper app and I was able to see the
gif that was inside the formpost data file just fine.
Comment 27•24 years ago
|
||
Comment 28•24 years ago
|
||
Please feel free to test using the above testcase - it will upload a file to a
web server I have set up to save the result to a file. The file can then be
displayed and/or deleted by clicking on a link in the resulting page. (Thanks
to Rod for the idea!) There are about 4 MBytes of space free on the server.
Comment 29•24 years ago
|
||
Ack, ignore the above test case - it doesn't work (Looking into why, probably
not related to this bug).
I will attach a second testcase momentarily that does an equivalent thing. I
was able to attach a large .gif image with no corruption using the new testcase
on my FreeBSD build. (Pulled yesterday from the tip) Can anyone confirm that
this corruption bug exists in recent build?
Comment 30•24 years ago
|
||
Comment 31•24 years ago
|
||
This bug still exists in Linux on build 2000100809.
I attempted to upload a screenshot for a bug of mine and it was corrupted. The
URL is http://bugzilla.mozilla.org/showattachment.cgi?attach_id=16477
Also adding self to cc list
David
Comment 32•24 years ago
|
||
Comment 33•24 years ago
|
||
Okay, as you can see in the last post, small files are okay ( < about 60K ), but
after that they become corrupted. This is Linux Build 2000101206.
Comment 34•24 years ago
|
||
We have no patch for this rather serious bug. Is anyone on the scent of this
defect? If so, please let me know and I will start trying to track it down. I
don't think I'll get very far starting from scratch.
Comment 35•24 years ago
|
||
I presume this is a regression could we try and find out when it first
happened?
Marking relnote: ~Binary images seem to be corrupted, please use ns6.01.~
Keywords: relnote,
relnoteRTM
Comment 36•24 years ago
|
||
I must not understand this right. In the release note of Netscape 6, we're going
to tell them to use 6.01 (which doesn't exist yet) to solve a *known* data loss
problem?
Am I really reading this right?
Assignee | ||
Comment 37•24 years ago
|
||
Confirming on Linux build 2000101509. The formpost-XX temporary file also
appears to to have been generated correctly, and when I upload it, I find
the file similarly corrupted.
Investigating Necko...
Comment 38•24 years ago
|
||
Well, since this one is fixed in 6.01, does anyone have any idea about the
patch?
Has anyone traced the dataflow to find where the corruption begins?
This is one of only two "corruption" bugs with votes at the moment.
Assignee | ||
Comment 39•24 years ago
|
||
I am working on tracing the dataflow. So far, I found that it is
not a problem with nsHTTPEncodeStream, b/c I can send data using
application/x-www-form-urlencoded (which bypasses nsHTTPEncodeStream)
that also gets corrupted. That means the problem must lie in the
actual data transfer, and that's where I'm currently investigating.
Assignee | ||
Comment 40•24 years ago
|
||
Ok, this looks like a transport problem. The corruption occurs after the first
80470007 [NS_BASE_STREAM_WOULD_BLOCK] error. The bytes later written (when the
socket becomes writable) are corrupted.
Take a look at the following attachments...
1) the nsSocketTransport log file.
2) the formpost (as generated by nsFormFrame.cpp) stripped of its header.
3) the content section of the HTTP transaction (as received by the server).
Assignee | ||
Comment 41•24 years ago
|
||
Assignee | ||
Comment 42•24 years ago
|
||
Assignee | ||
Comment 43•24 years ago
|
||
Assignee | ||
Comment 44•24 years ago
|
||
What I should also mention is that if you count the number of bytes transmitted
(by looking at attachment 17565 [details]) before the first 80470007 error, you'll get
a total of 55024 bytes. If you "diff" the formpost before (attachment 17566 [details])
and after (attachment 17567 [details]) transmission, you'll find a discrepancy at byte
54897. What's going on? Well, the transmission also includes a header of 127
bytes, and 54897 + 127 equals 55024. So, it seems like whenever we can't write
all the data without blocking, we're going to get into trouble.
Reporter | ||
Comment 45•24 years ago
|
||
Could the buffer be being deallocated after WOULD_BLOCK, or the buffer pointer
not bumped, or bumped incorrectly?
Comment 46•24 years ago
|
||
Great investigative work darin! Since this problem dwells in nsSocketTransport,
do you believe that it also affects FTP upload and SMTP?
Assignee | ||
Comment 47•24 years ago
|
||
It's hard for me to say what is actually the culprit here. It'd be nice to have
someone, who knows nsSocketTransport, to take a look at this problem. From
just looking at the log file and the source code, it's not obvious (at least
not to me) where the problem is occuring.
This looks similar to bug 47936... can someone familar with that bug take a
look. This is probably a regression of that bug.
I would assume this bug would cause problems with any "large" socket writes,
including FTP upload and SMTP, so this is definitely an important bug to fix!
Comment 48•24 years ago
|
||
So, it sounds like this is a bug with the AsyncWrite case of the
nsSocketTransport...
Is this bug really Linux only? Or do all platforms fail if a WOULD_BLOCK occurs
during the write?
-- rick
Comment 49•24 years ago
|
||
Since PR_Write == PR_Send and can return less # of bytes then requested, but not
WOULD_BLOCK - we may want to check if the code behaves correctly in such case
Assignee | ||
Comment 50•24 years ago
|
||
It looks like we are properly handling the case that ruslan mentions.
From nsSocketTransport.cpp(1083):
len = PR_Write(fd, fromRawSegment, count);
if (len > 0) {
*writeCount = (PRUint32)len;
}
So, irrespective of the error condition, we always remember how many bytes
were written.
Assignee | ||
Comment 51•24 years ago
|
||
This fix turns out to be a one-liner!! We were ignoring the toOffset
parameter passed to nsWriteToSocket. This parameter is used when
nsWriteToSocket is called from doWriteFromStream (which occurs when
we do a form POST and probably in many other cases as well).
I'm attaching a patch to nsSocketTransport.cpp which fixes this. r=?
Assignee | ||
Comment 52•24 years ago
|
||
Comment 53•24 years ago
|
||
I seem to remember another time when we proposed this change, and warren thought
that it was wrong...
I don't remember why -it seems like the right thing to do...
warren, what do you think?
-- rick
Assignee | ||
Comment 54•24 years ago
|
||
If you look at the way nsPipe2.cpp would call nsWriteToSocket (in
ReadSegments), for example, you would see that using the toOffset
parameter in this way makes sense. The thing that bothers me, though,
is why on earth is it called toOffset if it is to be used in this
manner? It seems like it should be called fromOffset or something
like that... but, maybe I'm missing something?!?
Comment 55•24 years ago
|
||
I don't think that's right. toOffset is the offset in the destination, not the
source. The source buffer offset is adjusted for you by the pipe code's
implementation of ReadSegments.
Or is this being called from some other implementation of ReadSegments? Ever
since we combined nsIBufferInputStream with nsIInputStream, there's more of
these around the codebase. Maybe the problem is in the caller.
Assignee | ||
Comment 56•24 years ago
|
||
You're totally right, and that makes more sense. Then the correct thing
to do in the case of nsSocketTransport is to similarly pre-offset the
read buffer before calling nsWriteToSocket. In doWriteFromStream this
effects the code as such:
nsWriteToSocket(nsnull, mSocketFD, mWriteBuffer, mWriteBufferIndex, ...
should become:
nsWriteToSocket(nsnull, mSocketFD, mWriteBuffer + mWriteBufferIndex, 0, ...
and then nsWriteToSocket can ignore the toOffset parameter as before. What
do you think?
Comment 57•24 years ago
|
||
Right, that explains the problem. Have you verified that the call to
nsWriteToSocket at line 1371 is the only offender?
Also, it's not that nsWriteToSocket ignores the toOffset, but that toOffset
really doesn't make any sense for sockets. They're not things you can offset
into.
Assignee | ||
Comment 58•24 years ago
|
||
Yes.. the only other place it gets called is through nsIPipe::ReadSegments.
I'm attaching a new patch.
Assignee | ||
Comment 59•24 years ago
|
||
Comment 60•24 years ago
|
||
r=warren
Comment 61•24 years ago
|
||
Comment 63•24 years ago
|
||
sr=mscott
just to be safe, Darin, can you try sending yourself a message via SMTP just to
make sure that writing data to the socket in the smtp case isn't adversley
effected by this change?
Assignee | ||
Comment 64•24 years ago
|
||
SMTP checks out ok (~300k attachment w/o corruption)
Comment 65•24 years ago
|
||
rtm++
Whiteboard: [rtm need info][dogfood+][nsbeta3-] → [rtm++][dogfood+][nsbeta3-]
Comment 66•24 years ago
|
||
cc: esther - we should try this out once fixed for mail attachments.
Assignee | ||
Comment 67•24 years ago
|
||
Landed patch on the trunk and patch per PDT approval.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 68•24 years ago
|
||
Sorry, I meant... the patch is on the trunk and the BRANCH...
Comment 69•24 years ago
|
||
Comment 70•24 years ago
|
||
verified on branch:
WinNT 1024
Linux rh6 1024
Mac os9 1024
used test case provided and attachment for uploading, also checked SMTP
needs checked on trunk
Comment 72•24 years ago
|
||
verified on trunk
WinNT 2000013120
Linux rh6 2000013106
Mac os8.6 2001010209
Status: RESOLVED → VERIFIED
Keywords: vtrunk
Assignee | ||
Comment 73•24 years ago
|
||
*** Bug 42047 has been marked as a duplicate of this bug. ***
Updated•23 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•