Closed Bug 138157 Opened 23 years ago Closed 22 years ago

Publish takes 99% CPU

Categories

(SeaMonkey :: Composer, defect)

x86
Windows 2000
defect
Not set
major

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.0

People

(Reporter: lasse, Assigned: dougt)

Details

(Keywords: perf, Whiteboard: publish [adt2])

Attachments

(2 files)

Win2K, 2002041711, 1.0.0 branch. The publish feature in Composer takes 99% CPU after a successful publish via FTP. It kicks in just as the transfer finishes, and stays until you exit Mozilla completely. To reproduce: 1. Create or edit a page in composer 2. File - publish 3. Put in ftp info and publish Result: Windows task manager reports mozilla.exe as using 99% cpu until I exit mozilla completely. Things get slow.. 100% reproducible for me so far.
Strange. Just tried this on my home machine (winXP, 2002041711 branch) and publish doesn't seem to work at all. The dialog is there, but there is no reaction when I push the "publish" button.
--> cmanske, brade
Assignee: syd → cmanske
Lasse, please try again...publishing works for us...
OK, just reproduced this on my winXP machine. First time I tried I put "fpt://..." instead of "ftp://.." in the settings. That caused mozilla to simply do nothing - no warning or anything, just closed the dialog like nothing happened. That's probably a different bug. This doesn't seem to affect performance much on this machine (1,4GHz, 1GB ram). When other programs need cpu mozilla happily gives away, but then it creeps back to 99%. I was even able to do filtering in Photoshop, which caused mozilla to go as low as 25% cpu for a second or two, then back up. OK, what info do you need to check this? Screenshots? My ftp settings? I've reproduced this on two different machines now. I have a third, also winXP, I can give it a try there.
Lasse, the mistyped "fpt" issue is going to be covered in bug 126258 . I cc'd you on that bug...That bug deals with error handling when things aren't typed correctly in the publish dialog.
I am able to reproduce this problem on the 04-18 trunk and 04-17 1.0.0 builds. But like Lasse said, it does not appear to affect the computer performance.
Thanks sujay. Michael: What OS are you seeing this on? So far I have only been able to test windows 2K and XP.
I am on Win 2k right now. I just tried Mac OSX, and it appears to be working fine.
I've been testing some more, and it seems it has some effect on performance, though not as much as one would think from the numbers. Specifically window animation on maximize/minimize is jerky, not as smooth as it usually is. Also I felt the computer was generally slower on my work machine where I discovered this - that one is a PIII450. I tried a different ftp-server and got some strange results that might be of interest: I created a small test file, with just the word "test". I uploaded it, and it reported a successful transfer, and the cpu jumped to 99% as usual. I left composer open in the background, and tried to open the page in navigator. The url was resolved, but the page was blank. Checked with IE, same result. View source (in IE) gave me this: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <HTML><HEAD> <META http-equiv=Content-Type content="text/html; charset=windows-1252"></HEAD> <BODY></BODY></HTML> About 15 minutes later an alert popped up in composer saying: 426 Connection closed; transfer aborted After this I tried to reload the page in both mozilla and IE, both now reported that the page could not be found after previously having shown the page as a blank page. Using another ftp-program I verified that the page was nowhere to be found. Now this is the interesting part: The cpu went back to normal! OK, now what do I do?
Also able to reproduce this bug on 2002041711 branch, Win 98.
Nominating to make sure this doesn't get lost as a potential problem. Is it just Lasse's setup, or will this be a common problem?
Keywords: nsbeta1
Keywords: perf
I can reproduce this on Windows 2000. I tested different scenarios and this only happens after FTP (not HTTP) publishing. I skipped creating the Publish Progress dialog, so I don't think there's anything in Composer code to cause it.
Assignee: cmanske → bbaetz
Whiteboard: publish
Can I get some ftp protocol logs? This WFM on linux.
Also able to reproduce this bug on branch 2002041711, Mandrake Linux 8.2 I am publishing to ftp using a username and password, and the test file I am using (test.html) was placed on the ftp server correctly, along with an associated graphic. I have tried putting in the corresponding URL in "http address to browse to" and also just leaving that blank. Although the publishing completes successfully, the CPU utilization goes to 99% and stays there until I kill the mozilla-bin process. Also, when I try to use the browser to check the test page, it does not show any changes (from the last publishing) until after I have killed mozilla and restarted it.
To create this log, I started Composer, opened a page from the Recent Files menu and published this page via FTP. This page has one image. I'm doing this on my home machine, which has dual processors, and I'm seeing 50% CPU usage in this case (Windows 2000).
can someone get a socket transport log?
Keywords: adt1.0.0
Whiteboard: publish → publish[adt2]
Doug: how does one get that? Can't you produce one by using publishing yourself?
Target Milestone: --- → mozilla1.0
No patch with r= sr=, so removing adt1.0.0 nomination. No nsbeta1+, so removing adt status whiteboard entry.
Keywords: adt1.0.0, nsbeta1 → nsbeta1-
Whiteboard: publish[adt2] → publish
Keywords: nsbeta1- → nsbeta1
Keywords: nsbeta1 → nsbeta1+
Whiteboard: publish → publish [adt2]
DougT has agreed to look at this. Stack after breaking when in the maxed-out CPU state:
  NTDLL! 77f96be2()
  WS2_32! 75032f2f()
  _PR_MD_PR_POLL(PRPollDesc * 0x012e6230, int 7, unsigned int 1914360) line 240 + 35 bytes
  PR_Poll(PRPollDesc * 0x012e6230, int 7, unsigned int 1914360) line 157 + 17 bytes
  nsSocketTransportService::Run(nsSocketTransportService * const 0x012e610c) line 469 + 24 bytes
  nsThread::Main(void * 0x012e20f0) line 120 + 26 bytes
  _PR_NativeRunThread(void * 0x012e2210) line 433 + 13 bytes
Assignee: bbaetz → dougt
I found the problem. It was my fault for assuming that releasing a transport request was enough to have it get cancelled. My thinking was that if there were no outstanding references outside of the transport, it should be killed. It ain't that way.... When we recv the pasv, we create a connection to the remote service prior to issuing any control commands. This connection is required to avoid some security problems, and some servers enforce that a data connection must exist prior to responding to any subsequent control commands. This data connection is opened via the nsITransport as a read request. When we are storing a file, we release this read request and open a new write request. However, we are not cancelling the read request. It just idles in our socket pool. Why it pegs the cpu is another bug. (darin?) In any case, I think that the right thing to do is cancel the read request with a success code. It turns out that this does fix the problem as described by the bug. When implementing this, we have to make sure that we do not send multiple onStart/onStop notifications.
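To illustrate the lifecycle described above, here is a minimal model in Python. It is not the Mozilla code and not the attached patch: `SocketPool`, `Request`, `Listener`, `store` and `STATUS_OK` are all hypothetical names made up for this sketch. It only assumes what the comment states, namely that the pool keeps its own reference to an open request, so dropping the caller's reference does not cancel it, and that a stop notification must be delivered at most once.

```python
STATUS_OK = 0  # hypothetical success code


class Listener:
    """Records stop notifications so duplicates are easy to spot."""

    def __init__(self):
        self.stops = []

    def on_stop(self, name, status):
        self.stops.append((name, status))


class Request:
    """Hypothetical stand-in for a pending read or write request."""

    def __init__(self, name, pool, listener):
        self.name = name
        self._pool = pool
        self._listener = listener
        self._stopped = False

    def cancel(self, status):
        # Guard: a request must report its stop at most once.
        if self._stopped:
            return
        self._stopped = True
        self._pool.active.remove(self)
        self._listener.on_stop(self.name, status)


class SocketPool:
    """Hypothetical pool; it keeps its own reference to every open request."""

    def __init__(self):
        self.active = []

    def open(self, name, listener):
        request = Request(name, self, listener)
        self.active.append(request)
        return request

    def names(self):
        return [request.name for request in self.active]


def store(pool, listener, cancel_read):
    """Model an upload: open the data connection as a read, then switch to a write."""
    read = pool.open("read", listener)
    if cancel_read:
        read.cancel(STATUS_OK)  # the fix: cancel explicitly, with a success code
    del read  # dropping our reference alone leaves the request in the pool
    write = pool.open("write", listener)
    write.cancel(STATUS_OK)  # upload finished
    return pool.names()
```

In this model, `store(SocketPool(), Listener(), cancel_read=False)` leaves the `"read"` request sitting in the pool after the upload, while `cancel_read=True` leaves the pool empty. Calling `cancel` twice on the same request produces a single `on_stop`, which is the property the last sentence of the comment asks for.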
I tested Doug's patch and it fixes the bug for me. I'm using a dual processor machine and thus only saw one processor "peg", and it doesn't do that now.
Comment on attachment 81181 [details] [diff] [review] cancels inital read request when storing. Why are you removing the ->Cancel call in the OnStop? I added that in to make sure that we closed files on the server end when uploading to windows servers, so that the user can see the uploaded file before the ftp session times out. I don't know if it actually fixed that problem, mind you. Cmanske? Brade? Did I end up fixing that, and does this patch regress that? r=bbaetz if that didn't regress (i.e. it still works, or it didn't work in the first place)
I can confirm this CPU usage problem on win95b Pentium 200MMX, with build: mozilla-win32-installer-sea.2002042908.1.0.0.exe Method: a. create tiny test page b. save as to local drive, e.g. c:\smalltest.html c. publish Observed: 1. Dialog flashes "uploading" in its status, then a few seconds later success, and then the publish dialog hides automatically. 2. TaskInfo shows %CPU goes from 10 or so to 95 then 99.9% at the point that the dialog is hidden. Similar to Lasse, the pc is still responsive, i.e. the CPU is given over to other things you do as needed. 3. Attempting to http: the uploaded page results in the following text in browser window: The process cannot access the file because it is being used by another process. or view source: <html><head><title>Error</title></head><body>The process cannot access the file because it is being used by another process. </body></html> Also, the internet connection Tx/Rx lights do not appear to flash at all during this period. 4. Stays this way for about 6 minutes, then following modal dialog displayed: Alert 426-Maximum disk quota limited to 10240 Kbytes Used disk quota 7803 Kbytes, available 2436 Kbytes 426 Data connection closed, receive file smalltest.html aborted. Also, the CPU usage goes back to normal 10-20%. 5. Can now retrieve the uploaded page in the browser, but it has not been properly uploaded: completely blank browser page is shown, source is: <html><body></body></html> Note: by manually deleting a few files from my ftp site, and then republishing, I can show that the Alert info actually represents the space used/available on my web site. Hence the ftp transfer is at least communicating with the ftp server, and supplying the correct credentials as entered in the dialog. However, the smalltest.html size is only 346 bytes, so it should fit. Without further proof, it seems that a. the ftp transfer is holding the file open on the ftp/web server (3). b. no data is transferred during the interval while cpu usage is ~100%. c. the page actually uploaded is an empty page. "Self dons cap of psychic (?) abilities" Is it possible that the page getting published is the default (new) page, rather than the required page? More guesses: Publish gets the file size of the real file, e.g. 384 bytes, but then starts putting data from the internal (new) page ~ 30 bytes, and hence the transfer never completes because the response for 384 bytes Tx'ed is never received; eventually something times out (maybe ftp server closes its connection?), and this is interpreted incorrectly as server space full, since the (wrong) file size is being compared. Possibly remove keyword perf since the CPU usage is only an aside to the real problem: fails to publish to the ftp site?
>Why are you removing the ->Cancel call in the OnStop? I added that in to make sure that we closed files on the server end when uploading to windows servers, so that the user can see the uploaded file before the ftp session times out. I don't know if it actually fixed that problem, mind you. Cmanske? Brade? Did I end up fixing that, and does this patch regress that? The reason that the cancel was needed in the first place was that the read request was still open. Cmanske, Brade, are you running with this patch? If not, you should.
Comment on attachment 81181 [details] [diff] [review] cancels inital read request when storing. r=bbaetz
Attachment #81181 - Flags: review+
Comment on attachment 81181 [details] [diff] [review] cancels inital read request when storing. sr=darin
Attachment #81181 - Flags: superreview+
Checking in nsFtpConnectionThread.cpp; /cvsroot/mozilla/netwerk/protocol/ftp/src/nsFtpConnectionThread.cpp,v <-- nsFtpConnectionThread.cpp new revision: 1.239; previous revision: 1.238 done Fixed checked into trunk
Marking FIXED since the patch got checked into the trunk Adding adt1.0.0 to get ADT's attention for 1.0 branch checkin
Status: NEW → RESOLVED
Closed: 22 years ago
Keywords: adt1.0.0
Resolution: --- → FIXED
With build: mozilla-win32-installer-sea.2002050104.trunk.exe, ftp publish succeeds and CPU usage returns immediately to the norm after the upload for win95b. Also checked publish with images -> OK (although a warning in the UI that "image publish to directory that does not exist" would be a better response than "Some files failed to transfer" without a reason. Or given time: create the directory automatically (mkdir) if it is non-existent on the web server). Publish As is also operational (using the same details as publish). Guys: A definite fix for me (at least)! We now have working publish capabilities on the trunk.
verified per comments. Lasse, if this is not fixed, let us know...otherwise we'll assume it works for you now...
Status: RESOLVED → VERIFIED
Working great for me in May 1 trunk build. Thanks!
adding adt1.0.0+. Please check this in to the branch as soon as possible after getting drivers approval. Then add the fixed1.0.0 keyword.
Keywords: adt1.0.0 → adt1.0.0+
has anyone requested driver approval for this fix?
yes, I waited up till 1 am for a response.
Comment on attachment 81181 [details] [diff] [review] cancels inital read request when storing. a=scc (on behalf of drivers) for checkin to the Mozilla1.0 branch
Attachment #81181 - Flags: approval+
? nkftp_s.lib Checking in nsFtpConnectionThread.cpp; /cvsroot/mozilla/netwerk/protocol/ftp/src/nsFtpConnectionThread.cpp,v <-- nsFtpConnectionThread.cpp new revision: 1.237.2.3; previous revision: 1.237.2.2 done
Keywords: fixed1.0.0
Verified 05-05 1.0.0 branch. Adding "verified1.0.0" keyword.
Keywords: verified1.0.0
Product: Browser → Seamonkey