Closed Bug 27146 Opened 25 years ago Closed 25 years ago

Need evaluation of Bugzilla performance and scalability limits

Categories

(Bugzilla :: Bugzilla-General, defect, P2)

Tracking

()

VERIFIED FIXED
Bugzilla 2.12

People

(Reporter: mitchell, Assigned: endico)

Details

Risto, filing (edited) questions/comments from chofmann as a bug, as you requested.

It would also be good to get an idea of the source of the recent Bugzilla performance problems that many engineers and testers have observed. If these problems are the result of the system not scaling to current levels of bug entry and query traffic, they can only get worse. Can we get any estimates of the "entry and query transaction rates" that Bugzilla can support? Are we anywhere close to those limits? Also, can we get someone to look at performance and scalability?

Active users:
- 90 netscape engineers + 30-40 active mozilla contributor engineers
- 35 netscape qa testers + 200-300 active mozilla testers that download the daily builds
- ~465 active mozilla contributors
- ???? total number of bugzilla accounts (is this data available?)
- ???? active bugzilla users (a transaction or two in the last month; is this data available and monitored, and has it moved much lately?)

Casual users:
- ~20,000-30,000 full circle build milestone testers
- ~120,000 milestone testers

We need a plan to preserve the performance and usability of Bugzilla for the roughly 500 very active mozilla contributors.
We also need an evaluation of current peak-time usage, which seems to be between 2 and 6pm PST, and of what the peak rate limitations might be with the given Bugzilla configuration and setup. Thanks.
I will look into this; it is one of the 3 important things for Mozilla right now (the others being cvs performance and lounge's sendmail configs). I would like to get as much data as possible, i.e. if you see slowness, please report exactly when it happened so I can go back to sar and other logs and look a little more closely. We had some issues with bonsai and addcheckin.pl using a lot of resources from time to time, so it's possible that database access was slow at those times. Terry has changed addcheckin.pl since then, and I'm wondering if things are any better. Also, we were talking about moving bonsai to another database instance. That might help too.
Status: NEW → ASSIGNED
One logfile that might be helpful: I recently added some logging to Bugzilla, where it logs every SQL request it makes, and also logs when they finish. On lounge, check out /export2/webtools/bugzilla/data/sqllog. (Right now, this file is not automatically rotated, which may be a problem. If you add rotation code, be aware that Bugzilla won't add to the logfile unless it already exists. Any rotating code would have to create a new, empty sqllog file, and give it appropriate permissions.)
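A rotation step that respects the constraint described above (Bugzilla only appends to sqllog if the file already exists) could look roughly like this. It is a minimal sketch in Python, not the script used on lounge; the function name `rotate_sqllog`, the timestamp suffix, and the 0o666 mode are all assumptions chosen for illustration.

```python
import os
import time


def rotate_sqllog(path, mode=0o666):
    """Move the current log aside, then recreate an empty one in its place.

    Hypothetical helper: the logger will not create the file itself, so an
    empty file with usable permissions must exist again after rotation.
    """
    rotated = None
    if os.path.exists(path):
        # Assumed naming scheme: original name plus a timestamp suffix.
        rotated = "%s.%s" % (path, time.strftime("%Y%m%d%H%M%S"))
        os.rename(path, rotated)
    # O_EXCL refuses to clobber a file that appeared after the rename.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    os.close(fd)
    # The mode passed to os.open is filtered by the umask, so set it explicitly.
    os.chmod(path, mode)
    return rotated
```

A process that already has the old file open keeps writing to the renamed copy until it reopens the log, which is usually acceptable for a log that is opened per request.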
I added a script to check Bugzilla response times. It queries the tickets of the server operations group and then 10 individual tickets, and I'm logging the response times. Ok, we have plenty of logging and other data available. Now I just need to see Bugzilla being slow, so can someone say exactly when they see it being slow, so we can go back to the logs and find the cause? Right now mysqld is under heavy load because the tree was just opened and the addcheckin.pl runs are loading it, BUT I haven't seen it slow down Bugzilla yet. So if you can give me the exact minutes when Bugzilla is slow, it would help (unless my test script catches one).
Terry fixed some hook problems with addcheckin.pl that were the probable cause of the mysql/system load (30 processes each using 92M of memory caused some swapping). I would appreciate feedback on any Bugzilla slowness you've seen since 2/10/00 7pm PST. My test script has been finishing in 11-15s (it makes about 10 queries to Bugzilla: a query for a component and some tickets).
So, be aware that some slowness may be occurring just because MySQL is being asked to do a lot at the same time. For example: suppose person A does a big, complicated, slow query. They are not surprised when their query takes 15 seconds to complete. But during those 15 seconds, persons B, C, and D all try to submit a change to a bug. Now, submitting a change requires getting an exclusive lock on (most of) the database, so those people all have to wait for A's query to finish. They find that their change, which might ordinarily take 3 seconds, now takes up to 20 seconds. And then, just as B's change starts, person E comes along with the most trivial request, but it takes 9 seconds as they wait for B, C, and D to all finish. This may all seem unlikely and pathological, but I bet similar and much worse things happen fairly often.
OS: Mac System 8.5 → All
Hardware: Macintosh → All
Risto, we'll need to be prepared to give an overview of bugzilla performance, scalability, bottlenecks, etc for jim hamerly's staff, which meets on Tuesday at 11. Dmose has a specific set of questions and more details.
Severity: normal → major
I have now seen a few instances where the reply time for the automatic script has been rather long. I'll go to the logs and see what caused them.
I was just looking at the MySQL manual, and found a section which describes the kind of locking problem I stated above, only worse. This doc can be found at http://www.mysql.org/Manual_chapter/manual_Performance.html#Table_locking . (Note that we are running version 3.22.29, almost the latest stable release; many of the options described in that section only apply to 3.23.xx, which is a development release. I don't think we want to run development releases.) Anyway, to quote that page:

> One main problem with this is the following:
>
> * A client issues a SELECT that takes a long time to run.
> * Another client then issues an UPDATE on a used table; This client
>   will wait until the SELECT is finished
> * Another client issues another SELECT statement on the same table;
>   As UPDATE has higher priority than SELECT, this SELECT will wait
>   for the UPDATE to finish. It will also wait for the first SELECT
>   to finish!

It's hard to prove, but I bet this happens often enough. Someone does a big, slow, stupid query; while that's happening, someone else changes a bug, and suddenly *everyone* is locked up until the slow stupid query grinds to a finish. We can try some of the fixes outlined in that page. I can hack Bugzilla to always set SQL_LOW_PRIORITY_UPDATES, or (equivalently) Risto can restart mysqld with the --low-priority-updates flag. This means that changing or creating a bug can sit and block for a really long time, but everything else would behave better. But first, it would be a good idea to prove that this is actually a problem we're hitting.
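To make the queueing effect concrete, here is a toy model of a single table-level lock under the two policies discussed above. It is a sketch written in Python for illustration only: it does not talk to MySQL, and the `simulate` function, the request tuples, and the timings are all assumptions. Readers may share the lock and writers need it exclusively. With `writers_first=True` a waiting writer holds back later readers (the behaviour the manual describes); with `writers_first=False` readers are served first and writers wait until no reader is running or waiting (the low-priority-updates idea).

```python
def simulate(requests, writers_first=True):
    """Toy table-lock scheduler; returns {name: finish_time}.

    requests: list of (name, kind, arrival, duration), kind is "r" or "w".
    """
    pending = sorted(requests, key=lambda r: r[2])
    waiting, active, finish = [], [], {}
    now = 0.0
    while pending or waiting or active:
        # Jump to the next event: an arrival or a running request finishing.
        events = [r[2] for r in pending[:1]] + [end for end, _ in active]
        now = max(now, min(events))
        while pending and pending[0][2] <= now:
            waiting.append(pending.pop(0))
        active = [(end, kind) for end, kind in active if end > now]
        while True:
            writer_active = any(kind == "w" for _, kind in active)
            writers = [r for r in waiting if r[1] == "w"]
            readers = [r for r in waiting if r[1] == "r"]
            if writers_first:
                # A waiting writer blocks every later reader.
                if writers:
                    grant = writers[:1] if not active else []
                else:
                    grant = readers if not writer_active else []
            else:
                # Readers go first; writers run only when no reader wants in.
                if readers:
                    grant = readers if not writer_active else []
                else:
                    grant = writers[:1] if not active else []
            if not grant:
                break
            for req in grant:
                waiting.remove(req)
                active.append((now + req[3], req[1]))
                finish[req[0]] = now + req[3]
    return finish


# The manual's scenario: a slow SELECT, then an UPDATE, then a quick SELECT.
scenario = [("A", "r", 0, 15), ("B", "w", 1, 1), ("C", "r", 2, 1)]
print(simulate(scenario, writers_first=True))
print(simulate(scenario, writers_first=False))
```

In this model the quick SELECT (C) is stuck behind both A and B under the default policy, and finishes almost immediately once writers are deprioritized; the UPDATE (B) finishes at the same time either way in this particular scenario, but under sustained read traffic it could wait much longer.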
Here are some times my script has caught when Bugzilla replies were slow:

Date query started    Time to finish
2/11/00  5.00pm       6:07.3
2/11/00  5.05pm       1:08.3
2/11/00  5.40pm       0:43.6
2/11/00  5.45pm       7:45.8
2/11/00  5.50pm       2:48.4
2/12/00 10.25am       1:18.7
2/12/00 11.55am       1:02.1
2/12/00  1.45pm       0:21.3
2/12/00  3.35pm       1:13.5
2/13/00  6.00pm       1:58.0
2/13/00 10.05pm       4:41.2
2/14/00  9.45am       1:10.0

In a normal situation this query finishes in 11 seconds. I will check those times against the system logs to see if we had performance issues at those times.
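For comparing these samples against the 11-second baseline, a small Python sketch like the following could be used. It assumes the "time to finish" values are in minutes:seconds form (which matches the 11-second normal case); the names `SAMPLES`, `to_seconds` and `slowdown` are made up for this example.

```python
BASELINE_SECONDS = 11.0  # "normal" run time reported in the comment above

# (start time, elapsed) pairs copied from the comment above.
SAMPLES = [
    ("2/11/00 5.00pm", "6:07.3"),
    ("2/11/00 5.05pm", "1:08.3"),
    ("2/11/00 5.40pm", "0:43.6"),
    ("2/11/00 5.45pm", "7:45.8"),
    ("2/11/00 5.50pm", "2:48.4"),
    ("2/12/00 10.25am", "1:18.7"),
    ("2/12/00 11.55am", "1:02.1"),
    ("2/12/00 1.45pm", "0:21.3"),
    ("2/12/00 3.35pm", "1:13.5"),
    ("2/13/00 6.00pm", "1:58.0"),
    ("2/13/00 10.05pm", "4:41.2"),
    ("2/14/00 9.45am", "1:10.0"),
]


def to_seconds(elapsed):
    """Convert an 'm:ss.s' string (assumed format) to seconds."""
    minutes, seconds = elapsed.split(":")
    return int(minutes) * 60 + float(seconds)


def slowdown(elapsed, baseline=BASELINE_SECONDS):
    """How many times slower than the baseline a sample was."""
    return to_seconds(elapsed) / baseline


if __name__ == "__main__":
    for started, elapsed in SAMPLES:
        print("%-16s %7.1fs  %5.1fx baseline"
              % (started, to_seconds(elapsed), slowdown(elapsed)))
```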
I can't find any bottlenecks from the system... more later
Ok, I have looked into system performance, and right now I'm fairly convinced that we don't have I/O, CPU, memory or other bottlenecks in the system itself. So the next step will be to look into MySQL issues. I have found one cluster of problems around 5.42-5.45pm last Friday when my script had a long return time. Terry, you might want to take a look at /cvsmirror/tmp/problem.sqllog (I had to move the sqllogs here because /export2 started to fill up). Look at my comments starting with '#####'. There's one place where an insert command is issued against profiles, and after that all selects were blocked for a long time. Could this be what you describe as a problem? I'm going to bed now and will look into these more later.
The file I mention here is on lounge.
Mysqld is now running with the --low-priority-insert option. It didn't recognize the --low-priority-updates flag, even though the manual refers to it. Weird.
If you're looking at the URLs I mentioned above, they may be talking about options for the 3.23.xx versions of MySQL, which we're not running.

So, I found the culprit in the scenario you described. There is the following line (reformatted here for legibility):

02/11/00 17:42:31 27732:
    SELECT bugs.bug_id, bugs.groupset,
           substring(bugs.bug_severity, 1, 3),
           substring(bugs.priority, 1, 3),
           substring(bugs.rep_platform, 1, 3),
           map_assigned_to.login_name,
           substring(bugs.bug_status,1,4),
           substring(bugs.resolution,1,4),
           substring(bugs.short_desc, 1, 60)
    FROM bugs, profiles map_assigned_to, profiles map_reporter
         LEFT JOIN profiles map_qa_contact ON bugs.qa_contact = map_qa_contact.userid,
         longdescs longdescs_
    WHERE bugs.assigned_to = map_assigned_to.userid
      AND bugs.reporter = map_reporter.userid
      AND bugs.groupset & 0 = bugs.groupset
      AND longdescs_.bug_id = bugs.bug_id
      AND (bug_status = 'NEW' OR bug_status = 'ASSIGNED' OR bug_status = 'REOPENED'
           OR bug_status = 'RESOLVED' OR bug_status = 'VERIFIED' OR bug_status = 'CLOSED')
      AND (lower(longdescs_.thetext) regexp '(^|[^a-z0-9])window($|[^a-z0-9])'
           OR lower(longdescs_.thetext) regexp '(^|[^a-z0-9])loads($|[^a-z0-9])'
           OR lower(longdescs_.thetext) regexp '(^|[^a-z0-9])starts($|[^a-z0-9])'
           OR lower(longdescs_.thetext) regexp '(^|[^a-z0-9])in($|[^a-z0-9])'
           OR lower(longdescs_.thetext) regexp '(^|[^a-z0-9])background($|[^a-z0-9])')
    GROUP BY bugs.bug_id
    ORDER BY bugs.priority, bugs.bug_severity

In English, this translates to "generate the list of all bugs in the system that have a comment containing any of the words "window", "loads", "starts", "in", or "background"". It's not very surprising that it returns every bug in the system, and that it has to look through all 51 megabytes of comment text to do so. So, it's very slow to run, and very slow to finish delivering all the results.

Less than a second later, another process (27733) is generating email diffs for a bug, using the new experimental email code. This involves updating a timestamp in a bug, which means it needs to get a write lock on the bug table, which means it has to wait for the big grody query to finish. 3 seconds later, the process you noticed does its select. It has to wait until the write finishes. This is exactly the scenario I found described in the manual.

Now that you have turned on --low-priority-insert, the only people who should see really slow behavior are people making changes to bugs. I think this is an improvement, but it is not really great.

I just realized what the right solution may be, but I'm scared of the details of implementing it. Bugzilla should keep two copies of the database around at all times. The main database works exactly as it does now. There is also a shadow database. All changes to the main database get logged in a file. A background process of some kind reads the log and makes the same changes to the shadow database. Then, we change the main query page to do all of its queries against the shadow database, not the main one. Theoretically, these queries might be incorrect as they will be querying old data. Realistically, it ought to be possible to keep the shadow database pretty well up-to-date, and everything will work great. I'm pretty sure that BugSplat (Netscape's internal bugsystem) was using this kind of scheme.

There's just the small matter of implementing it. Yuck. But at least it's a plan...

REASSIGNing back to me, changing product to Webtools, component to Bugzilla, priority to P1.
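The main/shadow idea can be sketched as follows. This is only an illustration under stated assumptions, not the code that went into Bugzilla: it uses two in-memory SQLite databases from Python's standard library in place of MySQL, logs changes to a table rather than a file, and the names `change_log`, `write` and `replay` are hypothetical.

```python
import json
import sqlite3


def make_db(with_log=False):
    """Create an in-memory stand-in for a Bugzilla database (assumed schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE bugs (bug_id INTEGER PRIMARY KEY, status TEXT)")
    if with_log:
        db.execute("CREATE TABLE change_log ("
                   "id INTEGER PRIMARY KEY AUTOINCREMENT, stmt TEXT, params TEXT)")
    return db


def write(main, stmt, params=()):
    """Apply a change to the main database and record it for later replay."""
    with main:  # the change and its log entry commit together
        main.execute(stmt, params)
        main.execute("INSERT INTO change_log (stmt, params) VALUES (?, ?)",
                     (stmt, json.dumps(list(params))))


def replay(main, shadow, last_id=0):
    """Replay logged changes newer than last_id into the shadow database.

    Returns the id of the last change applied, to pass to the next call.
    """
    rows = main.execute("SELECT id, stmt, params FROM change_log "
                        "WHERE id > ? ORDER BY id", (last_id,)).fetchall()
    with shadow:
        for log_id, stmt, params in rows:
            shadow.execute(stmt, json.loads(params))
            last_id = log_id
    return last_id


main, shadow = make_db(with_log=True), make_db()
write(main, "INSERT INTO bugs (bug_id, status) VALUES (?, ?)", (27146, "NEW"))
mark = replay(main, shadow)  # a background process would call this in a loop
write(main, "UPDATE bugs SET status = ? WHERE bug_id = ?", ("ASSIGNED", 27146))
# Until the next replay, queries against the shadow still see the old status.
print(shadow.execute("SELECT status FROM bugs").fetchall())
mark = replay(main, shadow, mark)
print(shadow.execute("SELECT status FROM bugs").fetchall())
```

Expensive read-only queries would go to `shadow`, so they never hold locks that writers on `main` need; the cost is that they may see slightly stale data.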
Assignee: rko → terry
Status: ASSIGNED → NEW
Component: Server Operations → Bugzilla
Priority: P3 → P1
Product: mozilla.org → Webtools
Yeah, I have. The problem is that all the nifty tricks they talk about only work for INSERT statements. Which is fine, but I need to do a lot of UPDATE statements too, and none of their tricks help there.
Status: NEW → ASSIGNED
Just now, when trying to bring up these two URIs: http://bugzilla.mozilla.org/show_bug.cgi?id=20394 http://bugzilla.mozilla.org/show_bug.cgi?id=27164 ...both stopped for a noticeable few seconds (the first trying to get the comments, the second trying to get the top part of the bug). From the above comments I'm guessing this should not have happened, and that you may be able to work out what caused this from the logs.
Whoops! Yes, I'd call a 50-second delay a "noticeable few seconds". My theory is that the change Risto made doesn't affect LOCK TABLES calls. And all of the interesting changes happen while the tables are locked. So, adding --low-priority-insert turns out to be a no-op. (Well, it works when a new bug is created, but not when an old one is edited.) I have hacked the code to put in the LOW_PRIORITY parameter to the LOCK TABLES calls.
You mentioned something about this in the mail, too, but just to check whether you were thinking the same thing: how about an architecture like this?

Main database                        Mirror database
-------------                        ---------------
All SELECT queries hit this          All UPDATE/INSERT queries hit this
This side has high priority for      This side has writes prioritized
selects and penalizes writes         (like we used to have)
(like we have now)

        <------------------------------
        Sync once a minute; it doesn't matter if it takes a long time
        or if the main database lags a little.

This might bring more mid-air collisions, but it would make both selects and inserts/updates fast.
That's basically the picture I outlined above. I disagree on your nomenclature. To me, the Main database is the up-to-date one that has the real truth in it. And the second database (I called it "shadow") is the one that is read-only which might lag behind the times. In order to "Sync", we apply deltas. That is, we replay all UPDATE/INSERT/etc requests into the shadow database. Rather than once a minute, I think I'll just have an always-running-usually-idle background process try to do them as they happen. And I probably won't bother making every SELECT use the shadow database; just SELECTs that are potentially expensive.
I just now had long delays viewing bugs 13534 and 27146.
My automatic check script caught something too.
Well, damn. I have the bare glimmerings of a clue. In /export2/webtools/bugzilla/data/sqllog, look at the entries for process 15525. (Or, equivalently, in /opt/mysql-3.22.29/var/lounge.log, look at thread ID 41144.) Both logfiles seem to think that the last thing that this process did was request a LOCK TABLES, at 13:58:59. Neither logfile ever indicates that the LOCK TABLES finished, or that the process ever did anything else. But I think the mysql logfile won't ever indicate it finished; it just indicates when it next gets a legitimate request from that thread.

What I think happened was this: The process tried to do a LOCK TABLES. With the new LOW_PRIORITY stuff, it sat and took a long time, because lots of people were busy reading tables and doing queries and stuff. Someone (either the user or the webserver) got bored hanging around and killed the process. But this somehow didn't propagate all the way to mysqld, and so mysqld thought this process was still there waiting for a lock. Several minutes later (somewhere around 14:05, I think) mysqld finally managed to honor this LOCK TABLES request. But there was no longer an active process behind the request, and so the thread then just hung around with everything locked. Finally, a couple of minutes later (at about 14:07:01), something timed out, the thread was quietly killed, the locked tables were released, and the logjam of blocked Bugzilla requests was finally let loose. There is quite a flurry of activity at that time.

I have actually seen some other recent evidence that mysqld doesn't notice very quickly when a connection to it is dropped. Maybe that can be fixed. And, if I implement the shadow DB thing, this kind of problem should happen much less frequently, because it won't take that long to get a write lock and people/webservers won't get bored.
Updating committed changes has been *real slow* since about 5pm on 2/16. It is causing great confusion, as it took hours to see changes and the result is bogus "midair collisions" during the time it takes to update. The PDT team will not get prompt notification with this state of affairs.
I have done the shadowing stuff as described above. It still probably needs to be tuned a bit (for example, the logging table is going to grow without bounds until we fill up a disk), but things should be much happier.
Making changes to bug reports is still taking 30-60 seconds for Bugzilla to progress beyond displaying:
>Bug List: (0 of 652) First Last Prev Next Show list Query page Enter new bug
> <HR>
So, 30-60 seconds on changing is not great, and I hope to further tune things to make it better. But I can't consider it a disaster, either. In times of heavy usage, there will always be some delay. But I have reason to hope that we will not be approaching total gridlock like we were before.
Priority: P1 → P2
um. i seem to have the "can confirm bug" bit set in my prefs -> permissions. I can not seem to confirm anything however. why so?
Let's please limit this bug to discussion about performance issues. If you're having other troubles, please open a new bug. (If your other troubles are preventing you from opening a bug, please send me mail.)
Attempting to close 28415 as a duplicate of 20901, it puts the "marked as duplicate" text into 20901, but times out before making any changes to 28415.
I submitted an additional comment to bug 28327 about 6 minutes ago. I am still waiting for it to finish displaying "Bug processed..."
I just submitted a change to bug 28555 (marking it RESOLVED-FIXED). I waited for about a minute (the very top of the response page showed, but not down to the part about sending mail). Then I hit stop and went back to the bug. The changes to the bug were made, but mail was never sent. I usually get bugzilla mail within seconds, so I think (considering it's Sunday morning) that I'm probably not going to get any mail for this change.
I did get the mail after all, but it took about 10 minutes for it to be sent (the message is dated 7:29, the change was made at 7:19). That's a performance problem in itself. I suspect I would have had to wait until 7:29 if I'd wanted to see the page finish loading...
The previous delay could also have been slowness in mail delivery, in any of the MTAs en route.
Hi, I saw this interesting thread and thought I would add several comments/ideas/musings/opinions. I've done quite a bit of database/SQL programming but have not used MySQL, so some of these comments are more general.

From what I've gathered from scanning the MySQL documentation, it only supports table locks, so the main improvement that can be made (after the addition of the queued inserts/updates with the main-mirror database) is to make the SELECT queries as fast as possible.

One way to do this would be to add an option to the http://bugzilla.mozilla.org/query.cgi page that lets users limit the number of results returned by their queries: a selection box that allows limits of, say, 25, 50, 100 or an unlimited number of results. There is a LIMIT option for SELECT statements in MySQL which I think will let you accomplish this: http://mysql.bluep.com/Manual_chapter/manual_Performance.html#LIMIT_optimization You could default to 50, which for many may be adequate, and the 'unlimited' option would keep the power users happy. For example, suppose I do a query on description using the word 'clipped' which, unbounded, yields 1000 rows. By using a LIMIT of 100, only 10% of the table needs to be read (assuming an even distribution of the word 'clipped'). This also has the side benefit of limiting how much HTML the web server has to spit out.

You could also build a full-text index on the longdescs.thetext field. Build a table of unique words called, say, 'unique_words' that has a field 'word'. This table will have a row for each unique word across all descriptions. Build a many-to-many table between it and the 'longdescs' table, indexing 'word' and the foreign key fields appropriately. Your select statement can be recoded so that queries that search for words in the description can use the indexed unique_words table to find data much faster. The downside of this is that inserts/updates would be slower because of breaking up the longdescs; however, you could do the inserts/updates as you normally do them now and have a secondary process break up the newly added descriptions hourly.

Some databases I've used support a READ UNCOMMITTED transaction isolation mode, which allows SELECT statements to ignore locks and just read the data even if another thread has a write lock. MySQL doesn't seem to support this (or transactions), but long term perhaps you can ask them to add a read uncommitted or 'dirty' read type of feature.

You could consider using another DBMS (not a flame!). From what I see of MySQL, it doesn't support page or row level locking, which would really help with the types of problems you are having, and may eliminate the need for a second copy of the database. I believe PostgreSQL has page locking and READ UNCOMMITTED. (Haven't used it either.) Some of the commercial products have built-in text indexing features. I know this is less of an option since a lot of porting work would be required (bug 1104).
Optimizing the performance of SELECTs is not my highest priority right now. They seem to work pretty well. Things can always be made better, but I don't think it's bad right now. I'm very surprised to hear about 6-minute delays. I don't know what is causing them. I can understand that it's theoretically possible, but I would never expect it to happen. Which probably just means that I don't know what's going on. The biggest problem that I know of is that not only does MySQL do table-based locking, but I force it to lock down most of the tables all-at-once when you update a bug. This is because I'm super-paranoid about two competing processes causing inconsistencies by doing changes simultaneously. I am toying with the idea of fixing this by simulating record-based locks using the MySQL GET_LOCK() function. I'm not sure whether I'll be able to pull this off, nor am I sure how it will affect performance.
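The word-index and LIMIT suggestions above could be prototyped along these lines. This is a rough sketch using Python's built-in sqlite3 module rather than MySQL; `unique_words` and `longdescs` follow the names in the comment, while `word_map`, `add_comment` and `bugs_with_word` are hypothetical. For brevity the index is updated inline, whereas the comment proposes a separate hourly process.

```python
import re
import sqlite3


def make_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE longdescs (bug_id INTEGER, thetext TEXT);
        CREATE TABLE unique_words (word_id INTEGER PRIMARY KEY,
                                   word TEXT UNIQUE);
        -- many-to-many link between words and the bugs whose comments use them
        CREATE TABLE word_map (word_id INTEGER, bug_id INTEGER,
                               PRIMARY KEY (word_id, bug_id));
    """)
    return db


def add_comment(db, bug_id, text):
    """Store a comment and index each distinct lowercase word in it."""
    db.execute("INSERT INTO longdescs VALUES (?, ?)", (bug_id, text))
    for word in set(re.findall(r"[a-z0-9]+", text.lower())):
        db.execute("INSERT OR IGNORE INTO unique_words (word) VALUES (?)",
                   (word,))
        (word_id,) = db.execute(
            "SELECT word_id FROM unique_words WHERE word = ?",
            (word,)).fetchone()
        db.execute("INSERT OR IGNORE INTO word_map VALUES (?, ?)",
                   (word_id, bug_id))


def bugs_with_word(db, word, limit=50):
    """Look the word up in the index instead of scanning every comment."""
    rows = db.execute(
        "SELECT m.bug_id FROM unique_words w "
        "JOIN word_map m ON m.word_id = w.word_id "
        "WHERE w.word = ? ORDER BY m.bug_id LIMIT ?",
        (word.lower(), limit))
    return [bug_id for (bug_id,) in rows]


db = make_db()
add_comment(db, 1, "Text is clipped in the window")
add_comment(db, 2, "Crash when the page loads")
add_comment(db, 3, "Clipped again after resize")
print(bugs_with_word(db, "clipped", limit=100))
```

The tokenizer splits on anything outside a-z and 0-9, which mirrors the word-boundary pattern used in the slow regexp query quoted earlier in this bug.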
SETUP caused an invalid page fault in module XPCOM.DLL at 015f:60c580b8. Registers: EAX=01320900 CS=015f EIP=60c580b8 EFLGS=00010246 EBX=00000000 SS=0167 ESP=006779e4 EBP=006779f0 ECX=60c6c80c DS=0167 ESI=78010c8e FS=3707 EDX=00000003 ES=0167 EDI=01324b20 GS=0000 Bytes at CS:EIP: 83 23 00 6a 26 68 60 3f c8 60 c7 00 01 00 00 00 Stack dump: 80000000 01320900 00000000 00677a3c 60c42948 01324b08 60c83844 01320900 00000000 60c53ea7 00000000 60c454a2 013208b0 60c45513 013208e0 013208b0
I find that whenever I run a saved query, it just takes forever to get the results. It has got to the point where it is faster to enter the query manually.
marking fixed for Seth. Thanks a lot, Seth!
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Um, what? David, I think you just closed the wrong bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
sorry, I was running 5.0 - I must have got confused about which page I was looking at.
I have been having a performance problem with mozilla and maybe this is the bug to place it in. The problem I notice is that the layout processing can take so much of the CPU that network activity is starved. For instance, if I go to Slashdot there is an initial delay while the HTML page downloads. During this time the status bar and the throbber are active. Then, once it has enough info to attempt layout, the status bar animation and the throbber halt in midstream and the network activity stops at the same time. After a few seconds (depending on page complexity and the speed of my machine) mozilla presents the layout and then begins downloading inlined images. Now, IMHO this is definitely bad behaviour. For those of us on dialup connections, downloading images is the top delay in web browsing. The network code should never be starved of processing time. The layout should be processed a little slower rather than delaying download of the inline images for 2-3 seconds. My $0.02
No, this bug is for recording problems in *Bugzilla*, not in mozilla itself.
I think I might have been cc'd by mistake, but before I go hide again, I'll offer what little advice I can even though it might be obvious. (I don't know MySQL, so I can only make really generic observations.) Since contention is the main scaling problem with concurrent usage, the only good approach I know is to reduce lock granularity size when possible, and to use more locks when this separates things into non-interfering spaces. For big databases under heavy loads, I think there's a classical solution to reduce contention in circumstances when one wants to globally lock everything. And this is to shorten the duration of the global lock by using it only to guard transitions to smaller granularity locks in a tree structure. To lock child C under parent P, one might lock P before C and release P while actually using C. (The partial ordering in lock sequence prevents deadlocks.) Database literature gets really hairy about the fine details of writing, intending, and sharing style locks, and transitions between these. But you can just ignore that and aim to reduce contention by ad hoc means, just by thinking about the problem in general terms. Sorry if all this is obvious. I'm just trying to be helpful.
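The "lock P before C, release P while using C" pattern described above can be illustrated with ordinary in-process locks. This is a generic Python sketch, not anything specific to MySQL or Bugzilla; the `Node` class and `with_child_locked` function are invented for the example.

```python
import threading


class Node:
    """A lockable resource that may contain finer-grained child resources."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.children = {}


def with_child_locked(parent, child_name, action):
    """Run action(child) holding only the child's lock.

    The parent lock is held just long enough to find and lock the child
    (always parent before child, which gives a consistent lock order), and
    is released before the potentially slow work starts.
    """
    parent.lock.acquire()
    try:
        child = parent.children[child_name]
        child.lock.acquire()
    finally:
        parent.lock.release()
    try:
        return action(child)
    finally:
        child.lock.release()


database = Node("database")
database.children["bugs"] = Node("bugs")
database.children["profiles"] = Node("profiles")

# While "bugs" is being worked on, the top-level lock is already free, so
# another thread could go on to lock "profiles" without waiting.
with_child_locked(database, "bugs",
                  lambda node: print("working on", node.name))
```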
I might also point out that Dave Rothschild says in staff meetings that folks are now spending hours a day groveling over buglists to fine tune triage for beta. So the load might be more than the initial scenario projects.
data point: I am having wretched performance problems today. Simply clicking on a link to a bug is taking upwards of a minute to respond.
It is believed that today's problems were due to some network problems, not problems with the Bugzilla code itself.
All right, stupid question. How do I get my email off the CC for this bug (my contribution was a late night boob mistake)? There are so many on the cc list it doesn't display my email address in the truncated list.
it becomes really slow when visiting http://zicon.stjernesludd.net/passionate
Thought y'all might want to know - mysql has some built in facilities for mirroring - one of the startup options will generate a trace of exact sql updates/inserts against the database into a log file. You can then run this log file on the secondary db server to roll it forward. Here is the relevant section from the mysql manual:

------------
The update log

When started with the --log-update=file_name option, mysqld writes a log file containing all SQL commands that update data. The file is written in the data directory and has a name of file_name.#, where # is a number that is incremented each time you execute mysqladmin refresh or mysqladmin flush-logs, the FLUSH LOGS statement, or restart the server.

If you use the --log or -l options, mysqld writes a general log with a filename of `hostname.log', and restarts and refreshes do not cause a new log file to be generated (although it is closed and reopened). By default, the mysql.server script starts the MySQL server with the -l option. If you need better performance when you start using MySQL in a production environment, you can remove the -l option from mysql.server.

Update logging is smart since it logs only statements that really update data. So an UPDATE or a DELETE with a WHERE that finds no rows is not written to the log. It even skips UPDATE statements that set a column to the value it already has.

If you want to update a database from update log files, you could do the following (assuming your update logs have names of the form `file_name.#'):

shell> ls -1 -t -r file_name.[0-9]* | xargs cat | mysql

ls is used to get all the log files in the right order.

This can be useful if you have to revert to backup files after a crash and you want to redo the updates that occurred between the time of the backup and the crash. You can also use the update logs when you have a mirrored database on another host and you want to replicate the changes that have been made to the master database.
--------------------

Hope this is useful.
I'm simply wondering if anyone else is getting intermittent e-mails supposedly from 1 bug either empty, or with only a last sentence or so of what appears to be a paragraph. I've had a few of these now purporting to be from certain bugs with many posts in them, yet visiting these bugs the mysterious chunk of comment in the e-mail is not present. I can dig through my e-mails to pull out some specific examples if necessary. I find it hard to believe I'd be the only one this is happening to though.
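The ordering step of the quoted shell pipeline (oldest log first, as `ls -1 -t -r` gives) can also be written in Python, which may be easier to extend with error handling. This is a sketch under assumptions: the function names are made up, it orders by file modification time just as the `ls` invocation does, and it only produces the concatenated SQL text; feeding that text to the mysql client is left out.

```python
import glob
import os


def update_logs_in_order(prefix):
    """Return update-log paths matching prefix.[0-9]*, oldest first."""
    paths = glob.glob(glob.escape(prefix) + ".[0-9]*")
    # Sort by modification time, like `ls -t -r`. Sorting by name would put
    # file_name.10 before file_name.2.
    return sorted(paths, key=os.path.getmtime)


def concatenated_updates(prefix):
    """Join the logs in replay order; the result is what `xargs cat` emits."""
    chunks = []
    for path in update_logs_in_order(prefix):
        with open(path, "r") as handle:
            chunks.append(handle.read())
    return "".join(chunks)
```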
I haven't seen anything like that; please open a new bug with all possible details. And *PLEASE*, people, only put things about BUGZILLA PERFORMANCE PROBLEMS into this bug (bug 27146)!
Terry: I just got this when reassigning a bug:
-----------------------------------------------------------
Mid-air collision detected!
Someone else has made changes to this bug at the same time you were trying to. The changes made were:
Who What Old value New value When
Content-type: text/html
Software error:
SELECT attach_id FROM attachments WHERE bug_id = 30385: Table 'attachments' was not locked with LOCK TABLES at globals.pl line 134.
Please send mail to this site's webmaster for help.
---------------------------
so here it is.
Whoops. Fixed. (But why oh why do people insist on reporting non-performance related things in this bug which is supposed to be only for BUGZILLA PERFORMANCE PROBLEMS ???)
Adding an email thread between mitchell and Rickg.

Rick Gessner wrote:
The bugzilla site has become a performance bottleneck. I'm wondering if we have any plans to add hardware to improve this.

Mitchell Baker wrote:
Rick, last time we looked at this, Risto believed that hardware was not the problem. (I'm planning to add some anyway as a preventive measure as things heat up going forward.) Last time Risto was able to track down performance problems; to do so he needed specific data so he could check logs, etc. As you generate specific data, please add it to bug 27146. That will allow us to look into other potential problems as well. mitchell

Rick Gessner wrote:
So what do you need in terms of data? Should I make queries and run a stopwatch? Instinctively I know it's a problem because we're all spending more time waiting for bugzilla to respond. Rick

I'll let risto answer definitively as to the types of data that are needed to track network performance, i/o limits, database performance, etc. But useful info includes: whether you're making a query or updating; what types of queries; whether this is a constant "this seems slower" or periodic instances where something is really, really slow; etc.
I want to put it down here again: I don't believe the Bugzilla problem has much to do with system performance; it's more of a Bugzilla architecture issue. On the occasions when Bugzilla is slow, I haven't seen bottlenecks in the system. It's much the same as a 3-lane freeway that is blocked due to maintenance: it doesn't go any faster if the freeway is widened to 5 lanes. The blockage might clear faster if disk i/o were faster, but the main problem IMHO is that Bugzilla doesn't scale, and we can't keep adding faster hardware forever as we get more bugs. If anyone has exact times when Bugzilla has been slow, please give them to me so I can compare against the system logs again.
Since you wanted exact times: Right now (as I type this) I am waiting for a change that I entered into bug 27999 to be submitted. I've been waiting for the "Bug Processed" page for a good 2 or 3 minutes so far. It's 2000-03-16 12:08 PST.
For the record, the email I got about my above comment bug 27999 was dated 12:15:31 (on lounge.mozilla.org), but the comment in the bug was listed as 12:03.
tara@tequilarista.org is the new owner of Bugzilla and Bonsai. (For details, see my posting in netscape.public.mozilla.webtools, news://news.mozilla.org/38F5D90D.F40E8C1A%40geocast.com .)
Assignee: terry → tara
Status: REOPENED → NEW
Things to look at for hints as to what is going on:

mysqladmin processlist | grep -v Sleep

to see what is locked and because of what. Then running "EXPLAIN SELECT ..." on some of the queries that take forever can shed some light on how they could be optimized. But as Terry (I think) mentioned, if a query needs to look through everything then it is going to take a while no matter what. Tweaking the various mysqld parameters for buffer sizes etc. can often give big improvements too.
It's often slow between 12:20 and 12:30 am, west coast us (pacific) time.
20 0 * * * cd /export2/mysqlbackup ; ./grabbackups >> log 2>&1
You know what? This is more a mozilla.org thing. I mean, I'm all for scalability, but I'm not going to worry about doing benchmarking for mozilla.org at this point. I'm gonna reassign this to endico so she can figure out whether or not to continue caring.
Assignee: tara → endico
This seems to be pretty thoroughly evaluated and the problems we were having have been mostly fixed. Closing this rambling bug.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
bbbbbooooooooo!
Verifying for endico.
Status: RESOLVED → VERIFIED
In search of accurate queries.... (sorry for the spam)
Target Milestone: --- → Bugzilla 2.12
Blocks: 71861
No longer blocks: 71861
Moving closed bugs to Bugzilla product
Component: Bugzilla → Bugzilla-General
Product: Webtools → Bugzilla
QA Contact: endico → matty
Version: other → unspecified
QA Contact: matty_is_a_geek → default-qa