Closed Bug 422908 Opened 16 years ago Closed 16 years ago

consistent "The connection to the server was reset while the page was loading." on crash-stats search

Categories

(Socorro :: General, task, P1)

x86
Windows XP

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 444749

People

(Reporter: wsmwk, Assigned: morgamic)

References

Details

(Keywords: perf, regression)

consistent "The connection to the server was reset while the page was loading." on crash-stats search with "top 10 stack frames contains trash" 

http://crash-stats.mozilla.com/?do_query=1&product=Thunderbird&query_search=stack&query_type=contains&query=trash&date=&range_value=3&range_unit=months
I would recommend shortening the date range, but we should probably have a fulltext index on signature and top 10 frames if we are going to need this to be feasible.  ->0.6
Target Milestone: --- → 0.6
I hit this same bug periodically, but it is not in the case of a crash stats search. I hit it in the office and at wifi access points.
related to bug 423009?
That page doesn't seem to load for me.  Just spins.  Does it normally take a long time for it to load?

I don't know if it's specifically related to the BMO bug or not.  I've not investigated the non-BMO cluster to see if it's exhibiting the same symptoms of Apache rolling over after every request.
Mark, it can take several seconds to a minute or two, depending on the query.  However, the issue reported by this bug started at the same time as bug 423009.

The url in comment 0 still fails with "reset while page was loading"
Morgamic in comment 1 is on to it.  Mossop reports similar (maybe same) issue in Bug 424952 – Queries can be too slow. 

I played this morning with Product=Thunderbird, starts with "JS_strtod". It should return results page in less than a minute or two even for time period of a couple months.  It often fails to return results for 2-3 weeks.

Once you get results page for a time period, if you do another query within a minute or two using the same length time period but with different signature the results come back within a few seconds.  Example:
- get results of JS_strtod, time period of 4 weeks and thereafter it returns results in 3 seconds.
- change query to zzz, use same 4 weeks, results return in a few sec.
- change query to 5 weeks (with "zzz"), blammo - good luck getting results page

I'm marking regression because I didn't see this problem before a few weeks ago (mid-March maybe?).

=> Query is not scaling and is pretty much unusable, so maybe it should be sev=critical. Even period=2 weeks can take 3-5 minutes to get results, IF you get results.  What's the next milestone to be pushed out that might fix this?
Keywords: perf, regression
unusable (no results) this morning searching 1 week period for 
  firefox + branch=1.9 + nsDocLoader::QueryInterface + 1 week
adding version to the selection criteria helps get a response

ditto http://crash-stats.mozilla.com/topcrasher/byversion/Thunderbird/3.0a1pre - no response for first couple attempts.

this thing needs a *serious* performance bump.
yeah, it seems to be getting worse.  standard queries that were working fine a few days and weeks ago now are unable to run.

I know ted is back from vacation and morgamic and others have this pretty high on the list to get improved.

we really need it fixed by next week when firefox 3 RC1 gets released so we can stay on top of what's happening with incoming crash reports.
I believe this is primarily due to the database partition getting too large. See bug 432448, bug 432449, bug 432450.
It would be good if someone could take a look at this bug so we can use
this server to get a picture of the crash bugs we still have in RC1.
Depends on: 432448, 432449, 432450
aravind,  do you have cycles to help on this?

these bugs are pretty much blocking analysis on firefox 3 rc1 that is going out in the next day or so.

  [See dependency tree for bug 422908]
bug 432448: Add new partition [NEW ; assigned to nobody@mozilla.org; Target: 0.5] 

bug 432449: Add date constraints to older partitions [NEW ; assigned to nobody@mozilla.org; Target: 0.5]

bug 432450: Create script to auto-archive old records and create new partitions when needed [NEW ; assigned to nobody@mozilla.org; Target: 0.5] 
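
For illustration only, here is a rough sketch of what "add new partition" plus "add date constraints" could look like if the store were PostgreSQL using inheritance-based partitioning. That is an assumption; this bug does not say which database or schema is in use. The table name `reports`, the column name `date_processed`, and the helper `weekly_partition_ddl` are all hypothetical. The sketch only builds DDL strings and does not connect to any database.

```python
from datetime import date, timedelta


def weekly_partition_ddl(parent, column, start, weeks):
    """Build CREATE TABLE statements for consecutive one-week partitions.

    Assumes PostgreSQL inheritance-style partitioning; `parent` and
    `column` are hypothetical names, not the actual Socorro schema.
    """
    statements = []
    for i in range(weeks):
        lo = start + timedelta(weeks=i)
        hi = lo + timedelta(weeks=1)
        name = f"{parent}_{lo:%Y%m%d}"
        # The CHECK constraint bounds each partition by date, so a planner
        # with constraint exclusion enabled can skip partitions that fall
        # outside the queried date range.
        statements.append(
            f"CREATE TABLE {name} ("
            f"CHECK ({column} >= DATE '{lo.isoformat()}' "
            f"AND {column} < DATE '{hi.isoformat()}')"
            f") INHERITS ({parent});"
        )
    return statements


if __name__ == "__main__":
    for stmt in weekly_partition_ddl("reports", "date_processed", date(2008, 5, 5), 2):
        print(stmt)
```

An auto-archiving script along the lines of bug 432450 could call a helper like this on a schedule, though how the real script should work is up to whoever takes those bugs.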

Here's a trick that (at least sometimes) gives me better results:

Say I want to do a search over two weeks of crash records, but that
always times out.

Instead I try the same search over 1 day's worth of records -- which
works.  Then I try it over (say) 3 days, then 7 days, then two weeks
-- all in quick succession, each time waiting for the previous one to
succeed.
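
A minimal sketch of the widening-range trick above, in case anyone wants to script it. It only rewrites the `range_value` and `range_unit` parameters of the query URL from comment 0 and returns the URLs in the order they would be tried; it does not fetch anything. The unit strings `days` and `weeks` are assumptions (only `months` appears in this bug), and the helper names are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query URL from comment 0.
BASE_URL = (
    "http://crash-stats.mozilla.com/?do_query=1&product=Thunderbird"
    "&query_search=stack&query_type=contains&query=trash"
    "&date=&range_value=3&range_unit=months"
)


def with_range(url, value, unit):
    """Return `url` with range_value/range_unit replaced."""
    parts = urlsplit(url)
    # keep_blank_values preserves the empty "date=" parameter.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params["range_value"] = str(value)
    params["range_unit"] = unit
    return urlunsplit(parts._replace(query=urlencode(params)))


def widening_urls(url, steps=((1, "days"), (3, "days"), (7, "days"), (2, "weeks"))):
    """Build the same search over progressively longer ranges.

    The unit names "days" and "weeks" are assumed, not confirmed.
    """
    return [with_range(url, value, unit) for value, unit in steps]


if __name__ == "__main__":
    for u in widening_urls(BASE_URL):
        print(u)
```

Each URL would be requested in quick succession, waiting for the previous one to succeed before trying the next.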
(Following up comment #14)

This trick appears to have stopped working :-(
Hi there, can we get some extra traction on this bug?  It's getting worse and is now easier to reproduce.  And this data is important to analyze for QA.  Thanks.
Assignee: nobody → morgamic
Priority: -- → P1
I now can't even get a 1 day span. It's unusable
Severity: major → critical
IIUC, ATM only individual crash reports (such as linked from about:crashes) can be had from the Socorro servers. Everything else fails, and in particular complex queries like the one in comment #18.

See also bug 444749.
(In reply to comment #19)
> IIUC, ATM only individual crash reports (such as linked from about:crashes) can
> be had from the Socorro servers. Everything else fails, and in particular
> complex queries like the one in comment #18.
> 
> See also bug 444749.

The individual crash reports are taking minutes to load for me, raising severity and requesting blocking.
Severity: critical → blocker
Flags: blocking1.9.1?
Blocks: 450485
No longer blocks: 450485
Target Milestone: 0.6 → ---
No longer depends on: 432448, 432449, 432450
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → DUPLICATE
Flags: blocking1.9.1?
Component: Socorro → General
Product: Webtools → Socorro