Closed
Bug 778805
Opened 12 years ago
Closed 12 years ago
Verify Zeus thift_check.py properly reports failures to Zeus for socorro staging
Categories
(Infrastructure & Operations Graveyard :: WebOps: Other, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bburton, Assigned: bburton)
References
Details
We're seeing ongoing connectivity issues to Socorro staging's HBase pool, but Zeus is not recording any errors with backend nodes, unlike the Postgres pool.
Work with :tmary to stop HBase on one of the Socorro Staging nodes (hp-node62 - hp-node69) and confirm the Zeus check (https://pp-zlb01.phx.mozilla.net:9090/apps/zxtm/index.fcgi?section=Extra%20Files%3AExternProgMonitors) is working properly.
Assignee | ||
Comment 1•12 years ago
|
||
Per IRC and Zeus logs
20:34:13 tmary | solarce: done
[30/Jul/2012:11:34:25 -0700] WARN monitors/socorro-thrift-check monitorfail Monitor has detected a failure in node '10.8.100.62:9090': Monitor exited, exit code 1, no output generated
[30/Jul/2012:11:34:25 -0700] SERIOUS pools/socorro-thrift-stage:9090 nodes/10.8.100.62:9090 nodefail Node 10.8.100.62 has failed - A monitor has detected a failure
Status: NEW → ASSIGNED
Assignee | ||
Comment 2•12 years ago
|
||
Confirmed happy again, monitor is working
[30/Jul/2012:11:45:56 -0700] INFO monitors/socorro-thrift-check monitorok Monitor is working for node '10.8.100.62:9090'.
[30/Jul/2012:11:45:57 -0700] INFO pools/socorro-thrift-stage:9090 nodes/10.8.100.62:9090 nodeworking Node 10.8.100.62 is working again
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Updated•6 years ago
|
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•