Closed Bug 975227 Opened 11 years ago Closed 10 years ago

Report Releng EC2 stats to http://hostedgraphite.com/

Categories

(Infrastructure & Operations :: RelOps: General, task)

x86
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: taras.mozilla, Assigned: dividehex)

Attachments

(3 files)

Attached file relay_config.txt (deleted) —
We'd like to evaluate hosted graphite. This is high priority for releng.
Assignee: server-ops → relops
Component: Server Operations → RelOps
Product: mozilla.org → Infrastructure & Operations
QA Contact: shyam → arich
I estimate approx 4 hours to modify puppet/collectd and test the changes. When we are ready to do so, please send the config information supplied by the hosting vendor. The API key should be sent offline for security's sake.
I'm also going to assume we will need to open netflows to the hosted service from both scl3 and releng aws.
(In reply to Jake Watkins [:dividehex] from comment #1)
> I estimate approx 4 hours to modify puppet/collectd and test the changes.
> When we are ready to do so, please provide config information provided by
> the hosting vendor. The API key should be sent offline for security sake.

Sent you the key. Sample config is attached to bug, http://docs.hostedgraphite.com/ for docs
Depends on: 975638
hp4.relabs.releng.scl3.mozilla.com has been kickstarted for testing
(In reply to Jake Watkins [:dividehex] from comment #4)
> hp4.relabs.releng.scl3.mozilla.com has been kickstarted for testing

I don't see anything being reported.
Assignee: relops → jwatkins
This patch puts write_graphite and filters into their own classes and templates. It also adds the external graphite hosting as a second graphite write destination. The API key is stored in hiera. I've tested this on hp4 and metrics are being sent to both infra graphite and hostedgraphite.
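As a rough illustration of what "a second graphite write destination" means on the wire, here is a minimal Python sketch that formats one sample for two destinations using the carbon plaintext format (`path value timestamp`). It is not the patch itself, which does this through collectd's write_graphite plugin. The destination names, the metric path, and the idea that the hosted service identifies the account by an API-key prefix on the metric path are assumptions made for the example; the key shown is a placeholder.

```python
from typing import Dict


def carbon_line(path: str, value: float, timestamp: int, prefix: str = "") -> str:
    """Format one sample in the carbon plaintext format: 'path value timestamp\\n'."""
    return f"{prefix}{path} {value} {timestamp}\n"


def fan_out(path: str, value: float, timestamp: int,
            destinations: Dict[str, str]) -> Dict[str, str]:
    """Build the line each destination would receive.

    `destinations` maps a destination name to the prefix prepended to the
    metric path for that destination (empty string for no prefix).
    """
    return {
        name: carbon_line(path, value, timestamp, prefix)
        for name, prefix in destinations.items()
    }


# Hypothetical destinations: an internal graphite with no prefix, and a
# hosted service that is assumed to expect "<api key>." before the path.
destinations = {
    "infra": "",
    "hosted": "HYPOTHETICAL-API-KEY.",
}

lines = fan_out("hp4.load.load.shortterm", 0.42, 1393272000, destinations)
for name, line in lines.items():
    print(name, repr(line))
```

Keeping the key out of the code and passing it in as data mirrors the patch's choice to keep the API key in hiera rather than in the template.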
Attachment #8380870 - Flags: review?(dustin)
Comment on attachment 8380870
bug975227-dup-metrics-ext-graphite.patch

Review of attachment 8380870:
-----------------------------------------------------------------

Looks good with some very minor revisions.

::: manifests/moco-config.pp
@@ +219,5 @@
> +
> + if (secret('graphite_apikey_hostedgraphite') == ""){
> + fail("missing graphite_apikey_hostedgraphite")
> + }
> + $collectd_graphite_hosted_apikey = secret('graphite_apikey_hostedgraphite')

It'd be nice if the variable and hiera names were the same

@@ +221,5 @@
> + fail("missing graphite_apikey_hostedgraphite")
> + }
> + $collectd_graphite_hosted_apikey = secret('graphite_apikey_hostedgraphite')
> +
> + $collectd_write = {

I don't see this variable in base.pp (and it should be documented on the wiki, just as a reminder :)

::: modules/collectd/manifests/plugins/filters.pp
@@ +11,5 @@
> +
> + validate_hash($write_chains)
> +
> + file {
> + "${collectd::settings::plugindir}/${plugin_name}.conf":

This should probably just be filters.conf, since the name is hard-coded below in the template

::: modules/collectd/manifests/plugins/write_graphite.pp
@@ +11,5 @@
> +
> + validate_hash($nodes)
> +
> + file {
> + "${collectd::settings::plugindir}/${plugin_name}.conf":

Same here - no sense using ${plugin_name}
Attachment #8380870 - Flags: review?(dustin) → review+
Comment on attachment 8380870
bug975227-dup-metrics-ext-graphite.patch

Checked in and pushed to production w/recommended changes
https://hg.mozilla.org/build/puppet/rev/e4f16f789ff4
Attachment #8380870 - Flags: checked-in+
...and of course there was a typo in the write_graphite.conf

- SeperateInstances true
+ SeparateInstances true

Since collectd happily ignored the typo, this changed the folder layout and created new db files instead of reporting to the old files. For instance,

Original folder layout:
drwxr-xr-x 4 carbon carbon 50 Feb 22 01:15 aggregation
drwxr-xr-x 5 carbon carbon 58 Feb 22 01:12 df
drwxr-xr-x 6 carbon carbon 69 Feb 22 01:23 disk
drwxr-xr-x 3 carbon carbon 25 Feb 22 01:15 ethstat
drwxr-xr-x 4 carbon carbon 40 Feb 22 01:11 interface
drwxr-xr-x 3 carbon carbon 25 Feb 22 01:10 load
drwxr-xr-x 3 carbon carbon 137 Feb 24 19:55 memory
drwxr-xr-x 4 carbon carbon 4096 Feb 24 19:55 swap
drwxr-xr-x 2 carbon carbon 31 Feb 22 01:16 uptime

Broken layout:
drwxr-xr-x 4 carbon carbon 50 Feb 22 01:15 aggregation
drwxr-xr-x 2 carbon carbon 4096 Feb 24 19:59 aggregation-cpu-average
drwxr-xr-x 2 carbon carbon 4096 Feb 24 19:58 aggregation-cpu-sum
drwxr-xr-x 5 carbon carbon 58 Feb 22 01:12 df
drwxr-xr-x 2 carbon carbon 104 Feb 24 19:54 df-boot
drwxr-xr-x 2 carbon carbon 104 Feb 24 19:53 df-dev-shm
drwxr-xr-x 2 carbon carbon 104 Feb 24 19:55 df-root
drwxr-xr-x 6 carbon carbon 69 Feb 22 01:23 disk
drwxr-xr-x 6 carbon carbon 93 Feb 24 19:56 disk-sda
drwxr-xr-x 6 carbon carbon 93 Feb 24 19:57 disk-sda1
drwxr-xr-x 6 carbon carbon 93 Feb 24 19:56 disk-sda2
drwxr-xr-x 6 carbon carbon 93 Feb 24 19:58 disk-sda3
drwxr-xr-x 3 carbon carbon 25 Feb 22 01:15 ethstat
drwxr-xr-x 2 carbon carbon 70 Feb 24 19:54 ethstat-eth0
drwxr-xr-x 4 carbon carbon 40 Feb 22 01:11 interface
drwxr-xr-x 5 carbon carbon 71 Feb 24 19:55 interface-eth0
drwxr-xr-x 5 carbon carbon 71 Feb 24 19:55 interface-eth1
drwxr-xr-x 3 carbon carbon 25 Feb 22 01:10 load
drwxr-xr-x 3 carbon carbon 137 Feb 24 19:55 memory
drwxr-xr-x 4 carbon carbon 4096 Feb 24 19:55 swap
drwxr-xr-x 2 carbon carbon 31 Feb 22 01:16 uptime

I've patched this but we will need to remove the erroneous folders and new dbs.
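To make the layout change above easier to follow, here is a small Python sketch of how the metric name differs with and without SeparateInstances, plus a heuristic for spotting the stray directories. This is a simplified model for illustration, not collectd's or carbon's actual code: the function names, the example host, and the df_complex/free type names are placeholders. The model assumes that with the option off (the default, which is what applies when the misspelled option is ignored) the plugin and its instance are joined with a dash in one path component, and that each dot-separated component becomes a directory on the carbon side.

```python
from typing import List


def graphite_path(host: str, plugin: str, plugin_instance: str,
                  type_: str, type_instance: str,
                  separate_instances: bool) -> str:
    """Simplified model of the metric name built for one value."""
    sep = "." if separate_instances else "-"
    plugin_part = plugin + (sep + plugin_instance if plugin_instance else "")
    type_part = type_ + (sep + type_instance if type_instance else "")
    return ".".join([host, plugin_part, type_part])


def stray_dirs(names: List[str]) -> List[str]:
    """Heuristic: flag 'plugin-instance' directories whose 'plugin' part
    also exists as its own directory. A plugin whose real name contains a
    dash would be flagged too, so review the result before deleting."""
    present = set(names)
    return [n for n in names if "-" in n and n.split("-", 1)[0] in present]


# Intended layout (option spelled correctly and set to true).
good = graphite_path("hp4", "df", "boot", "df_complex", "free", True)
# Layout produced when the option is ignored and the default applies.
bad = graphite_path("hp4", "df", "boot", "df_complex", "free", False)
print(good)
print(bad)

# Directory names taken from a subset of the broken layout listing.
listing = ["df", "df-boot", "df-dev-shm", "disk", "disk-sda", "load"]
print(stray_dirs(listing))
```

The heuristic only names candidates; it does not delete anything, which seems the safer default when the old and new whisper files sit side by side.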
http://hg.mozilla.org/build/puppet/rev/66a8405f257d
Removes hostedgraphite from the collectd destination list as requested in https://bugzilla.mozilla.org/show_bug.cgi?id=971883#c18
Attachment #8381816 - Flags: review?(dustin)
Attachment #8381816 - Flags: review?(dustin) → review+
Releng moved to using Diamond in AWS instead.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX