Closed Bug 976415 Opened 11 years ago Closed 9 years ago

Make AWS node type available to graphite & build metadata

Categories

(Release Engineering :: General, defect)

x86_64
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: taras.mozilla, Assigned: catlee)

References

Details

Attachments

(3 files, 1 obsolete file)

This will make it easier to diagnose issues when moving between instance types (e.g. m1.medium to m3.medium), to track perf via graphite, and to figure out random speedups/slowdowns*. http://glandium.org/blog/?p=3201
Not really practical ATM since we assume that hostname == buildbot slave name, and I don't think we want to pre-allocate all possible instance type / slave # combinations in buildbot. We've been discussing ways to break the hostname == buildbot slave name requirement, so maybe wait for that? Or, is there another way we could make this data available to make data analysis possible?
We could fake it in graphite. We still need an easy way to get it into build logs, etc. We could just set the machine-local hostname in /etc/hostname?
The important things are getting the instance type into graphite, and also into the build metadata and logs. The instance id should also go into the build metadata and logs.
Assignee: nobody → catlee
OS: Windows 8.1 → All
Summary: Put AWS node type into hostname → Make AWS node type available to graphite & build metadata
A few pieces here:
- a script to grab metadata from AWS's metadata service and dump it into /etc/instance_metadata.json
- an init service to make sure the above script is run on boot
- a diamond collector to submit the instance type to graphite

I'll be reading the instance_metadata.json file into buildbot properties as well.
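(For reference, a minimal sketch of what such a metadata-dumping script could look like, assuming the standard EC2 instance metadata service at http://169.254.169.254; the field names, output keys, and error handling here are illustrative, not necessarily what the attached instance_metadata.py does:)

#!/usr/bin/env python
# Sketch: fetch selected fields from the EC2 instance metadata service and
# dump them to /etc/instance_metadata.json. Illustrative only.
import json
import urllib2

METADATA_URL = "http://169.254.169.254/latest/meta-data/"
OUTPUT = "/etc/instance_metadata.json"


def get(path):
    # Each metadata item is exposed as a plain-text HTTP resource.
    return urllib2.urlopen(METADATA_URL + path, timeout=5).read().strip()


def main():
    data = {
        "instance-id": get("instance-id"),
        "instance-type": get("instance-type"),
        "ami-id": get("ami-id"),
    }
    with open(OUTPUT, "w") as f:
        json.dump(data, f)


if __name__ == "__main__":
    main()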
Attachment #8385451 - Flags: review?(rail)
Comment on attachment 8385451 [details] [diff] [review]
get instance metadata and submit some of it to graphite

Review of attachment 8385451 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/instance_metadata/files/InstanceMetadataCollector.conf
@@ +1,3 @@
> +enabled=True
> +interval=600
> +path=instance_metadata

No idea about the format, but it looks good. :)

::: modules/instance_metadata/files/instance_metadata.initd
@@ +26,5 @@
> +DESC="instance_metadata"
> +
> +CMD=/usr/local/bin/instance_metadata.py
> +OUTPUT=/etc/instance_metadata.json
> +PYTHON=/tools/python27/bin/python

Can you use ${packages::mozilla::python27::python} here so it doesn't bite us if we decide to upgrade?
interdiff:

diff --git a/modules/instance_metadata/manifests/init.pp b/modules/instance_metadata/manifests/init.pp
index 2ef9418..871dfb8 100644
--- a/modules/instance_metadata/manifests/init.pp
+++ b/modules/instance_metadata/manifests/init.pp
@@ -31,7 +31,7 @@ class instance_metadata {
     file {
         "/etc/init.d/instance_metadata":
             require => File["/usr/local/bin/instance_metadata.py"],
-            source => "puppet:///modules/instance_metadata/instance_metadata.initd",
+            content => template("instance_metadata/instance_metadata.initd.erb"),
             mode => 0755,
             owner => "root",
             notify => Service["instance_metadata"];
diff --git a/modules/instance_metadata/files/instance_metadata.initd b/modules/instance_metadata/templates/instance_metadata.initd.erb
similarity index 95%
rename from modules/instance_metadata/files/instance_metadata.initd
rename to modules/instance_metadata/templates/instance_metadata.initd.erb
index 71484cd..0acb52b 100644
--- a/modules/instance_metadata/files/instance_metadata.initd
+++ b/modules/instance_metadata/templates/instance_metadata.initd.erb
@@ -27,7 +27,7 @@ DESC="instance_metadata"

 CMD=/usr/local/bin/instance_metadata.py
 OUTPUT=/etc/instance_metadata.json
-PYTHON=/tools/python27/bin/python
+PYTHON=<%= scope.lookupvar('::packages::mozilla::python27::python') %>

 test -x ${CMD} || exit 0
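(For context on the InstanceMetadataCollector.conf reviewed above: a diamond collector along these lines would typically look something like the sketch below, using the stock diamond.collector.Collector API. The class name, metric path, and JSON key are assumptions, not necessarily what the attached patch contains.)

# Sketch of a diamond collector that reads /etc/instance_metadata.json and
# publishes the instance type to graphite. Illustrative only.
import json

import diamond.collector


class InstanceMetadataCollector(diamond.collector.Collector):
    def collect(self):
        try:
            with open('/etc/instance_metadata.json') as f:
                metadata = json.load(f)
        except (IOError, ValueError):
            # File missing or unreadable; skip this collection cycle.
            return
        # Graphite metrics are numeric, so encode the instance type in the
        # metric path and publish a constant value of 1.
        instance_type = metadata.get('instance-type', 'unknown').replace('.', '_')
        self.publish('instance_type.%s' % instance_type, 1)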
Attachment #8385451 - Attachment is obsolete: true
Attachment #8385451 - Flags: review?(rail)
Attachment #8385596 - Flags: review?(rail)
Attachment #8385596 - Flags: review?(rail) → review+
Attachment #8385596 - Flags: checked-in+
puppet patch in production
This should eventually be moved from node definitions to toplevel classes. Do you have an idea which toplevel class it will move to? What blocks moving it there now?
I think ideally it's on all nodes, and instance_metadata already is. We don't have diamond packaged for all our nodes, which is why the diamond bits are limited to specific node types. John, do you recall why you put the diamond include at the node definition rather than in toplevel::slave somewhere?
Flags: needinfo?(jhopkins)
Ah, I missed that instance_metadata is in toplevel. diamond is only temporarily in the node defs until it's implemented everywhere. Thanks for the explanation!
Flags: needinfo?(jhopkins)
Can this information be added to build logs by buildbot? (Should I file a separate bug for this?)
Yes, I'm working on that as well
Pushed https://hg.mozilla.org/build/puppet/rev/23f1c183d846 to make sure the instance metadata is readable (it's mode 0600 root:root ATM)
Ubuntu nodes are running facter-1.7.5 now, so you should be able to get most of this data (and more) from facter now.
something here is in production
Attachment #8389952 - Flags: review?(bhearsum)
Attached patch call metadata script (deleted) — Splinter Review
Attachment #8389953 - Flags: review?(bhearsum)
Attachment #8389953 - Flags: review?(bhearsum) → review+
Attachment #8389952 - Flags: review?(bhearsum) → review+
Attachment #8389952 - Flags: checked-in+
Attachment #8389953 - Flags: checked-in+
Depends on: 983742
in production:
bug 976415 - don't flunk on failure
bug 976415 - Make sure we can read the file before reading it
Some builds now have the AWS instance data in the logs, e.g. https://tbpl.mozilla.org/php/getParsedLog.php?id=36162353&tree=Mozilla-Inbound&full=1

========= Started set props: aws_ami_id aws_instance_id aws_instance_type (results: 0, elapsed: 0 secs) (at 2014-03-14 13:21:51.212797) =========
python tools/buildfarm/maintenance/get_instance_metadata.py
 in dir /builds/slave/m-in-l64-000000000000000000000/. (timeout 1200 secs)
 watching logfiles {}
 argv: ['python', 'tools/buildfarm/maintenance/get_instance_metadata.py']
 environment:
  CCACHE_HASHDIR=
  CVS_RSH=ssh
  G_BROKEN_FILENAMES=1
  HISTCONTROL=ignoredups
  HISTSIZE=1000
  HOME=/home/cltbld
  HOSTNAME=bld-linux64-ec2-314.build.releng.usw2.mozilla.com
  LANG=en_US.UTF-8
  LESSOPEN=|/usr/bin/lesspipe.sh %s
  LOGNAME=cltbld
  MAIL=/var/spool/mail/cltbld
  PATH=/usr/local/bin:/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cltbld/bin
  PWD=/builds/slave/m-in-l64-000000000000000000000
  SHELL=/bin/bash
  SHLVL=1
  TERM=linux
  TMOUT=86400
  USER=cltbld
  _=/tools/buildbot/bin/python
 using PTY: False
{"aws_ami_id": "ami-6eea8b5e", "aws_instance_id": "i-45a9394c", "aws_instance_type": "c3.xlarge"}
program finished with exit code 0
elapsedTime=0.014326
aws_ami_id: u'ami-6eea8b5e'
aws_instance_id: u'i-45a9394c'
aws_instance_type: u'c3.xlarge'

These are also set as properties which will be accessible via the build status json, etc.
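(For anyone wondering how the properties get set: a buildbot step along these lines would do it. This is a sketch assuming buildbot 0.8's SetProperty step with an extract_fn; the exact step configuration in buildbotcustom may differ, and the factory/step names here are hypothetical.)

# Sketch: turn the script's JSON output into build properties.
import json

from buildbot.process.factory import BuildFactory
from buildbot.steps.shell import SetProperty


def extract_instance_metadata(rc, stdout, stderr):
    # The script prints a single JSON object; on failure, set no properties.
    if rc != 0:
        return {}
    try:
        return json.loads(stdout)
    except ValueError:
        return {}


f = BuildFactory()
f.addStep(SetProperty(
    name='set_instance_metadata',
    command=['python', 'tools/buildfarm/maintenance/get_instance_metadata.py'],
    extract_fn=extract_instance_metadata,
    flunkOnFailure=False,  # matches the "don't flunk on failure" follow-up
))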
Is this diamond information still useful? These classes were left in the node definitions "temporarily", and they cause instance_metadata to be installed, but now we want to run that from runner. I'll remove them for the moment in bug 1046926.
I think the diamond info is still useful, yes. How do you recommend we proceed?
I didn't end up changing the node definitions, but we should. I think that the 'include diamond' should get moved to the appropriate buildslave toplevel classes, and the instance_metadata specific bits moved to diamond::instance_metadata.
I think diamond has been killed.
Status: NEW → RESOLVED
Closed: 9 years ago
QA Contact: mshal
Resolution: --- → FIXED
