Closed Bug 977611 Opened 11 years ago Closed 11 years ago
Use local storage for builds
Categories: Release Engineering :: General, defect
Tracking: (Not tracked)
Status: RESOLVED FIXED
People: Reporter: catlee; Unassigned
References
Details
Attachments
(5 files, 10 obsolete files)
patch (deleted) - rail: review+, rail: feedback+, massimo: checked-in+
patch (deleted) - catlee: review+, massimo: checked-in+
patch (deleted) - catlee: review+, massimo: checked-in+
patch (deleted) - catlee: review+, massimo: checked-in+
patch (deleted) - catlee: review+, massimo: checked-in+
With only a handful of builders per slave, we should be able to use the local SSD drives attached to most EC2 instances for builds. This will be awesome for lots of reasons:
- Less waiting for EBS -> faster builds! -> less $$$
- Less IO to EBS -> less $$$
- We can reduce the disk size of our EBS volumes -> less $$$
Comment 1•11 years ago
To make this happen we need to teach lvm-init how to detect available ephemeral storage, so we can use different instance types without worrying about their lvm configs.
We should also rethink how we generate AMIs for spot builders. ATM we clone one of the on-demand instances without any modifications like disk size or ephemeral storage configuration.
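The ephemeral-storage detection mentioned here can be sketched roughly as follows. This is not code from any patch in this bug: the helper names are ours, and the `sdX` to `xvdX` renaming rule is an assumption about how Xen kernels expose the devices listed by the EC2 instance metadata service (`http://169.254.169.254/latest/meta-data/block-device-mapping/`).

```python
def ephemeral_device_names(mapping_listing):
    """Filter the ephemeral entries out of the newline-separated keys
    returned by the block-device-mapping metadata endpoint."""
    return sorted(key for key in mapping_listing.splitlines()
                  if key.startswith('ephemeral'))


def to_linux_device(metadata_device):
    """Map a device name as reported by the metadata service (e.g. 'sdb')
    to the name the kernel typically exposes (e.g. '/dev/xvdb').
    Assumption: sdX is renamed to xvdX on these instance types."""
    if metadata_device.startswith('sd'):
        return '/dev/xvd' + metadata_device[2:]
    return '/dev/' + metadata_device
```

With something like this, the same script could handle any instance type without a per-type lvm config.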
Comment 2•11 years ago
Hi Chris,
this patch (based on your scripts) adds the jacuzzi_metadata script and manage_instance_storage.py.
jacuzzi_metadata downloads the content of http://jacuzzi-allocator.pub.build.mozilla.org/v1/machines/<hostname> locally at boot, and manage_instance_storage.py uses the content of that file to decide where to mount the instance storage (if any).
The jacuzzi_metadata script is executed at boot time, before the manage instance script.
I am still working on the manage instance script to cover all the possible scenarios and edge cases. In case you need it, dev-bld-linux64-ec2-catlee already has the scripts checked out and is ready for testing.
Attachment #8398527 - Flags: feedback?(catlee)
Comment 3•11 years ago
Comment on attachment 8398527 [details] [diff] [review]
[puppet] Bug 977611 - manage instance storage script.patch
Review of attachment 8398527 [details] [diff] [review]:
-----------------------------------------------------------------
Woot! I looked at manage_instance_storage.py only and have 2 minor comments:
::: modules/aws/files/manage_instance_storage.py
@@ +107,5 @@
> + need_format = True
> + output = get_output_form_cmd(cmd=blkid_cmd, raise_on_error=False)
> + if output:
> + for line in output.splitlines():
> + if 'ID_FS_TYPE=ext4' in line:
IIRC, by default they come formatted as ext3. It's worth checking for both ext3 and ext4.
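The check rail is asking for could look roughly like this (the function name is ours, and we assume the script is parsing blkid's udev-style `ID_FS_TYPE=` output, as the quoted hunk suggests):

```python
def needs_format(blkid_output):
    """Return True unless blkid's udev-style output already reports an
    ext3 or ext4 filesystem on the device."""
    for line in blkid_output.splitlines():
        if 'ID_FS_TYPE=ext3' in line or 'ID_FS_TYPE=ext4' in line:
            return False
    return True
```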
@@ +189,5 @@
> + # just in case...
> + # log fstab content before updating it
> + log.debug(read_fstab())
> + old_fstab = read_fstab()
> + with open('/etc/fstab', 'w') as out_fstab:
I'd rather write to a temporary file and rename it later.
Attachment #8398527 - Flags: feedback+
Comment 4•11 years ago
Added the requested fixes:
* skipping formatting if the disk is already ext3 or ext4
* using a temporary file to modify /etc/fstab rather than editing it in place
Attachment #8398527 - Attachment is obsolete: true
Attachment #8398527 - Flags: feedback?(catlee)
Attachment #8398580 - Flags: feedback?(rail)
Reporter
Comment 5•11 years ago
Comment on attachment 8398580 [details] [diff] [review]
[puppet] Bug 977611 - manage instance storage script II.patch
Review of attachment 8398580 [details] [diff] [review]:
-----------------------------------------------------------------
::: modules/aws/files/manage_instance_storage.py
@@ +53,5 @@
> + raise
> + return False
> +
> +
> +def get_output_form_cmd(cmd, cwd=None, raise_on_error=True):
nit: s/_form_/_from_/
@@ +154,5 @@
> +
> +def fstab_line(device):
> + """check if device is in fstab"""
> + is_fstab_line = False
> + for line in read_fstab():
should probably make this skip lines beginning with #
@@ +203,5 @@
> +
> +
> +def my_name():
> + import socket
> + return socket.gethostname().partition('.')[0]
is this used, or leftover from debugging?
@@ +214,5 @@
> + _mount_point = '/mnt/instance_storage'
> + try:
> + with open(jacuzzi_metadata_file) as data_file:
> + if json.load(data_file):
> + # hey I am a Jacuzzi!
can you add a check here that # of builders in the jacuzzi is <= 3?
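Putting catlee's request together with the metadata handling above, the decision might be sketched like this. This is our illustration, not the patch: the `builders` key and the shape of the jacuzzi metadata JSON are assumptions, and the jacuzzi mount point here uses the /builds/slave value the bug later settles on.

```python
import json

DEFAULT_MOUNT_POINT = '/mnt/instance_storage'
JACUZZI_MOUNT_POINT = '/builds/slave'
MAX_JACUZZI_BUILDERS = 3  # the limit catlee asks for in this review


def choose_mount_point(metadata_text):
    """Pick a mount point from the raw contents of jacuzzi_metadata.json.
    Unreadable, empty, or oversized metadata falls back to the default."""
    try:
        data = json.loads(metadata_text)
    except ValueError:
        return DEFAULT_MOUNT_POINT
    if not data:
        return DEFAULT_MOUNT_POINT
    builders = data.get('builders', [])
    if 0 < len(builders) <= MAX_JACUZZI_BUILDERS:
        # hey I am a Jacuzzi!
        return JACUZZI_MOUNT_POINT
    return DEFAULT_MOUNT_POINT
```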
Attachment #8398580 - Flags: feedback+
Comment 6•11 years ago
Comment on attachment 8398580 [details] [diff] [review]
[puppet] Bug 977611 - manage instance storage script II.patch
Review of attachment 8398580 [details] [diff] [review]:
-----------------------------------------------------------------
lgtm
don't forget that manage_instance_storage.py may conflict with lvm-init from aws_watch_pending.py (for spot instances) and aws_create_instance.py (for on-demand instances). We should prep a deployment plan for this.
Attachment #8398580 - Flags: feedback?(rail) → feedback+
Comment 7•11 years ago
(In reply to rail from comment #6)
> Comment on attachment 8398580 [details] [diff] [review]
> [puppet] Bug 977611 - manage instance storage script II.patch
>
> Review of attachment 8398580 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> lgtm
>
> don't forget that manage_instance_storage.py may conflict with lvm-init from
> aws_watch_pending.py (for spot instances) and aws_create_instance.py (for
> on-demand instances). We should prep a deployment plan for this.
If you delete /etc/lvm-init/lvm-init.json as part of this patch, that should be enough to prevent lvm-init from running on boot.
Comment 8•11 years ago
Thanks for the feedback!
This patch fixes the errors you found and adds a get_builders_from() function to handle a broken, malformed, or missing jacuzzi_metadata.json file.
Attachment #8398580 - Attachment is obsolete: true
Attachment #8399498 - Flags: feedback?
Comment 9•11 years ago
(In reply to rail from comment #7)
> If you delete /etc/lvm-init/lvm-init.json as a part of this patch this
> should be enough to prevent lvm-init working on boot.
Rail, the last patch does not change anything about lvm-init; I am trying to work out whether it is possible to remove lvm-init completely.
Comment 10•11 years ago
(In reply to Massimo Gervasini [:mgerva] from comment #9)
> (In reply to rail from comment #7)
> > If you delete /etc/lvm-init/lvm-init.json as a part of this patch this
> > should be enough to prevent lvm-init working on boot.
>
> Rail, last patch does not change anything about lvm-init, I am trying to
> understand if it is possible to remove lvm-init completely
yup, you can do this:
1) rm the file I mentioned (puppet)
2) for CentOS, make sure that the lvm-init package is absent (puppet)
3) for Ubuntu, make sure that /sbin/lvm-init is absent (puppet)
Once these are landed:
4) delete the lvm-init related code from aws_create_instance.py and scripts/aws_watch_pending.py
Comment 12•11 years ago
Comment on attachment 8400035 [details] [diff] [review]
[cloud-tools] remove lvm-init configuration files.patch
lgtm
Attachment #8400035 - Flags: feedback?(rail) → feedback+
Comment 13•11 years ago
This patch:
* removes /etc/lvm-init/lvm-init.json
* removes the lvm-init package on CentOS
* removes /sbin/lvm-init on Ubuntu
Attachment #8400053 - Flags: feedback?(rail)
Comment 14•11 years ago
Comment on attachment 8400053 [details] [diff] [review]
[puppet] instance_storage.pp.patch
Review of attachment 8400053 [details] [diff] [review]:
-----------------------------------------------------------------
lgtm.
BTW, we should remove the last block (removing lvm-init) when we are done with that.
Attachment #8400053 - Flags: feedback?(rail) → feedback+
Comment 15•11 years ago
Updated the jacuzzi mount point to /builds/slave.
We already have hg/git shared directories in /builds; if we mounted the instance storage directly on /builds, we would have to clone the shared directories again.
Attachment #8400674 - Flags: feedback?(rail)
Updated•11 years ago
Attachment #8400674 - Flags: feedback?(rail) → feedback+
Comment 16•11 years ago
Hi catlee,
this patch contains only two fixes:
* added:
require => Class['packages::mozilla::python27']
in modules/jacuzzi_metadata/manifests/init.pp, otherwise /usr/local/bin/jacuzzi_metadata.py fails on its first run because python27 is not yet available in /tools
* fixed:
wrong JACUZZI_METADATA_FILE path in manage_instance_storage.py
Attachment #8399498 - Attachment is obsolete: true
Attachment #8400053 - Attachment is obsolete: true
Attachment #8400674 - Attachment is obsolete: true
Attachment #8399498 - Flags: feedback?
Attachment #8401276 - Flags: review?(catlee)
Reporter
Comment 17•11 years ago
Comment on attachment 8401276 [details] [diff] [review]
[puppet] Bug 977611 - use local storage for builds.patch
Review of attachment 8401276 [details] [diff] [review]:
-----------------------------------------------------------------
::: modules/aws/manifests/instance_storage.pp
@@ +27,5 @@
> + enable => true;
> + }
> + file {
> + "/etc/lvm-init/lvm-init.json":
> + ensure => absent;
nit: indent this one more level. same below
Attachment #8401276 - Flags: review?(catlee) → review+
Comment 18•11 years ago
Hi catlee,
in this patch:
* fixed indentation
* updated fstab line to have dump = 0 and fsck = 0
* updated string formatting
Attachment #8401276 - Attachment is obsolete: true
Attachment #8401442 - Flags: review?(catlee)
Reporter
Comment 19•11 years ago
Comment on attachment 8401442 [details] [diff] [review]
[puppet] Bug 977611 - use local storage for builds.patch
Review of attachment 8401442 [details] [diff] [review]:
-----------------------------------------------------------------
shipit!
_~
_~ )_)_~
)_))_))_)
_!__!__!_
\______t/
~~~~~~~~~~~~~
Attachment #8401442 - Flags: review?(catlee) → review+
Updated•11 years ago
Attachment #8401442 - Flags: checked-in+
Comment 20•11 years ago
Now the try machines also mount the instance storage disk(s) under /builds/slave.
Attachment #8401483 - Flags: review?(catlee)
Comment 21•11 years ago
/etc/slave-trustlevel is a single line, no need for lists here
Attachment #8401483 - Attachment is obsolete: true
Attachment #8401483 - Flags: review?(catlee)
Attachment #8401504 - Flags: review?(catlee)
Reporter
Updated•11 years ago
Attachment #8401504 - Flags: review?(catlee) → review+
Updated•11 years ago
Attachment #8401504 - Flags: checked-in+
Reporter
Comment 22•11 years ago
in production
Comment 23•11 years ago
Hi catlee,
this new patch takes care of unmounting the logical volume if it is already mounted; it also takes care of turning off the swap file (if any).
I did not find any easy method to map the output of vgs -o lv_path to an entry in fstab, so the code looks messy. I'm happy to nuke that part of the code if there is a better way to do it.
Attachment #8401963 - Flags: feedback?(catlee)
Comment 24•11 years ago
Hi Chris,
here's a slightly better version of the previous patch for volumes that are already mounted or swap files already in use.
Changes:
* removed redundant checks
* updated logging messages
* minor fixes
Attachment #8401963 - Attachment is obsolete: true
Attachment #8401963 - Flags: feedback?(catlee)
Attachment #8402579 - Flags: review?(catlee)
Reporter
Comment 25•11 years ago
Comment on attachment 8402579 [details] [diff] [review]
[puppet] Bug 977611 - unmount already mounted vg and swapfile.patch
Review of attachment 8402579 [details] [diff] [review]:
-----------------------------------------------------------------
just a few tweaks required, then we can land it I think!
::: modules/aws/files/manage_instance_storage.py
@@ +148,5 @@
> + # vgs command failed, no volume groups
> + log.debug('there are no volume groups')
> + except IndexError:
> + log.debug('No volume groups found')
> + return token
as written, this returns the original token name both when the 'vgs -o <token>' command fails and when its output is empty, so callers can't tell that no volume group was found.
perhaps move the 'return token' inside the try: clause, and return None after the except clauses?
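The restructuring being suggested can be sketched by separating the parsing from the subprocess call (the function name is ours): the caller runs `vgs --noheadings -o <token>`, passes the raw output in, and gets None back whenever there is nothing to parse.

```python
def parse_vgs_value(raw_output):
    """Return the first field of `vgs --noheadings -o <token>` output,
    or None when there are no volume groups (empty output)."""
    try:
        # the value is returned inside the try, as the review suggests
        return raw_output.split()[0]
    except IndexError:
        # empty output: no volume groups
        return None
```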
@@ +246,5 @@
> + for line in read_fstab():
> + if old_fstab_line not in line:
> + out_fstab.write(line)
> + log.debug('removed %s from /etc/fstab', old_fstab_line.strip())
> + os.rename(temp_fstab.name, '/etc/fstab')
wrap this in a try/except block where you delete temp_fstab.name in the case of unhandled exceptions.
Attachment #8402579 - Flags: review?(catlee) → review-
Comment 26•11 years ago
Hey catlee,
thanks for the review. This new patch adds better exception handling for the _query_vgs() and os.rename() calls, moves '/etc/fstab' into a constant, and updates the log messages.
Attachment #8402579 - Attachment is obsolete: true
Attachment #8402691 - Flags: review?
Reporter
Comment 27•11 years ago
Comment on attachment 8402691 [details] [diff] [review]
[puppet] Bug 977611 - unmount already mounted vg and swapfile.patch
Review of attachment 8402691 [details] [diff] [review]:
-----------------------------------------------------------------
::: modules/aws/files/manage_instance_storage.py
@@ +243,5 @@
> + temp_fstab = tempfile.NamedTemporaryFile(delete=False)
> + with open(temp_fstab.name, 'w') as out_fstab:
> + for line in read_fstab():
> + if old_fstab_line not in line:
> + out_fstab.write(line)
put this whole 'with' block into the 'try' block.
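Putting the two fstab review comments together (write to a temporary file, then rename; and clean the temp file up if anything fails, with the whole 'with' block inside the try), the final shape might look like this sketch. The function name is ours; creating the temp file in the same directory as the target keeps os.rename() an atomic same-filesystem operation.

```python
import os
import tempfile


def rewrite_file(path, keep_line):
    """Atomically rewrite `path`, keeping only lines for which
    keep_line(line) is true. The temp file is removed on any failure."""
    with open(path) as src:
        lines = src.readlines()
    # temp file in the same directory so the final rename stays atomic
    fd, temp_name = tempfile.mkstemp(dir=os.path.dirname(path) or '.')
    try:
        with os.fdopen(fd, 'w') as out:
            out.writelines(line for line in lines if keep_line(line))
        os.rename(temp_name, path)
    except Exception:
        # don't leave stray temp files behind on unhandled errors
        os.unlink(temp_name)
        raise
```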
Attachment #8402691 - Flags: review? → review+
Updated•11 years ago
Attachment #8402691 - Flags: checked-in+
Updated•11 years ago
Attachment #8400035 - Flags: review?(rail)
Updated•11 years ago
Attachment #8400035 - Flags: review?(rail) → review+
Updated•11 years ago
Attachment #8400035 - Flags: checked-in+
Comment 28•11 years ago
Hi Chris,
changes in this patch:
* if the instance storage has enough space, mount it on /builds/slave - "enough space" is currently set to 20 GB; how big should it be? 100? more?
* updated pvcreate() so it handles the case where the instance type changes from one to multiple ephemeral devices (e.g. m3.large -> c3.xlarge)
* updated the log level to INFO and cleaned up log messages
* some refactoring
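The multi-device part of the pvcreate() change boils down to a decision like the following sketch (the function name is ours, not from the patch): given the ephemeral devices the new instance type exposes and the devices that are already LVM physical volumes, only the new ones need pvcreate and a vgextend.

```python
def devices_needing_pvcreate(ephemeral_devices, existing_pvs):
    """Return the ephemeral devices that are not yet LVM physical volumes,
    e.g. after moving an AMI from m3.large (1 device) to c3.xlarge (2)."""
    existing = set(existing_pvs)
    return [dev for dev in ephemeral_devices if dev not in existing]
```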
Attachment #8404152 - Flags: feedback?(catlee)
Reporter
Comment 29•11 years ago
Comment on attachment 8404152 [details] [diff] [review]
[puppet] Bug 977611 - use instance storage if it has enough space.patch
Review of attachment 8404152 [details] [diff] [review]:
-----------------------------------------------------------------
::: modules/aws/files/manage_instance_storage.py
@@ +11,4 @@
>
>
> log = logging.getLogger(__name__)
> +log.setLevel(logging.INFO)
what's the purpose of this? We're already setting the log level in main()
@@ +17,5 @@
> DEFAULT_MOUNT_POINT = '/mnt/instance_storage'
> JACUZZI_MOUNT_POINT = '/builds/slave'
> JACUZZI_METADATA_FILE = '/etc/jacuzzi_metadata.json'
> ETC_FSTAB = '/etc/fstab'
> +REQ_BUILDS_SIZE = 20 # size in GB
I think 120 GB is a good starting point. 20 is definitely too small.
@@ +358,5 @@
> + # test if device has enough space, if so mount the disk
> + # in JACUZZI_MOUNT_POINT regardless the type of machine
> + # assumption here: there's only one volume group
> + if vg_size() >= REQ_BUILDS_SIZE:
> + log.info('disk size: >= REQ_BUILDS_SIZE (%d GB)', REQ_BUILDS_SIZE)
would be nice to log what the calculated disk size is here. e.g.
device_size = vg_size()
if device_size >= REQ_BUILDS_SIZE:
log.info('disk size: %d GB >= REQ_BUILDS_SIZE (%d GB)', device_size, REQ_BUILDS_SIZE)
Attachment #8404152 - Flags: feedback?(catlee) → feedback+
Comment 30•11 years ago
Hi Chris,
in this patch:
* REQ_BUILDS_SIZE = 120
* removed get_swap_file() method - not used
* updated docstrings and comments
Attachment #8404152 - Attachment is obsolete: true
Attachment #8405295 - Flags: review?(catlee)
Reporter
Updated•11 years ago
Attachment #8405295 - Flags: review?(catlee) → review+
Updated•11 years ago
Attachment #8405295 - Flags: checked-in+
Updated•11 years ago
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Assignee
Updated•7 years ago
Component: General Automation → General