Pre-build AWS ami with Packer to minimise EC2 bootstrapping time #260
base: master
…packer. This reduces the number of bootstrap steps in cloud-init and saves time. Also added the instructions and config/scripts on how to build an AWS AMI with packer.
I finally looked into this (sorry!). I managed to create the AMI just fine with the provided documentation, and recorded the following runtimes with and without the packer AMI running
Though
If we have an automated builder that provides public AMIs of the latest
From what I understand, if we have a public AMI we also have to pay for storage costs. For whatever reason I can't see the private AMI I created while testing this from the EC2 Dashboard (it lists no private AMI, search gives no result for public AMI with the ID either). S3 also has transfer prices, does that mean we pay when someone uses our image, not just storage? S3 prices don't seem terribly high, but I find it hard to make an estimation of the running costs. Looking at the prices I would find it hard to imagine rising above a (few) dollar(s) a month. Am I missing anything, and/or do you have a price estimate?
I'll have a closer look tomorrow, but at first glance there looks to be nothing substantial left to move from
Again, apologies for not getting to this sooner, but we really do appreciate the effort!
I've looked into the costs of an AWS AMI. When you create an AMI, an EBS snapshot is taken and stored for you. There are costs for the initial snapshot (0.05
Regarding the costs of public AMIs: You can find your created AMIs in the EC2 dashboard. AMIs are region-bound, so you have to set the AWS console to the correct region (i.e. eu-central-1/Frankfurt). That's where I can see the private automl AMIs. Perhaps your console is set to a different region, which would explain why you can't see the AMIs. Public AMIs are also region-bound, so they are only available to others in the regions where you decide to create/store/publish the images.
@sebhrusen while we discussed some additional AMIs and restructuring the configuration file, I think we can approve this PR? We can refactor it later, but the PR is a self-contained first step: functional and well documented (at least good enough for me to understand :] ).
@surf-rbood thanks for the help! The additional information on costs is also very useful. Based on that I think it's reasonable for us to build the AMIs for common regions.
@surf-rbood thanks for your contribution, @PGijsbers has been using it and it looks very useful.
@PGijsbers I agree that the PR looks mostly good (self-contained + documented as you mentioned).
I'd have 2 requirements to get this merged or to be fixed very soon after the merge:
- address my 2 comments with a thumbs up.
- commit the aws_ami/config/ami-automl.pkr.hcl file as "editable" (add it to .gitignore, then force add), as it looks like this file needs to be edited by the end user most of the time. Ideally, the user should be able to keep the file in their custom ~/.config/automlbenchmark folder, but I don't see this supported right now, and therefore we should allow users to update the repo in spite of changes made to this local file.
"BRANCH=stable",
"GITREPO=https://github.com/openml/automlbenchmark",
"PYV=3"
can those be turned into variables?
use_packer_ami: false # if true, the EC2 instance will be started with the AMI ID of the pre build packer AMI.
# Note, make sure to enter the AMI ID of your packer build image in the packer_ami field (i.e. ec2.regions.[region].packer_ami).
# For more information, see the aws_ami directory.
I'd rather move this config under the aws.ec2 namespace
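To make the suggested layout concrete, here is a minimal sketch of how a consumer might resolve the AMI once the flag lives under aws.ec2. It is an illustration under assumptions, not the project's implementation: it assumes `use_packer_ami` sits next to `regions` under `aws.ec2`, and that each region entry has an `ami` key alongside the `packer_ami` key named in the diff comment. The function name `resolve_ami`, the `ami` key, and the example IDs are hypothetical.

```python
def resolve_ami(config: dict, region: str) -> str:
    """Return the AMI ID to launch in `region`.

    Assumed (hypothetical) layout: config["aws"]["ec2"] holds `use_packer_ami`
    and a `regions` mapping with per-region `ami` and `packer_ami` entries.
    """
    ec2 = config["aws"]["ec2"]
    try:
        region_cfg = ec2["regions"][region]
    except KeyError:
        raise ValueError(f"no AMI configured for region {region!r}") from None

    if ec2.get("use_packer_ami", False):
        # Fail loudly instead of silently falling back to the base image.
        packer_ami = region_cfg.get("packer_ami")
        if not packer_ami:
            raise ValueError(
                f"use_packer_ami is true but no packer_ami is set for {region!r}"
            )
        return packer_ami
    return region_cfg["ami"]


# Placeholder IDs for illustration only; they are not real AMIs.
example_config = {
    "aws": {
        "ec2": {
            "use_packer_ami": True,
            "regions": {
                "eu-central-1": {"ami": "ami-base", "packer_ami": "ami-prebuilt"}
            },
        }
    }
}
print(resolve_ami(example_config, "eu-central-1"))
```

Raising when the flag is set but no `packer_ami` is configured is a design choice of this sketch: a silent fallback to the base image would hide the misconfiguration and only show up as a slower bootstrap.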
variable "source_ami" {
type = string
description = "Ubuntu Server 18.04 LTS (HVM), EBS General Purpose (SSD) VolumeType"
default = "ami-0bdf93799014acdc4"
this defaults to an AMI available only in eu-central-1: it would be nice to have a way to automatically default to the AMI associated with the selected region in config.yaml.
Until then, I suggest:
default = "ami-0bdf93799014acdc4"
default = "<use one ami defined in config.yaml, namespace aws.ec2.regions>"
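Until the default can be derived automatically, one possible workaround is to look up the base AMI for the chosen region and override `source_ami` on the Packer command line. The sketch below is only an illustration under assumptions: it takes a plain mapping shaped like the `aws.ec2.regions` namespace of config.yaml, assumes each region entry has an `ami` key, and builds a `packer build -var` invocation. The helper name `packer_build_command` is hypothetical and not part of this PR.

```python
import shlex
from typing import Dict, List


def packer_build_command(
    regions: Dict[str, dict], region: str, template: str
) -> List[str]:
    """Build a `packer build` command that overrides `source_ami` per region.

    `regions` is assumed to mirror aws.ec2.regions from config.yaml, with a
    hypothetical `ami` key holding the base image ID for each region.
    """
    try:
        source_ami = regions[region]["ami"]
    except KeyError:
        raise ValueError(f"no base AMI configured for region {region!r}") from None
    # `-var name=value` overrides the default of a variable declared in the template.
    return ["packer", "build", "-var", f"source_ami={source_ami}", template]


# The AMI ID below is the eu-central-1 default quoted in the review above.
regions = {"eu-central-1": {"ami": "ami-0bdf93799014acdc4"}}
cmd = packer_build_command(
    regions, "eu-central-1", "aws_ami/config/ami-automl.pkr.hcl"
)
print(shlex.join(cmd))
```

Returning the command as a list rather than a single string keeps it safe to hand to `subprocess.run` without shell quoting concerns; `shlex.join` is used only to display it.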
Reference Issues/PRs
None
What does this implement/fix?
Added the option to run the automl framework with a custom AMI to minimise EC2 instance bootstrapping time.
Also documented how to build this AWS AMI with Packer and added the required config/scripts for the AMI build procedure.
Other comments
Things to consider/think about: