Python 3 base dockerfile 1.3.0

@themarcelor themarcelor released this 12 Aug 20:02
· 1009 commits to master since this release
ee330a9

uc-cdis/cloud-automation

New Features

  • RDS cluster autoscaling can now be enabled in terraform by just setting the
    variables to their desired value (#1362)
  • Added default encryption to rds databases. Also changed default size to
    t2.small because encryption is not available for t2.micro instances. (#1346)
  • make mariner available for dev and qa testing (#1352)
  • Move data replicate jobs from
    https://github.com/uc-cdis/dcf-datareplicate/jobs to cloud-automation
    (#1335)
  • Ignore changes if the data-upload-bucket has cors_rules (#1331)
  • Added option to assume role and refactored code (#1327)
  • Add auspice service's .yaml file (#1319)
  • For CSOC-attached commons, logs will now be sent to logDNA (#1324)
  • Added other run option to allow for Jenkins to get output file with
    information about the run (#1317)
  • Added cookbook to manage adminvm (#1288)
  • Added wildcard *.chef.io to squid whitelist (#1295)
  • Added bucket replication job that uses aws batch (#1294)
  • Whitelist *.census.org (#1280)
  • Added netpolicy rule for sowerjobs to reach revproxy and utilize internal
    routing (#1266)
  • aws batch job for bucket manifest generating tool (#1219)
  • COVID19 ETL jobs: add "S3_BUCKET" optional configuration variable + handle
    underscores in job names (#1252)
  • .adfs.federation.va.gov whitelisted (#1248)
  • cognito integration for SAML authentication. (#1247)
  • added mran.microsoft.com to the whitelist (#1241)
  • Selenium Hub (#1232)
  • kube-setup-seleniumhub script is TBD. (#1232)
  • Azure terraform modules. (#1226)
  • Added job (#1217)
  • Added option to replicate from different source account than adminvm (#1217)
  • new kube-setup-sower-jobs command that sets up S3 bucket, service account,
    and fine-grained IAM controls for sower jobs (#1224)
  • Added uwsgi timeout optional param to extend read-timeout for fence (#1120)
  • AdminVM module off utility VM (utility_admin) (#1208)
  • Remove old & unused jobs for covid19 etl (#1207)
  • Improve running new jobs for covid19 etl: now they will have unique names
    (#1207)
  • gen3 util for creating aws lambda function (#1189)
  • gen3 awslambda create funcname description role_arn (#1189)
  • New Ansible playbook that adds a cronjob for the commons user to check
    terraform resources on a daily basis and alert if there are changes
    outside the template. It also alerts if there are uncommitted changes in
    the local cloud-automation repo. (#1194)
  • Created bucket replicate script (#1186)
  • You can now choose the version you want the ElasticSearch cluster to be
    deployed on. (#1183)
  • Notebook ETL job (#1178)
  • Doc update (#1181)
  • Remove PR template, cloud-automation will use the organization one (#1179)
  • Migrated non-sensitive, externally helpful docs from cdis-wiki (#1154)
  • Added www.dph.illinois.gov to Squid whitelist (#1166)
  • Add new kubernetes job, the data-ingestion-job, which is specific to
    DataSTAGE. (#1012)
  • ETL job for Illinois Department for Public Health data (#1162)
  • Ability to deploy k8s workers on a /22 subnet, allowing more workers and
    pods in the cluster. (#1152)
  • Add COVID-19 ETL job (#1150)
  • Added keys for new bdcat cluster to squid (#1140)
  • get hostname to indexd for DRS field self_uri (#1133)
  • Added script to update ebs volumes (#1130)
  • Run WTS DB migration during "kube-setup-wts" (#1128)
  • Add empty "external_oidc" field to WTS configuration file (#1128)
  • gen3 squid info to get information about the HA-proxy instances (#1137)
  • gen3 workers-cycle to cycle a node or all nodes (#1126)
  • Switch proxy: let the standby instance become the active one, or, if the
    cluster has more than two instances, pick a single one (different from
    the current instance) as active. (#1125)
  • RDS module now creates an Option Group by default that you assign to the
    instance for backing up against s3 (#1119)
  • gen3 secrets rotate postgres indexd|sheepdog|fence (#1114)
  • kube-dev-namespace sets up new db users for indexd, sheepdog, and fence
    db's (#1114)
  • added fence ssh keys from internalanvil to squid (#1115)
  • Setup sower job for indexd_utils (#1066)
  • AWS inspec implementation for the security team. (#1112)
  • added qa-dcf key to squid (#1109)
  • metadata service automation (#1087)
  • Remediate CIS issues with Amazon Linux workers (#1094)
  • Single squid instance type is a variable. (#1092)
  • HA squid (#1046)
  • add OWASP rules to default modsecurity configuration (#1082)
  • ability to run gen3 commands remotely using adminVMs as proxy (#1072)
  • EX: (#1072)
  • ssh cdistest.csoc -C "~/cloud-automation/files/script/remote-gen3.sh
    kube-setup-revproxy" (#1072)
  • ansible a-hosts -m shell -a "cloud-automation/files/script/remote-gen3.sh
    kube-setup-revproxy" (#1072)
  • implement gen3 cmd for creating gs bucket for data refresh (#1060)
  • Networkpolicy fixes from VA: Kubernetes YAML syntax fix (#1049)
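
The proxy switch described in #1125 above can be pictured with a small
sketch. This is not the repository's implementation: the function name
pick_new_active is hypothetical, and choosing the first remaining instance
is an assumption, since the notes only say the new active instance must
differ from the current one.

```python
from typing import Sequence


def pick_new_active(instances: Sequence[str], current: str) -> str:
    """Return an instance other than `current` to promote to active.

    Hypothetical helper for illustration only.
    """
    # Every instance except the one that is active right now is a candidate.
    candidates = [name for name in instances if name != current]
    if not candidates:
        raise ValueError("no standby instance available to promote")
    # Assumption: take the first candidate. The actual selection policy
    # for clusters with more than two instances is not stated in the notes.
    return candidates[0]
```

With two instances this always returns the standby; with more, it returns a
single instance that is guaranteed not to be the current one.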

Dependency Updates

Deployment Changes

Breaking Changes

  • The codecept.conf.js script in gen3-qa needs to be adjusted
    accordingly. (#1232)

Improvements

  • enable sa-iam-role stuff by default in terraform eks (#1360)
  • bump k8s deployments to apps/v1 apiVersion (#1360)
  • basic workspace-parent account landing page, and couple hooks to bypass
    portal when portal_app set to GEN3-WORKSPACE-PARENT (#1360)
  • gen3 logs history byuser command to get list of top 100 users over some
    time range (#1360)
  • revproxy tweak handling of Strict-Transport-Security header - there's a
    covid19 security scan jira someplace (#1360)
  • apply cloud-automation filter for image tag for mariner deployment (#1354)
  • add mariner to revproxy (#1352)
  • update mariner deployment with awsusercreds secret and jwks endpoint for
    fence (#1352)
  • delete old unused mariner config json (#1352)
  • Fluentd 1.10.2 support for logging. (#1337)
  • log streams are now named after the pod they are running on. (#1337)
  • kubernetes worker logs (auth, message, docker) are now sent over to
    CloudWatchLogGroups (#1337)
  • kube-system namespace logs are now skipped, not sent to CloudWatchLogGroups
    (#1337)
  • Added descriptors to terraform files to allow infrastructure to be easily
    identified/cleaned up. Also added clean all to cleanup all tagged
    infrastructure. (#1341)
  • Setup ssjdispatcher and its jobs with IAM-linked service accounts. (#1339)
  • Auto-create ssjdispatcher upload bucket, sns, and sqs. (#1339)
  • Simplify g3kubectl to work with new aws-iam-authenticator - things like
    watch kubectl get pods or echo bla | xargs kubectl will work now (#1339)
  • add visitdata.org to squid whitelist to allow fetching additional mobility
    data for covid-19 bayes model (#1334)
  • add gstatic.com to squid_wildcard_whitelist to allow fetch google mobility
    data (#1332)
  • increase resources allocated to fenceshib deployment (#1328)
  • switch from old heptio-authenticator-aws to recent
    aws-iam-authenticator (#1325)
  • configure aws ntp on ec2 admin vm's:
    https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/set-time.html (#1322)
  • documentation update (#1318)
  • ansible playbook update (#1318)
  • Only remove the terraform dir if gen3_terraform destroy is success (#1320)
  • Better job status check (#1320)
  • typo correction (#1316)
  • new gen3 es garbage helper (#1310)
  • new gen3 job cron helper (#1310)
  • fix for gen3 jupyter idle (#1310)
  • add backoffLimit to some batch jobs (#1310)
  • increase memory allocation for nb-etl job so it doesn't get killed OOM by
    k8s (#1298)
  • run bayes model for IL-by-county twice as fast (#1293)
  • gen3 roll all --fast (#1291)
  • gen3 roll all calls out to gen3 dashboard gitops-sync (#1291)
  • gen3 awsrole extended to support sa-linked roles (#1291)
  • adminvm terraform to use ubuntu18 ami (#1291)
  • gen3 api hostname, environment, namespace, and safe-name helpers
    (#1291)
  • gitops-sync jobs extended with sa linked to an iam role - gen3 dashboard gitops-sync should work (#1291)
  • add sts vpc endpoint to eks terraform (#1291)
  • Runs notebook etl in the morning (10am Chicago time, server time is in
    UTC), instead of running it every 3 hours (#1292)
  • Increase Guppy wait time for ES proxy to 250s (50 cycles) (#1290)
  • schedule nb-etl pods to jupyter nodes so as to not compete with gen3
    services for cluster resources (#1287)
  • COVID19 "nb-etl": do not overwrite "command" and "args" from Dockerfile
    (#1284)
  • Create temp bucket for storing the output manifest (#1273)
  • Support providing authz for merging (#1273)
  • add gen3 shutdown namespace helper and cron jobs (#1272)
  • add gen3 api hostname helper (#1272)
  • patch kube-setup-aws-es-proxy to not rely on $vpc_name (#1272)
  • patch gen3 ebs snapshot to copy volume tags to snapshot (#1272)
  • add kube-setup-metadata to gen3 roll all (#1270)
  • Add some helpers: (#1269)
  • gen3 api sower-run commandFile.json $apiKey (#1269)
  • gen3 api sower-template pfb (#1269)
  • gen3 prometheus query $query $apiKey (#1269)
  • gen3 prometheus list $apiKey (#1269)
  • gen3 prometheus curl $urlBase $apiKey (#1269)
  • introduce gen3 ecr helpers for interacting with ECR docker image
    repositories. (#1265)
  • Add rstudio.com to whitelist (#1264)
  • Added .dph.illinois.gov to Squid whitelist, removed www.dph.illinois.gov
    (#1259)
  • COVID19 ETL: add ETL name to slack notification (#1256)
  • klock handles strange lock owners better (#1255)
  • setup batch jobs to reap idle hatchery app pods (12 hours of no traffic)
    (#1254)
  • Update IAM SA doc: SAs can be created using a policy ARN, not a policy name
    (#1252)
  • awshelper image updates - ubuntu18, aws-cli2 (#1246)
  • deploy awshelper in Jenkins (#1246)
  • ambassador expose metrics to prometheus (#1246)
  • jupyter idle helper (#1246)
  • logs in CWL now use the date as prefix (#1245)
  • remove set -i from batch jobs - deprecated in bash 4.4 (#1244)
  • update awshelper to package cloud-automation/ code package (#1244)
  • introduce GEN3_AWSHELPER_IMAGE variable to simplify testing (#1244)
  • trim bucket name down to 50 chars so there's room for prefixes down the
    line (#1239)
  • Support for the latest version of fluentd. (#1238)
  • The new fluentd configuration also pushes worker node messages and
    secure logs. (#1238)
  • Tagging the logs in a simpler way for better visibility and manipulation
    through cloudwatch. (#1238)
  • Moving fluentd to its own namespace to clear out the kube-system namespace
    (#1238)
  • This should address the flakiness around Selenium/Webdriver operations:
    (#1232)
  • To get rid of errors such as (#1232)
  • chrome not reachable (Session info: headless chrome=70.0.3538.77) (Driver
    info: chromedriver=2.43.600233 (#1232)

  • gen3 api helper works with api keys (#1236)
  • kube-setup-sftp saves crypto keys to secrets folder (#1236)
  • kube-setup-hatchery harvests configmaps (#1236)
  • refactored code (#1217)
  • update to new lts ubuntu version in awshelper dockerfile and other cleanup
    (#1224)
  • Run the squid auto module on its own if required at any point in time.
    (#1221)
  • Qualys VM module updated and simplified. (#1222)
  • Hardcoded regions removed. (#1222)
  • VPN server can now be deployed on ubuntu18. (#1215)
  • Removed hardcoded references to planx. (#1215)
  • Module can now be deployed on any account besides CSOC. (#1215)
  • propagate exit code in usersync and etl jobs (#1205)
  • roll-all non-zero exit code if cluster not healthy at end (#1198)
  • reset non-zero exit code if roll-all fails (#1198)
  • jenkins cronjob cleanup old builds (#1198)
  • gen3 s3 retry tfplan/tfapply if necessary (#1198)
  • add squid wildcard test (#1198)
  • klock truncate owner to 45 characters (#1198)
  • slack message for covid19-etl-*-job (#1199)
  • documentation (#1196)
  • Ansible hosts updated (#1197)
  • Ansible hosts files, added new hosts, and switched to hostname instead of
    IPs for a bunch of them. (#1194)
  • Lambda function documentation. (#1191)
  • install terraform12 (#1184)
  • add gen3 es create index-name mapping.json (#1184)
  • add gen3 secrets gcp ... for service key rotation (#1184)
  • add gcp.md with instructions for GCP integration (#1184)
  • update npm dependencies (npm audit fix) (#1184)
  • patch gen3 s3 ... and gen3 aws* to not call gen3 trash - leave
    terraform workspace in place (#1184)
  • remove deprecated flag in gen3 devterm (#1184)
  • Commons documentation updated. A few typos fixed. (#1187)
  • Documentation (#1183)
  • support gen3 workon profile whatever__data_bucket_queue (#1177)
  • kubectl node drain commands now pass the --force flag so that draining
    also deletes "Pods not managed by ReplicationController, ReplicaSet, Job,
    DaemonSet or StatefulSet", like hatchery pods. (#1175)
  • gen3 bootstrap subcommand (#1173)
  • introduce gen3 logs cloudwatch ... subcommands (#1160)
  • Restructure the ETL for COVID-19 Data Commons (#1168)
  • by using aws autoscaling terminate-instance-in-auto-scaling-group instead
    of aws ec2 stop-instances, we let the ASG know that we want to keep the
    desired state, so it tries to spin up a new instance right away. (#1169)
  • aws-es-proxy new release ready (#1167)
  • Improve the naming for jobs for COVID-19 Data Commons (#1163)
  • Update the default Gen3 logo to new version (#1161)
  • Network documentation. (#1158)
  • Fix deployment step for COVID-19 adminVM cronjob (#1156)
  • Commons module documentation. (#1152)
  • Change comment (#1153)
  • enable encryption in all storage classes (#1149)
  • disable TLSv1 in old docker images (#1148)
  • service_releases cookie http-only, secure (#1148)
  • Documentation (#1143)
  • expand hatchery config for dockstore apps (#1142)
  • allow workspaces to communicate directly with manifest service and revproxy
    (#1142)
  • first cut at path-list to manifest dashboard app - ex:
    https://reuben.planx-pla.net/dashboard/Public/paths-to-manifest/index.html
    (#1135)
  • first cut at links-page dashboard app - ex:
    https://reuben.planx-pla.net/dashboard/Public/marcello20191126/data/index.html
    (#1135)
  • roll dashboard in roll-all (#1131)
  • delete all pods in reset (#1131)
  • revproxy modsec WAF rules pruned - can pass basic QA with enforcement
    enabled, but enforcement not yet enabled (#1123)
  • revproxy /_status endpoint and http liveness probe (#1123)
  • revproxy nginx client_buffer_size to 16k, error log to warn level, set
    fallback values on some log variables (#1123)
  • some basic waf documentation (#1123)
  • files/scripts/braincommons - for generating various dream challenge access
    reports (#1118)
  • allow access to port 3128 from the peering connection (#1113)
  • update k8s scheduling resource requests and limits (#1099)
  • Removed a few hardcoded tags (#1098)
  • kube-setup-fence runs migrate job, so we can disable db migration in
    fence-config-public.yaml in all our commons if we want to ... (#1097)
  • disable migration in presigned-url deployment (#1097)
  • The instance produced by this module now has deploy_before_destroy
    enabled. (#1092)
  • gen3OnK8s.md docs - more details on setting up gen3 on a k8s cluster (#1089)
  • kube-dev-namespace - setup secrets in Gen3Secrets/ (#1089)
  • Update Jenkins to 2.219 with latest changes and
    security updates (#1083)
  • Addressing PXP-5001 (#1074)
  • kube-setup-autoscaler does not have version fully hardcoded. Versions can
    now be passed along with script. (#1069)
  • If no version is provided, it'll default to the ones preloaded. (#1069)
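
The Guppy wait change in #1290 above (250s over 50 cycles) works out to one
readiness check every 5 seconds, assuming the cycles are evenly spaced. A
minimal sketch of such a bounded polling loop follows. The name
wait_until_ready and the injectable sleep parameter are hypothetical and
only illustrate the arithmetic; the actual wait logic in Guppy may differ.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    cycles: int = 50,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` up to `cycles` times, sleeping `interval` seconds after
    each failed attempt. Worst case wait: cycles * interval = 250 seconds
    with the defaults.
    """
    for _ in range(cycles):
        if check():
            return True
        # Not ready yet: wait one interval before the next attempt.
        sleep(interval)
    # The service never became ready within the allotted cycles.
    return False
```

Passing a fake sleep function makes the loop easy to exercise in tests
without actually waiting.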

Bug Fixes

  • Added wildcard .southsideweekly.com to allow grabbing the CHI-NBHD
    dataset for PRC (#1347)
  • Fixed terraform and updated bucket-manifest script (#1343)
  • Use the right directory (#1340)
  • Fix the container image names (#1336)
  • fix the job definition name (#1326)
  • Removed verify bucket access job from kube-setup-google because the job is
    buggy and removes access it shouldn't (#1309)
  • Replacing kubectl with g3kubectl (#1289)
  • up resource limits on nb-etl job to prevent getting killed by k8s (#1287)
  • fix handling of squid IP lookup for external network policy on admin vm
    (#1278)
  • use empty string instead of null (#1276)
  • Added the anvil to squid whitelist (#1274)
  • Truncates iam role, due to aws role character limit of 64 (#1267)
  • Fix typos (#1260)
  • use correct secret name for job config (#1243)
  • Combine type logs wouldn't make it to CWL because of a typo. This has been
    corrected. (#1245)
  • policy has correct resources based on new bucket naming convention (#1239)
  • When fluentd was set to use the latest version, the configmap created off
    the configuration file would have the wrong name which ultimately resulted
    in crashed pods. (#1240)
  • The file name is created temporarily prior to the application with the
    right name to avoid this issue. (#1240)
  • Fix for issue with policy and role names longer than 64 characters when
    running the s3-bucket module. (#1228)
  • Lambda function for logging parsing would fail if a variable has no value.
    To fix this, we have set a default value to avoid having to set that up
    manually on every environment. (#1235)
  • don't error when aws role names are above 64 char limit (slice to limit)
    (#1234)
  • won't error for envs with long host names (b/c of s3 bucket name char
    limit) (#1233)
  • Revert awshelper image to unbreak jobs (#1227)
  • handle errors better in iam-serviceaccounts (#1224)
  • Nat gateway data resource lookup sometimes fails if it finds another
    resource with the same search criteria. For example, if you made a
    mistake applying a plan and have to rebuild, the gateway might get
    deleted but not removed. Adding an additional filter solves this
    problem. (#1209)
  • presigned-url deployment works with fence 2.7 (#1205)
  • Report tool would fail if kubeconfig file is located at a different folder
    than ${HOME}/${vpc_name}/kubeconfig. It is now loaded from the location
    returned by gen3. (#1204)
  • monitor-pod would sometimes hang when launching the monitor pod. (#1195)
  • The lambda function module lacked additional outputs. It'll now output
    whichever option you choose, with vpc and without vpc. If either is not
    deployed it'll simply not output it. (#1191)
  • You can now tell the ES module whether or not to deploy a linked role.
    Deploying it is problematic when there is more than one commons using ES
    in a single account. (#1183)
  • The rate limit config had been accidentally disabled when the command
    override was introduced to this deployment (which suppresses the parent
    image's entrypoint.sh execution). Adding an extra call to this script to
    make sure the NGINX_RATE_LIMIT check is executed. (#1182)
  • Using gen3 iam-serviceaccount results in null in service-account
    annotation. This is a fix for the issue. (#1180)
  • gen3 iam-serviceaccount produces wrong link (contains https part) in
    Federated part of trust policy. Fixed now. (#1180)
  • Name of the environmental variable (#1171)
  • Docker image name (#1171)
  • Fix the name of COVID-19 job (#1170)
  • When creating databases, the script uses the current user name. When the
    user has hyphens in the name, like iam-anuser, postgres doesn't like it and
    the script bails. (#1164)
  • This patch replaces hyphens with underscores before creating databases
    when running db_restore. (#1164)
  • google cron jobs check webhook validity (#1155)
  • When listing hosted zones, if one had no comments, the scripts would fail
    without completing any change on the hosted zone for cloud-proxy. (#1151)
  • handling of external requests to port 80 by the reverse proxy and network
    policy (#1148)
  • eks workers init script had a duplicated kubectl apply -f on the same
    line, breaking the application of the file that deploys calico. (#1147)
  • fix regression creating peregrine postgres user in new commons (#1138)
  • vpc module tag for the actual vpc changed some time ago, causing major
    issues when deploying new commons; bringing it back to how it was
    before. (#1132)
  • Hardcoded commons name removed (#1129)
  • Switched the domain used to check for internet access from the VPC where
    commons' kubernetes workers live. The previous default one returned 302
    (redirection) codes, forcing the check to switch the active instances
    every minute. www.google.com returns 200, which is acceptable as a
    working internet connection. (#1124)
  • leftover option that became invalid when applying (#1122)
  • duplicated function name in lambda for check ups against the proxies.
    (#1113)
  • Modified fence db migrate job to disable/wait for usersync to finish (#1107)
  • disable db-migrate job in kube-setup-fence (#1110)
  • db restore create restore db as server owner and grant perms to service
    user (#1108)
  • kube-dev-namespace secrets folder fix (#1105)
  • squid auto bootstrap had a reference for a branch. (#1100)
  • Fixed an issue that was causing the DREAM access report to have empty
    fields (#1076)
  • Add list of namespaces to Jenkins (#1068)
  • Enable Veracode domains through squid proxy (#1065)
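
Several fixes above come down to simple name sanitizing: slicing IAM role
names to the 64-character AWS limit (#1267, #1234) and replacing hyphens
with underscores in user names before creating Postgres databases (#1164).
A minimal sketch of both, assuming plain string handling is enough. The
function names are hypothetical and this is not the code from the
cloud-automation scripts.

```python
AWS_ROLE_NAME_MAX = 64  # AWS role name character limit cited in the notes


def truncate_role_name(name: str, limit: int = AWS_ROLE_NAME_MAX) -> str:
    """Slice a role name down to the limit instead of raising an error."""
    return name[:limit]


def postgres_safe_name(user: str) -> str:
    """Replace hyphens with underscores, e.g. for a user like iam-anuser."""
    return user.replace("-", "_")


if __name__ == "__main__":
    print(len(truncate_role_name("x" * 80)))
    print(postgres_safe_name("iam-anuser"))
```

Note that slicing can make two long names collide if they share the same
first 64 characters, so callers that generate many similar names may need a
distinguishing prefix.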