AWS Connection Implementation #23282

Merged: 8 commits into MaterializeInc:main from the aws-connections branch on Dec 14, 2023

Conversation

@moulimukherjee moulimukherjee commented Nov 17, 2023

Motivation

Implementation for https://github.com/MaterializeInc/database-issues/issues/6945
Depends upon https://github.com/MaterializeInc/cloud/pull/8224

Tips for reviewer

Comments inline.

Split out a separate PR, #23626, that just accepts the new environmentd CLI arg, in case the current PR is delayed. Will rebase once that's merged.

This PR makes backwards-incompatible changes to the CREATE CONNECTION SQL for AWS connections. The syntax is behind a feature flag and nobody is using it yet, so this should be fine.

Changes are still required on the cloud side to pass the new AWS role ARN to environmentd (https://github.com/MaterializeInc/cloud/pull/8224). I am trying to get the Materialize changes in before I go on vacation next week.

I will create a separate docs PR when this goes to private preview.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:
    • No user-facing changes yet. This feature is behind a feature flag, and nobody is using it.

cc @benesch @jubrad @alex-hunt-materialize @hlburak

@moulimukherjee moulimukherjee force-pushed the aws-connections branch 2 times, most recently from 0226f3f to 2121c83 Compare November 22, 2023 00:46
@moulimukherjee moulimukherjee force-pushed the aws-connections branch 3 times, most recently from 9b4f8fa to 8734297 Compare November 30, 2023 16:45
@moulimukherjee moulimukherjee changed the title WIP: AWS Connection AWS Connection Implementation Dec 1, 2023
Comment on lines 562 to 564
#[clap(long, env = "AWS_EXTERNAL_CONNECTION_ROLE")]
aws_external_connection_role: Option<String>,

@moulimukherjee moulimukherjee Dec 1, 2023

@jubrad Using AWS_EXTERNAL_CONNECTION_ROLE to pass the new role to assume. Lmk if you would prefer a different name.
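
For readers unfamiliar with clap's `env` attribute: the flag value is used when it is present, otherwise the named environment variable is read. A minimal std-only sketch of that precedence follows. `resolve_connection_role` and the injected `env_lookup` closure are hypothetical names used only for illustration; they are not code from this PR, and the variable name is the one used at this point in the review.

```rust
/// Environment variable name taken from the diff above.
const ROLE_ENV_VAR: &str = "AWS_EXTERNAL_CONNECTION_ROLE";

/// Hypothetical helper (not part of this PR): prefer the value given on the
/// command line and fall back to the environment variable otherwise.
/// The environment lookup is injected so the precedence can be tested
/// without mutating the real process environment.
fn resolve_connection_role(
    cli_value: Option<String>,
    env_lookup: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    cli_value.or_else(|| env_lookup(ROLE_ENV_VAR))
}
```

In a real binary the lookup could be `|key| std::env::var(key).ok()`. If neither source provides a value the result is `None`, matching the `Option<String>` field in the diff.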

@moulimukherjee moulimukherjee marked this pull request as ready for review December 1, 2023 19:15
@moulimukherjee moulimukherjee requested a review from a team December 1, 2023 19:15
@moulimukherjee moulimukherjee requested a review from a team as a code owner December 1, 2023 19:15
@moulimukherjee moulimukherjee requested review from a team and maddyblue December 1, 2023 19:15

shepherdlybot bot commented Dec 1, 2023

Risk Score: 74 / 100, Bug Hotspots: 4, Resilience Coverage: 100%

Mitigations

Completing required mitigations increases Resilience Coverage.

  • (Required) Code Review 🔍 Detected
  • Feature Flag
  • Integration Test 🔍 Detected
  • Observability
  • QA Review 🔍 Detected
  • Unit Test
Bug Hotspots:

File                                    Percentile
../src/builtin.rs                       96
../catalog/builtin_table_updates.rs     97
../src/parser.rs                        97
../src/clusters.rs                      91

@moulimukherjee moulimukherjee added the T-proto Theme: `$T ⇔ Proto$T` conversions and `*.proto` files label Dec 1, 2023
@moulimukherjee moulimukherjee force-pushed the aws-connections branch 3 times, most recently from 4631ad9 to c609e09 Compare December 2, 2023 00:25
@moulimukherjee moulimukherjee self-assigned this Dec 2, 2023
@jkosh44 jkosh44 requested review from jkosh44 and removed request for maddyblue December 4, 2023 16:13

@jkosh44 jkosh44 left a comment

Adapter/SQL changes LGTM.

src/adapter/src/catalog/config.rs (outdated, resolved)
src/sql/src/plan/statement/ddl/connection.rs (outdated, resolved)

benesch commented Dec 4, 2023

Gonna take a look at this by EOD!

src/adapter/src/catalog/builtin_table_updates.rs (outdated, resolved)
src/adapter/src/catalog/builtin_table_updates.rs (outdated, resolved)
src/adapter/src/catalog/builtin_table_updates.rs (outdated, resolved)
src/adapter/src/catalog/builtin_table_updates.rs (outdated, resolved)
src/adapter/src/catalog/builtin_table_updates.rs (outdated, resolved)
src/sql/src/catalog.rs (outdated, resolved)
src/sql/src/catalog.rs (outdated, resolved)
src/sql/src/plan/statement/ddl/connection.rs (outdated, resolved)
src/adapter/src/catalog/config.rs (outdated, resolved)
test/testdrive/mzcompose.py (outdated, resolved)
@benesch benesch requested a review from sploiselle December 5, 2023 07:27
benesch added a commit to benesch/materialize that referenced this pull request Dec 5, 2023
I'd like to standardize on calling these "AWS connections" everywhere
possible, without bringing "external" into the picture. "External" came
from "external ID", but that's a standalone term of art, and I think
it's clearer to call the role here just the "AWS connection role".  (If
anything, the "external connection role" seems like the customer
supplied role, not our internal intermediary role.)

Also unplumb the argument from the coordinator, since as mentioned in
the review for MaterializeInc#23282 I think we'll want to instead plumb this argument
through the ConnectionContext.

@benesch benesch left a comment

I pushed up a big batch of changes. This looks nearly mergeable. Outstanding blockers:

  1. Fixing the SQL option parsing enum
  2. Manual testing on staging

Other nice to haves:

  • Localstack tests for IAM (mis)configurations
  • Reference documentation for mz_aws_connections

src/storage-types/src/connections/aws.proto (outdated, resolved)
Datum::from(assume_role_session_name),
Datum::from(principal),
Datum::from(external_id.as_deref()),
Datum::from(example_trust_policy.as_ref().map(|p| p.into_element())),

@guswynn I golfed this a little more to get rid of the holder entirely.

@@ -231,6 +232,23 @@ pub enum PlanError {
// TODO(benesch): eventually all errors should be structured.
Unstructured(String),
}
#[derive(Clone, Debug)]
pub enum ConnectionParsingError {

Why is this a separate type? We don't have any other example of splitting out PlanError within this crate.

Comment on lines 247 to 248
Self::ConflictingOptions => f.write_str("invalid CONNECTION: ASSUME ROLE ARN cannot be provided simultaneously with ACCESS KEY ID and SECRET ACCESS KEY"),
Self::MissingRequiredOptions => f.write_str("invalid CONNECTION: must specify either ASSUME ROLE ARN or both ACCESS KEY ID and SECRET ACCESS KEY"),

These two options have AWS specific error messages but the variants don't indicate that they are AWS specific.
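
One way to address this is to name the variants after the AWS options they describe. The sketch below is a minimal illustration under stated assumptions, not the code that was merged: `AwsConnectionOptionError`, `AwsAuth`, and `plan_aws_auth` are hypothetical names, secrets are simplified to plain strings, and treating a role ARN combined with only one of the key options as a conflict is an assumption. The message strings are copied from the diff above.

```rust
use std::fmt;

/// Hypothetical error type whose variant names make the AWS scope explicit.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum AwsConnectionOptionError {
    /// ASSUME ROLE ARN was combined with access key options.
    ConflictingAuthOptions,
    /// Neither a role ARN nor a complete key pair was supplied.
    MissingAuthOptions,
}

impl fmt::Display for AwsConnectionOptionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::ConflictingAuthOptions => f.write_str("invalid CONNECTION: ASSUME ROLE ARN cannot be provided simultaneously with ACCESS KEY ID and SECRET ACCESS KEY"),
            Self::MissingAuthOptions => f.write_str("invalid CONNECTION: must specify either ASSUME ROLE ARN or both ACCESS KEY ID and SECRET ACCESS KEY"),
        }
    }
}

impl std::error::Error for AwsConnectionOptionError {}

/// Hypothetical result of planning: exactly one authentication mode.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum AwsAuth {
    Credentials { access_key_id: String, secret_access_key: String },
    AssumeRole { arn: String },
}

/// Accept either a role ARN alone or both key options, and nothing else.
pub fn plan_aws_auth(
    access_key_id: Option<String>,
    secret_access_key: Option<String>,
    assume_role_arn: Option<String>,
) -> Result<AwsAuth, AwsConnectionOptionError> {
    match (access_key_id, secret_access_key, assume_role_arn) {
        (Some(access_key_id), Some(secret_access_key), None) => {
            Ok(AwsAuth::Credentials { access_key_id, secret_access_key })
        }
        (None, None, Some(arn)) => Ok(AwsAuth::AssumeRole { arn }),
        // Nothing, or only half of the key pair, was supplied.
        (None, None, None) | (Some(_), None, None) | (None, Some(_), None) => {
            Err(AwsConnectionOptionError::MissingAuthOptions)
        }
        // Assumption: a role ARN plus any key option counts as a conflict.
        (_, _, Some(_)) => Err(AwsConnectionOptionError::ConflictingAuthOptions),
    }
}
```

Matching on the full tuple keeps every combination of options visible in one place, so the compiler checks that no case is left unhandled.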

@@ -28,9 +28,11 @@
NAMESPACE = ENVIRONMENT_NAME

I renamed this composition to "aws" to reflect its new purpose as a general test of AWS-related functionality.

}
}

Ok(())
}

pub(crate) fn validate_by_default(&self) -> bool {

I substantially refactored the code in this file to have more comments, produce better error messages (e.g., there were several instances of "error!" with no additional context in user-facing error messages), and to reduce duplication.

@philip-stoev philip-stoev requested review from philip-stoev and removed request for a team December 12, 2023 08:34

@philip-stoev philip-stoev left a comment

Given that this introduces a new DDL syntax, it needs a misc/python/materialize/checks/all_checks/* test to guard against Mz restarts and upgrades. Consider taking an existing test on secrets from that directory and adapting it.

guswynn commented Dec 12, 2023

pushing a not-quite-finished commit to test something

@benesch benesch left a comment

@guswynn I can't figure out why the AWS SDK isn't automatically figuring out the region. Do you have a reference for that?

It looks like maybe something got fixed in the latest version of the AWS SDK around regions and AssumeRole? Eyeballing smithy-lang/smithy-rs#3014 and it seems like it could definitely help us.

@@ -231,6 +241,9 @@ impl AwsAssumeRole {
if let Some(external_id) = external_id {
credentials = credentials.external_id(external_id);
}
if let Some(region) = region {
credentials = credentials.region(region);
}

And especially spooked that we need to override the region in the AssumeRole builder. This feels like something is wrong!

I believe the first one on line 226 is required, but I agree that it's cursed (perhaps fixed by the PR you linked above?)

I followed https://github.com/MaterializeInc/materialize/blob/main/src/persist/src/s3.rs#L134-L137, and based my understanding on seeing this: https://docs.rs/aws-config/0.55.1/src/aws_config/lib.rs.html#563-567

It's possible the second AssumeRoleProvider doesn't need it. Do you want me to try that?

Pushed a commit removing this second one to test this. After the build finishes I'll test in staging.

rip, it's required

You didn't bump the AWS SDK version did you?

@benesch I didn't, because the next bump is from 0.x to 1.x and I don't want to block this PR on that change.

@@ -341,14 +358,18 @@ impl AwsConnection {
// case of failure.
let _ = sts_client.get_caller_identity().send().await?;

let region = match &self.region {
Some(region_name) => Some(Region::new(region_name.clone())),
None => region::default_provider().region().await,

I'm a bit spooked that this was necessary
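
The fallback under discussion can be sketched without the AWS SDK. This is a minimal illustration under stated assumptions: `Region` here is a local stand-in for the SDK's region type, `resolve_region` is a hypothetical name, and the default provider is modelled as a synchronous closure rather than the async `region::default_provider()` chain shown in the diff.

```rust
/// Local stand-in for the SDK's region type (assumption: the real type is
/// also constructed from a region name).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Region(String);

impl Region {
    pub fn new(name: impl Into<String>) -> Self {
        Region(name.into())
    }
}

/// Hypothetical helper: use the region configured on the connection when
/// there is one, otherwise ask the ambient default provider. The provider
/// may itself find nothing, in which case the result is `None`.
pub fn resolve_region(
    configured: Option<&str>,
    default_provider: impl FnOnce() -> Option<Region>,
) -> Option<Region> {
    match configured {
        Some(name) => Some(Region::new(name)),
        None => default_provider(),
    }
}
```

In the diff, the resolved region is then passed explicitly to the assume-role credentials builder, which is the step the reviewers found surprising.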

Mouli Mukherjee and others added 7 commits December 13, 2023 11:18
Changes to accept either credentials or assume role

proto

implement validate

fix tests

role chaining

Fix validate connection

Added table aws_connections

cloud test

Adding more columns to mz_aws_connections from updated design

temp refactor

Fix the access_key_id_secret_id

Fixing the assume role validation

Adding testdrive tests

Added TODO

Getting materialize external connection role from environmentd

Added tests for failure scenarios

Formatting docs

python lint

update design doc

Fixing proto lint

Adding #[serde(skip)] to aws_principal_context in catalog state

Adding Deserialize trait

update comment

Fix tests
Co-authored-by: Nikhil Benesch

Removing mz_principal
revert

Refactor aws connection

Moving tests to secret-aws-secret-manager

specifying column names

Refactor parsing errors

clippy fix

Added TODOs to restructure errors

revert not required changes

fix test

guswynn commented Dec 13, 2023

@philip-stoev pushed a commit with a basic check for these connections and their syntax

benesch commented Dec 14, 2023

I didn't get a chance to do the AWS SDK region upgrade as I hoped, so I'm good with proceeding with this as is for now.

@guswynn guswynn added the release-blocker Critical issue that should block *any* release if not fixed label Dec 14, 2023
@guswynn guswynn enabled auto-merge (rebase) December 14, 2023 17:17
@guswynn guswynn merged commit 1f8dbd7 into MaterializeInc:main Dec 14, 2023
@moulimukherjee moulimukherjee deleted the aws-connections branch December 27, 2023 16:44
@moulimukherjee

Thanks for all the effort in getting this merged 🙏, this would have been deep in merge conflict land otherwise.

Labels
release-blocker Critical issue that should block *any* release if not fixed T-proto Theme: `$T ⇔ Proto$T` conversions and `*.proto` files
6 participants