Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Configuration incorrectly infers hosts to be serverless or not #729

Closed
2 tasks done
huttotw opened this issue Mar 14, 2024 · 4 comments
Closed
2 tasks done
Labels
feature:ra3-node issues relating to ra3 node support triage:product type:bug Something isn't working

Comments

@huttotw
Copy link

huttotw commented Mar 14, 2024

Is this a new bug in dbt-redshift?

  • I believe this is a new bug in dbt-redshift
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Today, when method: iam is used, redshift_connector get's confused if there is a non-standard hostname (SSH tunnel, CNAME, etc) and always tries to authenticate to a provisioned cluster by calling redshift:GetClusterCredentials even if the host points to a serverless cluster.

Expected Behavior

dbt-redshift should pass the explicit option is_serverless through to the redshift_connector package so that the correct API is used to authorized the user. Ideally, the format of the host should be irrelevant since it can take many forms in a lot of different set ups.

Steps To Reproduce

  1. Create a redshift serverless cluster
  2. Create a bastion host (EC2 instance) that will be used as an SSH tunnel to your redshift cluster
  3. Start an SSH tunnel that port-forwards :5439 on your local machine to :5439
  4. Set host to localhost in your profiles.yml
  5. Run dbt debug
  6. Observe that it fails because it is trying to authenticate to a redshift provisioned cluster.

Relevant log output

% dbt debug --target redshift
15:05:16  Running with dbt=1.7.9
15:05:16  dbt version: 1.7.9
15:05:16  python version: 3.11.8
15:05:16  python path: /Users/huttotw/src/dbt/.dbtenv/bin/python3.11
15:05:16  os info: macOS-14.3.1-arm64-arm-64bit
15:05:16  Using profiles dir at /Users/huttotw/.dbt
15:05:16  Using profiles.yml file at /Users/huttotw/.dbt/profiles.yml
15:05:16  Using dbt_project.yml file at /Users/huttotw/src/dbt/integration_tests/dbt_project.yml
15:05:16  adapter type: redshift
15:05:16  adapter version: 1.7.4
15:05:16  Configuration:
15:05:16    profiles.yml file [OK found and valid]
15:05:16    dbt_project.yml file [OK found and valid]
15:05:16  Required dependencies:
15:05:16   - git [OK found]

15:05:16  Connection:
15:05:16    host: localhost
15:05:16    user: serverlessuser
15:05:16    port: 5439
15:05:16    database: dev
15:05:16    method: iam
15:05:16    cluster_id: test
15:05:16    iam_profile: test
15:05:16    schema: test
15:05:16    sslmode: prefer
15:05:16    region: us-east-1
15:05:16    sslmode: prefer
15:05:16    region: us-east-1
15:05:16    iam_profile: test
15:05:16    autocreate: False
15:05:16    db_groups: []
15:05:16    ra3_node: False
15:05:16    connect_timeout: 30
15:05:16    role: None
15:05:16    retries: 1
15:05:16    autocommit: True
15:05:16  Registered adapter: redshift=1.7.4
15:05:17    Connection test: [ERROR]

15:05:17  1 check failed:
15:05:17  dbt was unable to connect to the specified database.
The database returned the following error:

  >Database Error
  An error occurred (ClusterNotFound) when calling the GetClusterCredentials operation: Cluster test not found.

Check your database credentials and try again. For more information, visit:
https://docs.getdbt.com/docs/configure-your-profile

Environment

- OS: macOS-14.3.1-arm64-arm-64bit
- Python: 3.11.8
- dbt-core: 1.7.9
- dbt-redshift: 1.7.4

Additional Context

The Redshift serverless cluster I'm trying to connect to is private and only accessible via an SSH tunnel.

@dataders
Copy link
Contributor

hey @huttotw, sorry you're having trouble connecting.

dbt-redshift should pass the explicit option is_serverless

Am I correct in my understanding that is_serverless would need to be user-provided? We can totally do this, but I'm not sure if your meaning was that dbt-redshift might somehow be able to infer that the cluster is serverless based on the CNAME

@dataders dataders added triage:awaiting-response feature:ra3-node issues relating to ra3 node support labels Mar 25, 2024
@jiezhen-chen
Copy link
Contributor

jiezhen-chen commented Apr 3, 2024

Looks like dbt-redshift is currently inferring the is_serverless param in redshift_connector by checking if serverless is present in the host, see code here. @huttotw You're right to point out that the current connection mechanism won't be able to detect that this is a serverless endpoint. Have you tried adding is_serverless = True in the connection call here?

Also, (you've probably already done this but just wanted to check) are you able to psql into Redshift Serverless from your EC2 instance?

@Fleid
Copy link
Contributor

Fleid commented Jun 13, 2024

This should have been solved via #835
To be tested via an install from source, should land in 1.9

@CamposContentful
Copy link

Hi, sorry for digging in this closed issue, but I'm facing the same problem when I point DBT to my serverless Redshift instance through a ssh tunnel. Although, in our case our host value ("localhost") is ignored an the error 'host' must be provided for serverless endpoint is raised.

The fix in #835 doesn't seem to solve this problem, because we still need to have the substring "serverless" in the value of our host.

Are there any possible solutions in this use case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature:ra3-node issues relating to ra3 node support triage:product type:bug Something isn't working
Projects
None yet
Development

No branches or pull requests

7 participants